
Name: cs598dhp Parallel Processing Midterm Exam Due in ... - Polaris



cs598dhp Parallel Processing Midterm Exam
Due in class Tuesday, May 4, 2008 (11 pages / 8 questions total)

Name: ____________________________

1. Short questions [5 pts.]

(a) A way to characterize locality of a program is the space-time product. Find the definition of space-time product and state it. How is it related to the notion of efficiency for parallel programs? In other words, what figure of merit tends to improve when space-time product is minimized or efficiency is maximized?


(b) List the essential features of the Von Neumann computational model.


2. Paging [5 pts.] Consider the loop below, where array A is assumed stored by columns. Assume a computer with n page frames available in main memory. Each page is 512 bytes long. Assume also that the replacement strategy is Least Recently Used (LRU). Finally, assume that no page from A or B is in main memory when the loop starts executing. For n = 1, 2, and 3, state how many page faults will take place. For what value of n will the number of page faults be lower than the number of page faults when n=3? For the purposes of this problem, you can ignore all references other than those to A and B.

    real A(128,128), B(128,128)
    do i=1,128
      do j=1,128
        A(i,j)=B(i,j) + A(i,j)
        B(i,j)=A(i,j)+1
      end do
    end do
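One way to check a hand count for this problem is to replay the loop's page references through a small LRU model. The sketch below is in Python rather than the exam's Fortran, and it rests on assumptions the problem does not state: a `real` occupies 4 bytes (so each 128-element column fills exactly one 512-byte page), both arrays are page-aligned, and within each statement the right-hand side is read left to right before the store. The names `page_of`, `reference_string`, and `count_faults` are hypothetical.

```python
from collections import OrderedDict

N = 128
ELEMS_PER_PAGE = 512 // 4  # assumption: 4-byte reals, 128 elements per page


def page_of(array, i, j):
    """Page holding element (i, j) of a column-major N x N array."""
    offset = (j - 1) * N + (i - 1)  # column-major linear offset
    return (array, offset // ELEMS_PER_PAGE)


def reference_string():
    """Yield the pages touched by the loop, in the assumed access order."""
    for i in range(1, N + 1):
        for j in range(1, N + 1):
            # A(i,j) = B(i,j) + A(i,j): read B, read A, write A
            yield page_of("B", i, j)
            yield page_of("A", i, j)
            yield page_of("A", i, j)
            # B(i,j) = A(i,j) + 1: read A, write B
            yield page_of("A", i, j)
            yield page_of("B", i, j)


def count_faults(frames):
    """Count page faults under LRU replacement with the given frame count."""
    resident = OrderedDict()  # least recently used page comes first
    faults = 0
    for page in reference_string():
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
        else:
            faults += 1
            if len(resident) >= frames:
                resident.popitem(last=False)  # evict the LRU page
            resident[page] = True
    return faults
```

Calling `count_faults(n)` for n = 1, 2, 3 and for larger frame counts gives values to compare against a hand derivation. A different read order within a statement can change the n = 1 count, so that case depends on the ordering assumption above.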


3. Locality enhancement [5 pts.] Consider the loop below.
(a) How would you transform it to enhance locality?
(b) For large values of n, what fraction of cache misses does your improved version save relative to the original version?

    do i=1,n
      do j=1,n
        B(i,j)=A(i,j+1)+A(i-1,j)+A(i,j-1)+A(i+1,j)
      end do
    end do


4. SIMD programs [5 pts.] Consider the program

    do i=1,n
      a(i) = b(i) + c(i)
      s = s + a(i)
    end do

(a) Translate this program into a vector version. Use only triplet notation. Do not use intrinsic functions to represent operations.
(b) Compute the speedup, efficiency, and redundancy on an array machine assuming n/4 PEs. Assume all floating point operations take one unit of time and ignore all other operations, including those to control the loop and those needed for communication.


5. SIMD programming [10 pts.] Transform the following program into vector form. Vectorize as much as possible.

    k=i
    p=d(i)
    do j=i+1,n
      if (d(j)>=p) then
        k=j
        p=d(j)
      end if
    end do
    if (k /= i) then
      d(k)=d(i)
      d(i)=p
      do j=1,n
        p=v(j,i)
        v(j,i)=v(j,k)
        v(j,k)=p
      end do
    end if


6. OpenMP [5 pts.] Parallelize (as much as possible) the following loop and present the solution in OpenMP.

    do i=1,n
      do j=1,n
        k=k+p
        x = a(k,j) + b(j,i)
        c(i,j)= x + x ** 2 + a(k,j) + 1
      end do
    end do
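The running update k=k+p ties each iteration to the one before it. A common preparatory step, shown here only as a sketch and not necessarily the intended solution, is to replace the update with a closed-form expression in i and j so that no iteration needs the previous value of k. The Python check below compares the two; `k0` (the value of k before the loop), `sequential_k`, and `closed_form_k` are hypothetical names, and k and p are assumed to be integers.

```python
def sequential_k(k0, p, n):
    """Run the loop nest as written and record k at every (i, j)."""
    k = k0
    seen = {}
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            k = k + p
            seen[(i, j)] = k
    return seen


def closed_form_k(k0, p, n, i, j):
    """Value of k at iteration (i, j), computed without the running update."""
    return k0 + ((i - 1) * n + j) * p


def forms_agree(k0, p, n):
    """True if the closed form matches the sequential value at every iteration."""
    seen = sequential_k(k0, p, n)
    return all(closed_form_k(k0, p, n, i, j) == value
               for (i, j), value in seen.items())
```

With k computed this way, each iteration depends only on its own i and j. The scalar x is assigned before it is used in every iteration, so it would still need to be private to each thread.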


7. OpenMP [5 pts.] For each of the following three loop nests, specify whether or not the outermost loop can be transformed into parallel form without changing the loop body. If the answer is no, state the reason.

(a)
    do i=1,n
      do j=1,n
        a(i,j) = a(i,j-1) + 1
        b(i,j) = a(i,j) + a(i,j)
      end do
    end do

(b)
    do i=1,n
      do j=1,n
        a(i,j) = a(i-1,j-1) + 1
      end do
    end do

(c)
    do i=1,n
      do j=1,n
        a(i,j) = a(i-1,n+1) + 1
        b(i,j) = a(i-1,j) + a(i,j)
      end do
    end do
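A hand dependence analysis of loop nests like these can be cross-checked by brute force for a small n: record which outer iterations touch each array element and flag any element that is written in one outer iteration and read or written in another. The Python sketch below does this. It is a check for one chosen n, not a proof for all n, and the names `outer_loop_is_parallel`, `loop_a`, `loop_b`, and `loop_c` are hypothetical.

```python
from collections import defaultdict


def outer_loop_is_parallel(body, n):
    """True if no element is written in one i-iteration and touched in another."""
    writers = defaultdict(set)   # element -> outer indices that write it
    touchers = defaultdict(set)  # element -> outer indices that read or write it
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            reads, writes = body(i, j, n)
            for loc in reads:
                touchers[loc].add(i)
            for loc in writes:
                writers[loc].add(i)
                touchers[loc].add(i)
    return all(len(touchers[loc]) == 1 for loc in writers)


def loop_a(i, j, n):
    reads = [("a", i, j - 1), ("a", i, j)]
    writes = [("a", i, j), ("b", i, j)]
    return reads, writes


def loop_b(i, j, n):
    return [("a", i - 1, j - 1)], [("a", i, j)]


def loop_c(i, j, n):
    reads = [("a", i - 1, n + 1), ("a", i - 1, j), ("a", i, j)]
    writes = [("a", i, j), ("b", i, j)]
    return reads, writes
```

Running `outer_loop_is_parallel` with a small n on each of the three bodies gives a quick check on a hand analysis. Elements that are only read, such as a(i-1,n+1) in (c), never count as a conflict because no iteration writes them.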


8. Race conditions [5 pts.] In the following four code segments, identify whether or not there are race conditions. In each case, explain why or why not and list all race conditions you can detect. Also, indicate what directives need to be inserted in each case to get rid of the race conditions.

(a)
    c$omp parallel private(i)
    c$omp pdo
          do i=1,n
            x(i)=x(i)+1
          end do
    c$omp end pdo nowait
    c$omp pdo
          do i=1,n
            y(i)=x(i) + 1
          end do
    c$omp end pdo nowait
    c$omp end parallel


(b)
    c$omp parallel private(i)
    c$omp pdo
          do i=1,n
            x(i)=x(i)+1
          end do
    c$omp end pdo wait
          x(5) = x(5) + 1
    c$omp end parallel


(c)
    c$omp parallel do
          do i=1,n
            a(i)=b(i+1)+b(i-1)+a(i)
          end do

(d)
    c$omp parallel private(i,j,k,t)
          k=some_function()
          t=y+x(k)
    c$omp barrier
          j=omp_get_thread_num()
          x(j)=t
    c$omp end parallel
