13.07.2015 Views

More Iteration Space Tiling


Program 10:

    do I = 1, 5
      doall J = 1, 10
        A(I,J) = B(I,J) + C(I)*D(J)
      enddo
    enddo

Figure 5. (Axis label: J = 1, 2, ..., 10.)

It is also obvious that a sequential loop may be converted into a doall when it carries no dependence relations. For example, in Program 11 there is a flow dependence relation $S_1\,\delta_{(1,2)}\,S_1$ due to the assignment and subsequent use of A. Even though the distance in the J loop dimension is non-zero, the J loop may be executed in parallel, since the only dependence relation is carried by the outer I loop. The outer I loop can be executed in parallel only by the insertion of synchronization primitives.

Program 11:

    do I = 2, N
      do J = 3, M
        S1: A(I,J) = A(I-1,J-2) + C(I)*D(J)
      enddo
    enddo

3. Restructuring Transformations

The most powerful compilers and translators are capable of advanced program restructuring transformations to optimize performance on high speed parallel computers. Automatic conversion of sequential code to parallel code is one example of program restructuring. Associated with each restructuring transformation is a data dependence test which must be satisfied by each dependence relation in order to apply that transformation. As we have already seen, converting a sequential loop to a parallel doall requires that the loop carries no dependence. This parallelization has no effect on the data dependence graph, though we will see that other transformations do change data dependence relations somewhat.

Loop Interchanging: One of the most important restructuring transformations is loop interchanging. Interchanging two loops can be used with several different goals in mind. As shown above, the outer loop of Program 11 cannot be converted to a parallel doall without additional synchronization. However, the two loops can be interchanged, producing Program 12.
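The claim that Program 11's inner J loop carries no dependence, while the outer I loop does, can be checked empirically. The following is a minimal Python sketch, not the paper's method: the function name `run_program_11`, the initial array contents, and the sizes N = 6, M = 8 are all hypothetical choices made for illustration. Reversal is only one possible reordering, so matching results are evidence of independence rather than a proof.

```python
def run_program_11(n, m, j_order, i_order=None):
    """Run Program 11 (1-based indices), visiting iterations in the given orders."""
    # Hypothetical initial data; only the dependence pattern matters here.
    a = [[(i * 31 + j * 17) % 7 for j in range(m + 1)] for i in range(n + 1)]
    c = [i + 1 for i in range(n + 1)]
    d = [2 * j + 1 for j in range(m + 1)]
    i_values = list(range(2, n + 1)) if i_order is None else i_order
    for i in i_values:
        for j in j_order:
            # S1: A(I,J) = A(I-1,J-2) + C(I)*D(J)
            a[i][j] = a[i - 1][j - 2] + c[i] * d[j]
    return a


N, M = 6, 8  # hypothetical problem size
forward_j = list(range(3, M + 1))
reference = run_program_11(N, M, forward_j)

# Reordering the inner J loop leaves the result unchanged:
# row I only reads row I-1, so no dependence is carried by J.
same_when_j_reversed = run_program_11(N, M, forward_j[::-1]) == reference

# Reordering the outer I loop changes the result:
# the dependence with distance (1,2) is carried by I.
reversed_i = list(range(N, 1, -1))
same_when_i_reversed = run_program_11(N, M, forward_j, reversed_i) == reference

print(same_when_j_reversed, same_when_i_reversed)
```

With these particular inputs, the J-reversed run matches the sequential reference and the I-reversed run does not, which is consistent with the dependence being carried by the I loop alone.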
Loop interchanging is legal if there are no dependence relations that are carried by the outer loop and have a negative distance in the inner loop (i.e., no (<,>) direction vectors [AlKe84, WoBa87]). The distance or direction vector for the data dependence relation in the interchanged loop has the corresponding elements interchanged, giving the dependence relation $S_1\,\delta_{(2,1)}\,S_1$. Since the outermost loop with a positive distance is the outer J loop, the J loop carries this dependence; now the I loop carries no dependences and can be executed in parallel. Loop interchanging thus enables parallel execution of other loops; this may be desirable if, for instance, it is known that M is very small (so parallel execution of the J loop would give little speedup) or if parallel access to the second dimension of A would produce memory conflicts.

Program 12:

    do J = 3, M
      do I = 2, N
        S1: A(I,J) = A(I-1,J-2) + C(I)*D(J)
      enddo
    enddo

Loop Skewing: Some nested loops have dependence relations carried by each loop, preventing parallel execution of any of the loops. An example of this is the relaxation algorithm shown in Program 13a. The data dependence relations in the iteration space of this loop are shown in Figure 6; the four dependence relations have distance vectors:

    $S_1\,\delta_{(0,1)}\,S_1$    $S_1\,\delta_{(1,0)}\,S_1$    $S_1\,\bar{\delta}_{(0,1)}\,S_1$    $S_1\,\bar{\delta}_{(1,0)}\,S_1$

One way to extract parallelism from this loop is via the wavefront (or hyperplane) method [Mura71, Lamp74]. We show how to implement the wavefront method via loop skewing and loop interchanging [Wolf86].

Program 13a:

    do I = 2, N-1
      do J = 2, M-1
        S1: A(I,J) = 0.2*(A(I-1,J)+A(I,J-1)+A(I,J)+A(I+1,J)+A(I,J+1))
      enddo
    enddo

Figure 6.
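The interchange legality test and the notion of a loop "carrying" a dependence can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the helper names are hypothetical, the distance vectors are read off the array subscripts of Programs 11 and 13a ((1,2) for Program 11; (1,0) and (0,1), each occurring once as a flow and once as an anti dependence, for Program 13a), and the vector (1,-1) is an invented counterexample that appears in none of the programs.

```python
def direction(distance):
    """Map a distance vector to a direction vector of '<', '=', '>' symbols."""
    return tuple("<" if d > 0 else ">" if d < 0 else "=" for d in distance)


def carrying_loop(distance):
    """Index of the outermost loop with nonzero distance (0 = outer), or None."""
    for level, d in enumerate(distance):
        if d != 0:
            return level
    return None


def interchange_is_legal(distances):
    """Two-loop interchange is legal unless some dependence has direction (<, >)."""
    return all(direction(d) != ("<", ">") for d in distances)


def interchange(distances):
    """Swap the two elements of every distance vector."""
    return [(inner, outer) for (outer, inner) in distances]


program_11 = [(1, 2)]                            # A(I,J) = A(I-1,J-2) + ...
program_13a = [(1, 0), (0, 1), (1, 0), (0, 1)]   # two flow, two anti dependences
hypothetical_bad = [(1, -1)]                     # invented (<, >) example

program_12 = interchange(program_11)
loops_carrying_11 = {carrying_loop(d) for d in program_11}
loops_carrying_12 = {carrying_loop(d) for d in program_12}
loops_carrying_13a = {carrying_loop(d) for d in program_13a}

print(interchange_is_legal(program_11), program_12)
print(loops_carrying_11, loops_carrying_12, loops_carrying_13a)
print(interchange_is_legal(hypothetical_bad))
```

In Program 12 the single dependence is still carried by loop 0, but loop 0 is now the J loop, so the inner I loop is free of carried dependences. For Program 13a both loop levels appear in the carrying set, which is why neither loop can be run as a doall without further restructuring.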
