01.09.2013 Views

Appendix G - Clemson University


G-52 ■ Appendix G Vector Processors

tional units and the increased complexity of assigning operations to units, all the
overheads (T_loop and T_start) are doubled.

a. [15] Find the number of clock cycles for code sequence 1 on VMIPS.

b. [20] Find the number of clock cycles for code sequence 1 on VMIPS-II. How does this compare to VMIPS?

c. [15] Find the number of clock cycles for code sequence 2 on VMIPS.

d. [15] Find the number of clock cycles for code sequence 2 on VMIPS-II. How does this compare to VMIPS?

G.9 [20] Here is a tricky piece of code with two-dimensional arrays. Does this
loop have dependences? Can these loops be written so they are parallel? If so,
how? Rewrite the source code so that it is clear that the loop can be vectorized, if
possible.

      do 290 j = 2,n
        do 290 i = 2,j
          aa(i,j) = aa(i-1,j)*aa(i-1,j)+bb(i,j)
  290 continue

G.10 [12/15] Consider the following loop:

      do 10 i = 2,n
        A(i) = B
   10   C(i) = A(i-1)

a. [12] Show there is a loop-carried dependence in this code fragment.

b. [15] Rewrite the code in FORTRAN so that it can be vectorized as two separate vector sequences.

G.11 [15/25/25] As we saw in Section G.5, some loop structures are not easily
vectorized. One common structure is a reduction, a loop that reduces an array to
a single value by repeated application of an operation. This is a special case of a
recurrence. A common example occurs in dot product:

      dot = 0.0
      do 10 i=1,64
   10 dot = dot + A(i) * B(i)

This loop has an obvious loop-carried dependence (on dot) and cannot be vectorized
in a straightforward fashion. The first thing a good vectorizing compiler
would do is split the loop to separate out the vectorizable portion and the recurrence
and perhaps rewrite the loop as

      do 10 i=1,64
   10 dot(i) = A(i) * B(i)
      do 20 i=2,64
   20 dot(1) = dot(1) + dot(i)
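
The split described above can be checked with a small self-contained program. The following is only a sketch under stated assumptions: it uses free-form Fortran 90 syntax rather than the labeled loops of the text, the array name prod is a hypothetical stand-in for the expanded dot(i), and the input data are made up for illustration. It runs the original reduction and the split version side by side so the two results can be compared.

```fortran
program dot_split
  implicit none
  integer, parameter :: n = 64
  real :: a(n), b(n)
  real :: prod(n)      ! hypothetical name: plays the role of dot(i) in the text
  real :: dot_seq      ! scalar accumulator of the original loop
  integer :: i

  ! Illustrative data (an assumption, not from the text): small whole
  ! numbers, so every partial sum is exact in single precision.
  do i = 1, n
     a(i) = real(i)
     b(i) = 2.0
  end do

  ! Original loop: each iteration reads and writes dot_seq,
  ! which is the loop-carried dependence.
  dot_seq = 0.0
  do i = 1, n
     dot_seq = dot_seq + a(i) * b(i)
  end do

  ! Split form, part 1: the elementwise products are independent
  ! of one another, so this loop is the vectorizable portion.
  do i = 1, n
     prod(i) = a(i) * b(i)
  end do

  ! Split form, part 2: the recurrence that remains, accumulating
  ! the partial products into prod(1).
  do i = 2, n
     prod(1) = prod(1) + prod(i)
  end do

  print *, 'sequential reduction:', dot_seq
  print *, 'split reduction:     ', prod(1)
end program dot_split
```

With this data both lines report the same value, 4160, because the split version adds the products in the same order as the original loop. Only the multiplications have been moved into a loop with no dependence between iterations; the second loop is still a serial recurrence.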
