
COPYRIGHT 2008, PRINCETON UNIVERSITY PRESS



[Figure 14.14: The cache manager's view of RAM. Each 128-B cache line is read into one of four lines in cache. Diagram labels: Cache, 256 lines of 128 B (32 KB); Virtual Memory.]

By actually running on machines available to you, check that the two simple codes in Listing 14.7, which have the same number of arithmetic operations, take significantly different times to run. One of them must make large jumps through memory, and the memory locations it addresses have not yet been read into the cache:

    for j = 1, 9999;
        x(j) = m(1,j)    // Sequential column reference

    for j = 1, 9999;
        x(j) = m(j,1)    // Sequential row reference

Listing 14.7 Sequential column and row references.

14.15.2 Exercise 2: Cache Flow

Test the importance of cache flow on your machine by comparing the time it takes to run the two simple programs in Listings 14.8 and 14.9. Run for increasing column size idim and compare the times for loop A versus those for loop B. A computer with very small caches may be most sensitive to stride.

    Dimension Vec(idim, jdim)    // Stride 1 fetch (f90)
    for j = 1, jdim; { for i = 1, idim; Ans = Ans + Vec(i,j)*Vec(i,j) }

Listing 14.8 GOOD f90, BAD Java/C Program; minimum, maximum stride.
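The timing comparison that Exercise 2 asks for can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the book's code: the function names, the test sizes, and the fill values are hypothetical, and the matrix is stored as a flat row-major array (the Java/C convention), so the loop order of Listing 14.8 is the large-stride case here. The swapped loop order is an assumed companion loop, since Listing 14.9 is not reproduced on this page. In pure Python the interpreter overhead can hide much of the cache effect; the same loop structure in a compiled language would be expected to show it more clearly.

```python
import time
from array import array


def make_matrix(idim, jdim):
    """Return a flat, contiguous array of idim*jdim doubles (row-major layout).

    Fill values are small integers so both loop orders give the same exact sum.
    """
    return array("d", (float(k % 7) for k in range(idim * jdim)))


def sum_squares_j_outer(vec, idim, jdim):
    """Loop order of Listing 14.8: j outer, i inner.

    With row-major storage, consecutive inner iterations are jdim elements apart.
    """
    ans = 0.0
    for j in range(jdim):
        for i in range(idim):
            v = vec[i * jdim + j]
            ans += v * v
    return ans


def sum_squares_i_outer(vec, idim, jdim):
    """Swapped loop order (assumed companion loop): i outer, j inner.

    With row-major storage, consecutive inner iterations touch adjacent elements.
    """
    ans = 0.0
    for i in range(idim):
        for j in range(jdim):
            v = vec[i * jdim + j]
            ans += v * v
    return ans


def time_call(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    jdim = 256  # hypothetical fixed second dimension
    for idim in (256, 512, 1024):  # increasing idim, as the exercise suggests
        vec = make_matrix(idim, jdim)
        _, t_a = time_call(sum_squares_j_outer, vec, idim, jdim)
        _, t_b = time_call(sum_squares_i_outer, vec, idim, jdim)
        print(f"idim={idim:5d}  j-outer: {t_a:.3f} s  i-outer: {t_b:.3f} s")
```

Both functions perform the same arithmetic and compute the index the same way, so any timing difference comes from the order in which memory is visited. For a meaningful measurement, repeat each timing several times and use arrays much larger than the cache.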
