12.07.2015 Views

COPYRIGHT 2008, PRINCETON UNIVERSITY PRESS


as high-performance computers have become more prevalent. For systems that use a data cache, this may well be the single most important programming consideration; continually referencing data that are not in the cache (cache misses) may lead to an order-of-magnitude increase in CPU time.

As indicated in Figures 14.2 and 14.14, the data cache holds a copy of some of the data in memory. The basics are the same for all caches, but the sizes are manufacturer-dependent. When the CPU tries to address a memory location, the cache manager checks to see if the data are in the cache. If they are not, the manager reads the data from memory into the cache, and then the CPU deals with the data directly in the cache. The cache manager's view of RAM is shown in Figure 14.14.

When considering how a matrix operation uses memory, it is important to consider the stride of that operation, that is, the number of array elements that are stepped through as the operation repeats. For instance, summing the diagonal elements of a matrix to form the trace

$$\mathrm{Tr}\,A = \sum_{i=1}^{N} a(i,i) \tag{14.15}$$

involves a large stride because the diagonal elements are stored far apart for large N. However, the sum

$$c(i) = x(i) + x(i+1) \tag{14.16}$$

has stride 1 because adjacent elements of x are involved. The basic rule in programming for a cache is

  • Keep the stride low, preferably at 1, which in practice means:
      • Vary the leftmost index first on Fortran arrays.
      • Vary the rightmost index first on Java and C arrays.

14.15.1 Exercise 1: Cache Misses

We have said a number of times that your program will be slowed down if the data it needs are in virtual memory and not in RAM. Likewise, your program will also be slowed down if the data required by the CPU are not in the cache. For high-performance computing, you should write programs that keep as much of the data being processed as possible in the cache.
To do this, you should recall that Fortran matrices are stored in successive memory locations with the row index varying most rapidly (column-major order), while Java and C matrices are stored in successive memory locations with the column index varying most rapidly (row-major order). While it is difficult to isolate the effects of the cache from other elements of the computer's architecture, you should now estimate its importance by comparing the time it takes to step through the matrix elements row by row to the time it takes to step through the matrix elements column by column.
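
The row-by-row versus column-by-column comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a listing from the text: it is written in Python for brevity, it stores the matrix as one flat array in row-major order (the layout the text attributes to Java and C), and the names `make_matrix`, `sum_row_by_row`, `sum_column_by_column`, `trace`, and `timed`, as well as the size `n = 1000`, are hypothetical choices made for this example. Because Python's interpreter overhead is large, the measured difference is likely to be smaller than the same loops would show in a compiled language.

```python
import time
from array import array


def make_matrix(n):
    """Return an n x n matrix as one flat, row-major array of doubles.

    Element (i, j) is stored at offset i * n + j (assumed layout).
    """
    return array("d", (float(k) for k in range(n * n)))


def sum_row_by_row(a, n):
    """Inner loop varies the rightmost index j: stride 1 in memory."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += a[i * n + j]
    return total


def sum_column_by_column(a, n):
    """Inner loop varies the leftmost index i: stride n in memory."""
    total = 0.0
    for j in range(n):
        for i in range(n):
            total += a[i * n + j]
    return total


def trace(a, n):
    """Sum the diagonal; consecutive terms are n + 1 elements apart."""
    return sum(a[i * (n + 1)] for i in range(n))


def timed(func, a, n):
    """Run func(a, n) once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(a, n)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    n = 1000  # hypothetical size; increase it until the matrix exceeds the cache
    a = make_matrix(n)
    _, t_row = timed(sum_row_by_row, a, n)
    _, t_col = timed(sum_column_by_column, a, n)
    print(f"row by row:       {t_row:.3f} s")
    print(f"column by column: {t_col:.3f} s")
    print(f"trace stride:     {n + 1} elements")
```

In this layout the row-by-row loop touches consecutive memory locations, the column-by-column loop jumps `n` elements at each step, and the trace jumps `n + 1` elements, which matches the large stride noted for Equation (14.15). Repeating the timing for several values of `n` should give a rough sense of where the matrix stops fitting in the cache on a given machine.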
