Non-linear memory layout transformations and data prefetching ...
During a memory reference, a request for specific data can be satisfied from the cache without using main memory when the requested memory address is present in the cache. This situation is known as a cache hit. The opposite situation, when the cache is consulted and found not to contain the desired data, is known as a cache miss. In the latter case, the block of data to which the requested element belongs is usually inserted into the cache, ready for the next access. This block is the smallest unit of data that can be transferred from one memory level to another; it is alternatively called a cache line.

Memory latency and bandwidth are the two key factors that determine the time needed to resolve a cache miss. Memory latency specifies the time between the data request to main memory and the arrival in the cache of the first data element of the requested block. Memory bandwidth specifies the arrival rate of the remaining elements of the requested block. Resolving a cache miss quickly is critical, because the processor has to stall until the requested data arrive from main memory (especially in an in-order execution).

2.2 Cache misses

Cache memories are designed to keep the most recently used pieces of content (either program instructions or data). However, it is not feasible to satisfy all data requests from the cache. In the case of a miss in the instruction cache, the processor stall is resolved only when the requested instruction has been fetched from main memory.
A cache read miss (on a data load instruction) can be less severe, as there may be subsequent instructions that do not depend on the expected data element. Execution continues until the operation that actually needs the loaded data is ready to execute. In practice, however, data is often used immediately after the load instruction. The last case is a cache write miss; it is the least serious kind, because there are usually write buffers that hold the data until they are transferred to main memory or a block is allocated in the cache. The processor can continue until the buffer is full.

In order to lower the cache miss rate, a great deal of analysis has been done on cache behavior in an attempt to find the best combination of size, associativity, block size, and so on. Sequences of memory references performed by benchmark programs are saved as address traces, and subsequent analysis simulates many different possible cache designs on these long traces. Making sense of how the many variables affect the cache hit rate can be quite confusing. One significant contribution to this analysis was made by Mark Hill, who separated misses into three categories (known as the Three Cs):

• Compulsory misses: misses caused by the very first reference to a block of data. They are alternatively called cold-start or first-reference misses. Cache capacity and associativity do not affect the number of compulsory misses that come up by an
