Lecture Notes in Computer Science 4917


230 M. Goudarzi, T. Ishihara, and H. Noori

Illustrative Example 3: Initializing Unused Cache-Lines. Depending on the cache size and the application, some parts of the instruction cache may never be used during application execution. Fig. 6 shows the histogram of cache-fill operations in the 8KB instruction cache of the M32R processor [13] (a 32-bit RISC processor) when executing an FFT application. In this case, 69 of the 512 16-byte cache-lines are never used. We propose to initialize such unused cache-lines with values that best match the leakage-preference of their SRAM cells. Many processors today (e.g. the ARM10 family [21] and the NEC V830R processor [22]) are equipped with cache-management instructions that can load arbitrary values into any cache location. Using these instructions, the unused cache-lines can be initialized at boot time to reduce their leakage power during the entire application execution. For instance, if cache-line number 490 in Fig. 5 were not used at all by the application, it would be initialized to 00000111 to fully match its leakage-preference. A minimum power-ON duration is required for the leakage energy saved to offset the dynamic energy spent on cache initialization. We consider this in our problem formulation and experiments.
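The break-even condition mentioned above can be illustrated with a minimal sketch: initialization pays off once the power-ON time exceeds the one-time initialization energy divided by the leakage power saved. The function name and all numeric values below are hypothetical and chosen only for illustration; they are not taken from the paper's formulation or measurements.

```python
def break_even_seconds(init_energy_joules: float, leakage_saved_watts: float) -> float:
    """Minimum power-ON time after which cache initialization pays off.

    init_energy_joules:  dynamic energy spent once, at boot, to write the
                         chosen values into the unused cache-lines.
    leakage_saved_watts: reduction in leakage power obtained by storing the
                         preferred values in those lines.
    """
    if leakage_saved_watts <= 0:
        # Nothing is saved, so the initialization cost is never recovered.
        return float("inf")
    return init_energy_joules / leakage_saved_watts


# Hypothetical values, for illustration only (not measured data).
unused_lines = 69                    # count reported for the FFT example
energy_per_line_write = 1e-9         # assumed: 1 nJ per cache-line write
saved_power_per_line = 1e-9          # assumed: 1 nW leakage saved per line

t_min = break_even_seconds(unused_lines * energy_per_line_write,
                           unused_lines * saved_power_per_line)
print(f"break-even power-ON duration: {t_min:.3f} s")
```

If the device is expected to stay powered for less than this duration, skipping the initialization would be the better choice under these assumptions.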

[Figure 6: histograms of the number of cache writes versus cache-set index for way 1 and way 2, with the unused cache-lines marked.]

Fig. 6. Unused cache-lines for FFT application (8KB 2-way cache with 16-byte cache-lines)
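A histogram like the one in Fig. 6 could be obtained by replaying an instruction-fetch address trace through a simple cache model and counting the fills per cache-line. The sketch below assumes the geometry stated in the caption (8KB, 2-way, 16-byte lines, hence 256 sets) and an LRU replacement policy; the replacement policy and the function names are assumptions made for illustration, not details given in the text.

```python
LINE_BYTES = 16
NUM_WAYS = 2
CACHE_BYTES = 8 * 1024
NUM_SETS = CACHE_BYTES // (LINE_BYTES * NUM_WAYS)  # 256 sets


def count_fills(fetch_addresses):
    """Return fills[set][way]: number of cache-fill operations per cache-line.

    LRU replacement is assumed here; a real processor may use another policy.
    """
    fills = [[0] * NUM_WAYS for _ in range(NUM_SETS)]
    tags = [[None] * NUM_WAYS for _ in range(NUM_SETS)]
    # lru[s] lists the ways of set s, least recently used first.
    lru = [list(range(NUM_WAYS)) for _ in range(NUM_SETS)]

    for addr in fetch_addresses:
        block = addr // LINE_BYTES
        index = block % NUM_SETS
        tag = block // NUM_SETS
        if tag in tags[index]:
            way = tags[index].index(tag)      # hit: no fill
        else:
            way = lru[index][0]               # miss: fill the LRU way
            tags[index][way] = tag
            fills[index][way] += 1
        # Mark the touched way as most recently used.
        lru[index].remove(way)
        lru[index].append(way)
    return fills


def unused_lines(fills):
    """List (set, way) pairs that were never filled during the trace."""
    return [(s, w) for s in range(NUM_SETS)
            for w in range(NUM_WAYS) if fills[s][w] == 0]


# Tiny hypothetical trace: two blocks mapping to set 0, one block in set 1.
trace = [0x0000, 0x0004, 0x1000, 0x0010]
fills = count_fills(trace)
print(len(unused_lines(fills)), "of", NUM_SETS * NUM_WAYS, "lines unused")
```

The lines returned by `unused_lines` are the candidates for boot-time initialization with their leakage-preferred values.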

Leakage-Preference Detection. This can be incorporated in the manufacturing test procedure that is applied to each chip after fabrication. Usually, walking-1 and walking-0 test sequences are applied to memory devices [23] to test them for stuck-at and bridging faults. Leakage current can be measured at each step of this test procedure (similar to delta-IDDQ testing [24]) to determine the leakage-preference of cells. This can even be done in-house, since commodity ammeters can easily measure down to 0.1fA [25], while the nominal leakage of a minimum-geometry transistor is 345pA in the 90nm process available to us. For some cells, the difference between the two leakage values may be negligible, but one can still detect the more important cells that cause larger leakage differences. Test time for an 8KB cache, assuming 1MHz current measurements, would be 128ms (measuring leak0 and leak1 for each SRAM cell).
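As a rough illustration of the detection step, the sketch below turns per-cell leak0 and leak1 measurements into a preferred bit pattern and estimates the measurement time. The function names, the `min_diff` threshold, and the sample currents are hypothetical; only the data cells of the cache are counted, and tag or status bits are ignored.

```python
def preferred_bits(leak0, leak1, min_diff=0.0):
    """Derive the leakage-preferred value of each cell.

    leak0[i], leak1[i]: measured leakage current (amperes) with cell i
    storing 0 and 1 respectively. Cells whose difference does not exceed
    min_diff are marked 'x' (no meaningful preference).
    """
    pattern = []
    for l0, l1 in zip(leak0, leak1):
        if abs(l0 - l1) <= min_diff:
            pattern.append("x")
        else:
            pattern.append("0" if l0 < l1 else "1")
    return "".join(pattern)


def measurement_time_seconds(cache_bytes, measurements_per_cell=2, rate_hz=1e6):
    """Time to measure every data cell, one current reading per step."""
    cells = cache_bytes * 8
    return cells * measurements_per_cell / rate_hz


# Hypothetical currents (picoamperes scaled to amperes), illustration only.
leak0 = [1e-12, 1e-12, 1e-12, 1e-12, 1e-12, 3e-12, 3e-12, 3e-12]
leak1 = [2e-12, 2e-12, 2e-12, 2e-12, 2e-12, 1e-12, 1e-12, 1e-12]
print(preferred_bits(leak0, leak1))
print(measurement_time_seconds(8 * 1024))
```

With 8 × 1024 bytes and two readings per cell, this count comes to 131,072 measurements, or about 131ms at 1MHz, which is consistent in magnitude with the 128ms quoted above (presumably 128K measurements with K rounded to 1000).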
