Lecture Notes in Computer Science 4917


M. Goudarzi, T. Ishihara, and H. Noori

mode. We show that this value dependence exists, with increasing significance, in nanoscale SRAM cells and can benefit power saving even outside standby time.

Register renaming is a well-known technique that is often used in high-performance computing to eliminate false dependences among instructions that otherwise could not be executed in parallel. It is usually applied dynamically at runtime, but we apply it statically to avoid runtime overhead. To the best of our knowledge, register renaming has not previously been used for power reduction.
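As a reminder of how renaming removes false dependences in general, the following Python sketch gives every register write a fresh name while sources keep reading the most recent name. It is only an illustration of the classic technique, not the power-oriented static renaming proposed here; the three-operand instruction tuples, the `rename_registers` function, and the unbounded pool of names `p0`, `p1`, ... are assumptions made for the example.

```python
from itertools import count


def rename_registers(instructions):
    """Rename registers so that every write targets a fresh name.

    `instructions` is a list of (dest, src1, src2) register names.
    Hypothetical model: an unlimited supply of names p0, p1, ... is assumed.
    """
    fresh = (f"p{i}" for i in count())
    current = {}  # original register -> most recent renamed register
    renamed = []
    for dest, src1, src2 in instructions:
        sources = []
        for src in (src1, src2):
            # A register read before any write gets a name on first use.
            if src not in current:
                current[src] = next(fresh)
            sources.append(current[src])
        # A fresh destination removes write-after-read and
        # write-after-write (false) dependences on `dest`.
        current[dest] = next(fresh)
        renamed.append((current[dest], sources[0], sources[1]))
    return renamed


program = [
    ("r1", "r2", "r3"),  # reads r2
    ("r2", "r4", "r5"),  # overwrites r2: write-after-read on r2
    ("r1", "r2", "r6"),  # overwrites r1: write-after-write on r1
]
for instruction in rename_registers(program):
    print(instruction)
```

After renaming, the second instruction no longer writes the register that the first one reads, and the first and third instructions write different registers, while the true dependence of the third instruction on the second is preserved.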

Cache initialization, normally done at processor reset, is traditionally limited to resetting all valid bits to indicate that the entire cache is empty. We extend this initialization to store less-leaky values in all cache lines that will not be used by the embedded application. This is similar to cache decay [9] in addressing the leakage power dissipated by unused cache lines, but our technique does not require the circuit-level modification of the cache design that has prevented cache decay from widespread adoption.
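The extended initialization can be pictured with a minimal Python sketch, given here purely as an illustration under stated assumptions. The cache is modeled as a list of byte strings, `used_lines` is a hypothetical set of line indices the application is known to touch, and the all-zero fill pattern is an arbitrary placeholder: which value is actually less leaky depends on the cell and the process.

```python
def initialize_cache(contents, used_lines, fill_byte=0x00):
    """Model of reset-time initialization of a cache.

    contents   -- list of bytes objects, one per cache line (power-up state)
    used_lines -- set of line indices the application will use (assumed known)
    fill_byte  -- placeholder for the less-leaky value; 0x00 is an assumption
    Returns (valid_bits, new_contents).
    """
    # Traditional step: clear every valid bit so the cache reads as empty.
    valid_bits = [False] * len(contents)
    new_contents = []
    for index, line in enumerate(contents):
        if index in used_lines:
            # Lines the application will fill later are left untouched.
            new_contents.append(line)
        else:
            # Extended step: store the less-leaky pattern in unused lines.
            new_contents.append(bytes([fill_byte]) * len(line))
    return valid_bits, new_contents


power_up = [b"\xff\xff\xff\xff"] * 4  # arbitrary power-up contents
valid, lines = initialize_cache(power_up, used_lines={1})
print(valid)
print(lines)
```

In this model only line 1 keeps its power-up contents; the other three lines hold the fill pattern for as long as the application never evicts into them.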

3 Motivation and Our Approach

Leakage is increasing in nanometer-scale technologies, especially in cache memories, which comprise the largest part of processor-based embedded systems. Fig. 1 shows the breakdown of energy consumption of the 8KB instruction cache of the M32R embedded processor [13] running an MPEG2 application. The figure shows that although dynamic energy decreases with every technology node, static (leakage) energy increases such that, unlike in micrometer technologies, the total energy of the cache increases as feature sizes shrink. Thus it is increasingly important to address leakage reduction in cache memories in nanometer technologies.

We focus on Ioff as the primary contributor to leakage in nanometer caches [13]. Fig. 2 shows a 6-transistor SRAM cell storing a logic 1. Only transistors M5, M2, and M1 can leak in this state, while the other three may leak only when the cell stores a 0 (note that the bit-lines are precharged to the supply voltage, VDD). Process variation, especially

[Fig. 1: bar chart of static energy and dynamic energy. Vertical axis: Energy Consumption (uJ), 0 to 350. Horizontal axis: Manufacturing Technology (180nm, 90nm, 65nm, 45nm). Bar labels: 30%, 50%, 54%. Setup: M32R RISC processor, 200MHz; 8KB, 2-way I-cache; application: MPEG2; static and dynamic power obtained from CACTI ver. 4, 5 [15]; miss penalty: 40 clock, 40nJ.]

Fig. 1. Cache energy consumption in various technology nodes
