15.01.2013 Views

U. Glaeser


used in reporting maximum and average power ratings for future processors. It should be possible in the future for customers to compare power-performance efficiencies across competing products in a given market segment, i.e., for a given benchmark suite.

7.3 A Review of Key Ideas in Power-Aware Microarchitectures

In this chapter, we limit our attention to dynamic (“switching”) power governed by the $CV^2af$ formula. Recall that C refers to the switching capacitance, V is the supply voltage, a is the activity factor (0 < a < 1), and f is the operating clock frequency. Power reduction ideas must, therefore, focus on one or more of these basic parameters. In this section, we examine the key ideas that have been proposed in terms of (micro)architectural support for power-efficiency.
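As a quick illustration of how the four parameters combine, the following minimal Python sketch evaluates the formula exactly as stated above. The function name `dynamic_power` and the operating-point values are assumptions chosen only for illustration; they are not taken from any processor discussed in this chapter.

```python
def dynamic_power(c_farads, v_volts, activity, f_hertz):
    """Dynamic (switching) power in watts, computed as C * V^2 * a * f."""
    if not 0.0 < activity < 1.0:
        raise ValueError("activity factor must satisfy 0 < a < 1")
    return c_farads * v_volts ** 2 * activity * f_hertz


# Hypothetical operating point (illustrative values only).
baseline = dynamic_power(c_farads=1e-9, v_volts=1.5, activity=0.2, f_hertz=1e9)

# Halving the activity factor halves the dynamic power,
# because power is linear in a (and in C and f).
half_activity = dynamic_power(c_farads=1e-9, v_volts=1.5, activity=0.1, f_hertz=1e9)

print(baseline, half_activity)
```

Power is linear in C, a, and f but quadratic in V, which is why the ideas that follow are grouped by the parameter they target.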

The effective (average) value of C can be reduced by: (a) using area-efficient designs for various macros; (b) using adaptive structures that change in effective size, latency, or communication bandwidth depending on the needs of the input workload; (c) selectively “powering off” unused or idle units, based on special “nap/doze” and “sleep” instructions generated by the compiler or detected via hardware mechanisms; (d) reducing or eliminating “speculative waste” resulting from executing instructions in mis-speculated branch paths or prefetching useless instructions and data into caches, based on wrong guesses.
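To make item (c) concrete, the sketch below treats the effective capacitance as a time-weighted sum over units, where each unit contributes its capacitance only for the fraction of cycles in which it is powered on. This averaging model, the function name `effective_capacitance`, and the unit names and values are simplifying assumptions for illustration, not a model given in the text.

```python
def effective_capacitance(units):
    """Time-averaged switching capacitance in farads.

    `units` maps a unit name to a pair:
    (capacitance_farads, fraction_of_cycles_powered_on).
    """
    return sum(c * on_fraction for c, on_fraction in units.values())


# Hypothetical two-unit machine (illustrative values only).
always_on = {"fpu": (2e-10, 1.0), "int": (3e-10, 1.0)}

# Same machine, but the floating-point unit is powered off 75% of the time.
gated = {"fpu": (2e-10, 0.25), "int": (3e-10, 1.0)}

print(effective_capacitance(always_on), effective_capacitance(gated))
```

Under this assumed model, powering off the rarely used unit lowers the average C without changing V, a, or f.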

The average value of V can be reduced via dynamic voltage scaling, i.e., by reducing the voltage as and when required or possible (e.g., see the description of the Transmeta chip: http://www.transmeta.com). Microarchitectural support is not required in this case, unless the mechanisms that detect “idle” periods or temperature overruns are implemented using counter-based “proxies,” specially architected for this purpose.
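Because V enters the dynamic power formula squared, even a modest voltage reduction has a large effect. The sketch below computes power relative to a baseline when V and f are scaled while C and a are held fixed. The function name `relative_power` and the scale factors are illustrative assumptions; lowering f together with V is a common pairing assumed here, not something the text prescribes.

```python
def relative_power(v_scale, f_scale=1.0):
    """Dynamic power relative to baseline when V and f are scaled.

    C and a are assumed unchanged, so the ratio is v_scale^2 * f_scale.
    """
    return v_scale ** 2 * f_scale


# Reduce the supply voltage by 20% at an unchanged clock frequency.
voltage_only = relative_power(0.8)

# Assumption: the clock is slowed by the same 20% along with the voltage.
voltage_and_frequency = relative_power(0.8, 0.8)

print(voltage_only, voltage_and_frequency)
```

With these hypothetical factors, the voltage reduction alone leaves roughly 64% of the baseline dynamic power, and scaling frequency as well leaves roughly 51%.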

© 2002 by CRC Press LLC

TABLE 7.1 Rank Ordering Based on SPECint95 and Alternate Performance-Power Efficiency Metrics

Rank  SPECint/watt             SPECint^2/watt     SPECint^3/watt
1     Moto PPC7400 (450 MHz)   Intel PIII-1000    Intel PIII-1000
2     Intel Celeron (333 MHz)  Moto PPC7400-450   AMD Athlon-1000
3     Intel PIII (1000 MHz)    AMD Athlon-1000    HP-PA8600-552
4     MIPS R12000 (300 MHz)    HP-PA8600-552      Moto PPC7400-450
5     Sun USII (450 MHz)       Intel Celeron-333  Alpha 21264-700
6     AMD Athlon (1000 MHz)    Alpha 21264-700    IBM Power3-450
7     IBM Power3 (450 MHz)     MIPS R12000-300    MIPS R12000-300
8     HP-PA8600 (552 MHz)      IBM Power3-450     Intel Celeron-333
9     Alpha 21264 (700 MHz)    Sun USII-450       Sun USII-450
10    Hal Sparc64-III          Hal Sparc64-III    Hal Sparc64-III

TABLE 7.2 Rank Ordering Based on SPECfp95 and Alternate Performance-Power Efficiency Metrics

Rank  SPECfp/watt        SPECfp^2/watt      SPECfp^3/watt
1     Moto PPC7400-450   HP-PA8600-552      HP-PA8600-552
2     MIPS R12000-300    IBM Power3-450     IBM Power3-450
3     IBM Power3-450     MIPS R12000-300    Alpha 21264-700
4     Intel Celeron-333  Alpha 21264-700    MIPS R12000-300
5     Sun USII-450       Intel PIII-1000    Intel PIII-1000
6     HP-PA8600-552      Moto PPC7400-450   Sun USII-450
7     Intel PIII-1000    Sun USII-450       Moto PPC7400-450
8     Alpha 21264-700    Hal Sparc64-III    AMD Athlon-1000
9     Hal Sparc64-III    AMD Athlon-1000    Hal Sparc64-III
10    AMD Athlon-1000    Intel Celeron-333  Intel Celeron-333
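
The rank changes across the columns of Tables 7.1 and 7.2 follow from how strongly each metric weights performance relative to power. The minimal Python sketch below ranks entries by performance^n/watt for a chosen exponent n. The function name `rank_by_efficiency` and the two entries are hypothetical and do not correspond to measured scores or power ratings of any processor in the tables.

```python
def rank_by_efficiency(chips, exponent):
    """Return chip names sorted best-first by score**exponent / watts.

    `chips` maps a name to a pair: (benchmark_score, power_watts).
    """
    def metric(name):
        score, watts = chips[name]
        return score ** exponent / watts

    return sorted(chips, key=metric, reverse=True)


# Hypothetical data (illustrative values only, not from the tables).
chips = {
    "chip_a": (10.0, 5.0),   # low performance, low power
    "chip_b": (40.0, 40.0),  # high performance, high power
}

for n in (1, 2, 3):
    print(n, rank_by_efficiency(chips, n))
```

With these made-up numbers, the low-power entry ranks first for n = 1, while the high-performance entry ranks first for n = 2 and n = 3, which mirrors the kind of reordering visible between the first and later columns of the tables.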
