03.08.2013 Views

Copyright by William Lloyd Bircher 2010 - The Laboratory for ...



When the operating system process scheduler has available slack time, it halts the processor with this instruction. The processor remains in the halted state until it receives an interrupt. Though the interrupt can come from an I/O device, it is typically the periodic OS timer that is used for process scheduling/preemption. Halting has a significant effect on power consumption, reducing processor idle power from ~36 W to 9 W. Because this effect is not reflected in the typical performance metrics, it is accounted for explicitly in the halted cycles counter.
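
The effect of halting on idle power can be illustrated with a minimal sketch. It assumes, purely for illustration, that idle power varies linearly with the fraction of cycles spent halted, between the two idle power levels quoted above; the function name `idle_power_estimate` and the linear form are hypothetical and are not taken from the model developed in this work.

```python
# Idle power levels quoted in the text.
NON_HALTED_IDLE_W = 36.0  # approximate idle power without halting (~36 W)
HALTED_IDLE_W = 9.0       # idle power while the processor is halted (9 W)


def idle_power_estimate(halted_cycles, total_cycles):
    """Estimate average idle power from a halted-cycles count.

    Assumption (illustrative only): power is a linear blend of the two
    idle levels, weighted by the fraction of cycles spent halted.
    """
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    if not 0 <= halted_cycles <= total_cycles:
        raise ValueError("halted_cycles must lie in [0, total_cycles]")
    halted_fraction = halted_cycles / total_cycles
    return (halted_fraction * HALTED_IDLE_W
            + (1.0 - halted_fraction) * NON_HALTED_IDLE_W)


if __name__ == "__main__":
    # Half of the sampled cycles were spent halted.
    print(idle_power_estimate(500_000, 1_000_000))
```

Under this assumption, a sample interval with no halted cycles maps to the ~36 W level and a fully halted interval maps to 9 W, which is why the halted cycles count carries information that the other performance metrics do not.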

Fetched µops – Micro-operations fetched. The micro-operations (µops) metric is used rather than an instruction metric to improve accuracy. Since in the P6 architecture instructions are composed of a varying number of µops, some instruction mixes give a skewed representation of the amount of computation being done. Using µops normalizes the metric to give representative counts independent of instruction mix. Also, by considering fetched rather than retired µops, the metric is more directly related to power consumption. Looking only at retired µops would neglect work done in execution of incorrect branch paths and pipeline flushes.
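
The difference between fetched and retired µops can be made concrete with a small sketch. The counter values, the function name `uop_rates`, and the per-cycle normalization are assumptions chosen for illustration; they are not readings or definitions taken from this work.

```python
def uop_rates(fetched_uops, retired_uops, cycles):
    """Summarize fetched vs. retired micro-op counts for one sample interval.

    Returns per-cycle rates and the fraction of fetched µops that never
    retired (for example, work on incorrect branch paths or flushed from
    the pipeline). All inputs are hypothetical counter readings.
    """
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    if fetched_uops < 0 or retired_uops < 0:
        raise ValueError("counter values must be non-negative")
    # Clamp at zero in case sampling skew makes retired exceed fetched.
    discarded = max(fetched_uops - retired_uops, 0)
    discarded_fraction = discarded / fetched_uops if fetched_uops else 0.0
    return {
        "fetched_per_cycle": fetched_uops / cycles,
        "retired_per_cycle": retired_uops / cycles,
        "discarded_fraction": discarded_fraction,
    }


if __name__ == "__main__":
    # Hypothetical interval: 1200 µops fetched, 1000 retired, 1000 cycles.
    print(uop_rates(1200, 1000, 1000))
```

In this hypothetical interval, a retired-only metric would miss one sixth of the µops that the processor fetched and spent energy on, which is the motivation given above for counting fetched µops.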

L3 Cache Misses – Loads/stores that missed in the Level 3 cache. Most system main memory accesses can be attributed to misses in the highest level cache, in this case L3. Cache misses can also be caused by DMA access to cacheable main memory by I/O devices. The miss occurs because the DMA must be checked for coherency in the processor cache.
