Copyright by William Lloyd Bircher 2010 - The Laboratory for ...
5.3.3 Performance Event Selection
Performance event selection is critical to the success of performance counter-driven
power models. To identify a minimum set of representative performance events, the
relationship between each event and power consumption must be understood. This
section describes the performance monitoring counter events used to construct the
trickle-down power model. The definition of each counter and the insight behind its
selection are provided.
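
One simple way to examine the relationship between an event and power is to rank candidate events by how strongly their rates correlate with measured power. The following is a minimal sketch of that idea, not the selection procedure used in this work: the use of Pearson correlation, the function names, the event names, and all sample values are assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no information about power
    return cov / sqrt(var_x * var_y)


def rank_events(event_rates, power_watts):
    """Return (event name, correlation) pairs, strongest |correlation| first."""
    scores = [(name, pearson(rates, power_watts))
              for name, rates in event_rates.items()]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Synthetic, made-up samples purely for illustration (not measurements).
samples = {
    "fetched_uops_per_cycle": [0.5, 1.0, 1.5, 2.0, 2.5],
    "fp_uops_retired_per_cycle": [0.0, 0.4, 0.1, 0.6, 0.3],
    "idle_counter": [1.0, 1.0, 1.0, 1.0, 1.0],  # hypothetical constant event
}
power = [20.0, 30.0, 40.0, 50.0, 60.0]

ranking = rank_events(samples, power)
for name, r in ranking:
    print(f"{name}: r = {r:.3f}")
```

A ranking like this is only a first filter. Events that correlate strongly with each other add little independent information, so a minimum representative set would also need a check for redundancy among the selected events.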
Fetched µops – Micro-operations fetched. Comparable to the Pentium IV fetched
micro-operations metric, this event is highly correlated with processor power. It
accounts for the largest portion of core pipeline activity, including speculation. This
is largely the result of fine-grain clock gating: clocks to small portions of the
pipeline are gated when those portions are not in use.
FP µops Retired – Floating point micro-operations retired. FP µops Retired accounts
for the difference in power consumption between floating point and integer instructions.
Assuming equal throughput, floating point instructions have significantly higher average
power. Ideally, the number of fetched FPU µops would be used. Unfortunately, this
metric is not available as a performance counter. This is not a major problem, though,
since the fetched µops metric contains all fetched µops, integer and floating point.
DC Accesses – Level 1 Data Cache Accesses. A proxy for overall cache accesses,
including Level 1, 2, and 3 data and instruction. Considering the majority of workloads, level