03.08.2013 Views

Copyright by William Lloyd Bircher 2010 - The Laboratory for ...


of Quad-Core AMD processors, this is the dominant effect. When an active core performs a cache probe of an idle core, latency is increased compared to probing an active core. The performance loss can be significant for memory-bound (cache probe-intensive) workloads. Direct performance effects are due to the current operating frequency of an active core. The direct effect tends to be smaller than the indirect effect, since operating systems are reasonably effective at matching current operating frequency to performance demand. These effects are illustrated in Figure 6.1.

Two extremes of workloads are presented: the compute-bound crafty and the memory-bound equake. For each workload, two cases are presented: fixed and normal scheduling. Fixed scheduling isolates indirect performance loss by eliminating the effect of OS frequency scheduling and thread migration. This is accomplished by forcing the software thread to a particular core for the duration of the experiment. In this case, the thread always runs at the maximum frequency. The idle cores always run at the minimum frequency. As a result, crafty achieves 100 percent of the performance of a processor that does not use dynamic power management. In contrast, the memory-bound equake shows significant performance loss due to the reduced performance of idle cores.
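Forcing a software thread onto one core, as fixed scheduling requires, can be sketched as follows. This is a minimal illustration that assumes a Linux system and Python's `os.sched_setaffinity`; the helper name `pin_current_thread` is hypothetical, and the text does not say which mechanism the original experiments used. The sketch covers only the pinning step. Holding the active core at maximum frequency and the idle cores at minimum frequency would need separate, platform-specific configuration.

```python
import os


def pin_current_thread(core_id=None):
    """Restrict the caller to a single core and return the resulting core set.

    Hypothetical helper for illustration; assumes Linux, where
    os.sched_setaffinity and os.sched_getaffinity are available.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise OSError("CPU affinity control is not available on this platform")

    allowed = os.sched_getaffinity(0)  # cores the caller may currently use
    if core_id is None:
        core_id = min(allowed)  # default: lowest-numbered permitted core
    if core_id not in allowed:
        raise ValueError(f"core {core_id} is not in the allowed set {sorted(allowed)}")

    # With a single-core mask the OS scheduler can no longer migrate the thread.
    os.sched_setaffinity(0, {core_id})
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    print("pinned to cores:", sorted(pin_current_thread()))
```

With the thread pinned, any remaining slowdown cannot come from thread migration, which is the property fixed scheduling relies on to isolate the indirect effect.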

Direct performance loss is shown in the dark solid and light solid lines, which utilize OS scheduling of frequency and threads. Because direct performance losses are caused by suboptimal frequency in active cores, the compute-bound crafty shows a significant performance loss. The memory-bound equake actually shows a performance
