03.08.2013 Views

Copyright by William Lloyd Bircher 2010 - The Laboratory for ...

Copyright by William Lloyd Bircher 2010 - The Laboratory for ...

Copyright by William Lloyd Bircher 2010 - The Laboratory for ...

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

the number of cores has the same effect. <strong>The</strong>re<strong>for</strong>e, multi-socket systems tend to have a<br />

higher FreqB. Assuming at least one idle core, the per<strong>for</strong>mance loss increases as the ratio<br />

of active-to-idle cores increases. For an N-core processor, the worst-case is N-1 active<br />

cores with 1 idle core. To reduce indirect per<strong>for</strong>mance loss, the system should be<br />

configured to guarantee than the minimum frequency of idle cores is greater than or equal<br />

to FreqB. To this end the Quad-Core AMD processor uses clock ramping in response to<br />

probes [Bk09]. When an idle core receives a probe from an active core, the idle core<br />

frequency is ramped to the last requested p-state frequency. <strong>The</strong>re<strong>for</strong>e, probe response<br />

per<strong>for</strong>mance is dictated <strong>by</strong> the minimum idle core frequency supported <strong>by</strong> the processor.<br />

For the majority of workloads, this configuration yield less than 10 percent per<strong>for</strong>mance<br />

loss due to idle core probe latency.<br />

<strong>The</strong> other factors in indirect per<strong>for</strong>mance loss are due to the operating system interaction<br />

with power management. <strong>The</strong>se factors, which include OS p-state transition and<br />

scheduling characteristics, tend to mask the indirect per<strong>for</strong>mance loss. Ideally, the OS<br />

selects a high frequency p-state <strong>for</strong> active cores and a low frequency <strong>for</strong> idle cores.<br />

However, erratic workloads (many phase transitions) tend to cause high error rates in the<br />

selection of optimal frequency. Scheduling characteristics that favor load-balancing over<br />

processor affinity worsen the problem. Each time the OS moves a process from one core<br />

to another, a new phase transition has effectively been introduced. More details of OS p-<br />

state transitions and scheduling characteristics are given in the next section on direct<br />

per<strong>for</strong>mance effects.<br />

108

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!