13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual


POWER OPTIMIZATION FOR MOBILE USAGES

10.4.7.1 Enhanced Intel SpeedStep® Technology

Using domain decomposition, a single-threaded application can be transformed to take advantage of multicore processors. A transformation into two domain threads means that each thread will execute roughly half of the original number of instructions. Dual-core architecture enables running two threads simultaneously, each thread using dedicated resources in the processor core. In an application that is targeted for mobile usages, this reduction in per-thread instruction count enables the physical processor to operate at a lower frequency relative to a single-threaded version. This in turn enables the processor to operate at a lower voltage, saving battery life.

Note that the OS views each logical processor or core in a physical processor as a separate entity and computes CPU utilization independently for each logical processor or core. On demand, the OS will choose to run the physical package at the highest frequency requested for any of its cores. As a result, a physical processor with two cores will often work at a higher frequency than it needs to satisfy the target QoS.

For example, if one thread requires 60% of single-threaded execution cycles and the other thread requires 40% of the cycles, the OS power management may direct the physical processor to run at 60% of its maximum frequency.

However, it may be possible to divide work equally between threads so that each of them requires 50% of execution cycles. As a result, both cores should be able to operate at 50% of the maximum frequency (as opposed to 60%). This will allow the physical processor to work at a lower voltage, saving power.

So, while planning and tuning your application, make threads as symmetric as possible in order to operate at the lowest possible frequency-voltage point.

10.4.7.2 Thread Migration Considerations

The interaction of OS scheduling with a multicore-unaware power management policy may create performance anomalies for multithreaded applications. The problem can arise for multithreaded applications that allow threads to migrate freely.

When one full-speed thread is migrated from one core to another core that has idled for a period of time, an OS without a multicore-aware P-state coordination policy may mistakenly decide that each core demands only 50% of processor resources (based on idle history). The processor frequency may be reduced by such multicore-unaware P-state coordination, resulting in a performance anomaly. See Figure 10-5.
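
The arithmetic behind both sections can be sketched in a few lines of Python. This is a simplified model, not a description of any real OS policy: it assumes the package frequency follows the highest per-core demand, and it assumes a multicore-aware estimate can be approximated by summing busy time across cores (capped at one full core for a single thread). The names `package_frequency_fraction` and `per_core_utilization` are hypothetical helpers introduced only for this illustration.

```python
from typing import List, Sequence


def package_frequency_fraction(core_demands: Sequence[float]) -> float:
    """Fraction of maximum frequency for the whole package.

    Assumption: all cores in the package share one frequency, and the
    OS picks the highest demand seen on any single core.
    """
    return max(core_demands)


def per_core_utilization(schedule: Sequence[int], n_cores: int) -> List[float]:
    """Per-core busy fraction for ONE full-speed thread.

    schedule[i] is the index of the core that ran the thread during
    sampling interval i (the thread is never idle in this model).
    """
    busy = [0] * n_cores
    for core in schedule:
        busy[core] += 1
    return [b / len(schedule) for b in busy]


# Section 10.4.7.1: asymmetric versus symmetric work split.
asymmetric = package_frequency_fraction([0.60, 0.40])  # driven by the 60% thread
symmetric = package_frequency_fraction([0.50, 0.50])   # both cores need only 50%

# Section 10.4.7.2: one full-speed thread that hops between two cores.
schedule = [0, 1, 0, 1, 0, 1, 0, 1]
utilization = per_core_utilization(schedule, n_cores=2)

# Multicore-unaware view: each core looks half idle, so the policy
# concludes that 50% of maximum frequency is enough.
unaware_estimate = package_frequency_fraction(utilization)

# Hypothetical multicore-aware view: total busy time across cores shows
# that the thread actually needs one core at full speed.
aware_estimate = min(1.0, sum(utilization))
```

In this model the symmetric split lowers the package operating point from 0.60 to 0.50 of maximum frequency, while the migrating thread makes a per-core policy settle on 0.50 even though the thread's real demand is 1.0, which is the anomaly the text describes.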
