13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual

USING PERFORMANCE MONITORING EVENTS

B.1.2 Counting Clocks

The count of cycles (known as clock ticks) forms a fundamental basis for measuring how long a program takes to execute. The count is also used as part of efficiency ratios like cycles-per-instruction (CPI). Some processor clocks may stop ticking under certain circumstances:

• The processor is halted (for example, during I/O). There may be nothing for the CPU to do while servicing a disk read request, and the processor may halt to save power. When HT Technology is enabled, both logical processors must be halted for performance-monitoring-related counters to be powered down.
• The processor is asleep, either as a result of being halted for a while or as part of a power-management scheme. There are different levels of sleep. In the deeper sleep levels, the time-stamp counter stops counting.

Three mechanisms to count processor clock cycles for monitoring performance are:

• Non-Halted Clock Ticks: Clocks when the specified logical processor is neither halted nor in any power-saving state. These can be measured on a per-logical-processor basis when HT Technology is enabled.
• Non-Sleep Clock Ticks: Clocks when the physical processor is not in any of the sleep modes or power-saving states. These cannot be measured on a per-logical-processor basis.
• Time-stamp Counter: Clocks when the physical processor is not in deep sleep. These cannot be measured on a per-logical-processor basis.

The first two metrics use performance counters and can cause an interrupt upon overflow for sampling. They may also be useful for cases where it is easier for a tool to read a performance counter instead of the time-stamp counter. The time-stamp counter is accessed using an RDTSC instruction.

For applications with a significant amount of I/O, there are two ratios of interest:

• Non-Halted CPI: Non-halted clock ticks/instructions retired measures the CPI for the phases where the CPU was being used. This ratio can be measured on a per-logical-processor basis when HT Technology is enabled.
• Nominal CPI: Time-stamp counter ticks/instructions retired measures the CPI over the entire duration of the program, including those periods when the machine is halted while waiting for I/O.

The distinction between the two CPIs is important for processors that support HT Technology. Non-halted CPI should use the "non-halted clock ticks" performance metric in the numerator. Nominal CPI should use "non-sleep clock ticks" in the numerator. "Non-sleep clock ticks" is the same as the "clock ticks" metric in previous editions of this manual.
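
As an illustration of how the two ratios diverge for an I/O-heavy workload, the following is a minimal sketch in Python. The `CounterSample` container, its field names, and the numbers are hypothetical and not part of the manual; the sketch assumes the counter deltas have already been collected over the same interval by some tool. It uses the time-stamp counter delta as the Nominal CPI numerator, following the bullet definition above; on processors that support HT Technology, the text recommends "non-sleep clock ticks" for that numerator instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterSample:
    """Counter deltas over one measurement interval (hypothetical container)."""
    non_halted_ticks: int      # non-halted clock ticks for one logical processor
    tsc_ticks: int             # time-stamp counter delta (e.g. two RDTSC reads)
    instructions_retired: int  # instructions retired in the same interval


def _check(s: CounterSample) -> None:
    # A CPI ratio is undefined when no instructions retired in the interval.
    if s.instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")


def non_halted_cpi(s: CounterSample) -> float:
    """CPI for the phases where the CPU was actually in use."""
    _check(s)
    return s.non_halted_ticks / s.instructions_retired


def nominal_cpi(s: CounterSample) -> float:
    """CPI over the whole interval, including time halted waiting for I/O."""
    _check(s)
    return s.tsc_ticks / s.instructions_retired


# Made-up numbers: the processor was halted for most of the interval.
sample = CounterSample(
    non_halted_ticks=3_000_000,
    tsc_ticks=12_000_000,
    instructions_retired=2_000_000,
)
print(non_halted_cpi(sample))  # 1.5
print(nominal_cpi(sample))     # 6.0
```

With these illustrative values, the large gap between the two ratios reflects time spent halted rather than inefficient execution, which is why both are worth reporting for I/O-bound programs.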
