Commonly Used Metrics for Performance Analysis - Power.org

Table 10-1 L1 cache data reloads events with their corresponding event groups and brief description
Table 10-2 Events sorted by Group
Table 11-1 Memory location events with their corresponding event groups and brief description
Table 12-1 Address translation events with their corresponding event groups and brief description
Table 12-2 Events sorted by Group
Table 13-1 Events for instruction statistics with their corresponding event groups and brief description
Table 13-2 Events sorted by Group
Table 14-1 Memory location events with their corresponding event groups and brief description
Table 15-1 L2 read-claim machine events with their corresponding event groups and brief description

Copyright ©2011 IBM Corporation
1 Performance Event Data for Application Optimization

First, this paper briefly covers the POWER7 execution pipeline and the PMU hardware. Then it introduces some AIX and Linux tools that can be used to collect hardware events. Finally, the paper discusses several useful sets of metrics.

The first step in optimizing an application is characterizing how well the application runs on a POWER7 system. The fundamental intensive metric used to characterize the performance of any given program/workload is CPI (Cycles Per Instruction), the average number of clock cycles (or fractions of a cycle) needed to complete an instruction. CPI is best understood as a relative quantity. Lower is better, but that assumes that useful work is being done. For a given set of calculations (an execution path), the lower the CPI, the more effectively the processor hardware is being kept busy. Note that the CPI is a measure of processor performance, “How busy is the system hardware?”, which is a narrower question than “Can a program be sped up?”

The CPI stack (also referred to as a “CPI stall analysis”) hierarchically breaks down the CPI based on what the execution pipeline is doing (or not doing) at any given cycle on a per-hardware-thread basis. It is used to answer “What are the main front-end and back-end delays encountered while executing?”

The CPI stack uses data from the PMU (Performance Monitoring Unit) hardware in the POWER7 chip. Focusing on the core (and not the “nest”, the subsystems that transfer data to and from memory), data access accounting is simplified: either the data is found in L1 cache or it’s not (and there is a processing delay). Performance data for disk I/O (and other “slow” hardware interrupts like networking) are excluded.

Many other metrics are also useful in both characterizing how well an application runs on a POWER7 system and how efficiently the application uses the available hardware resources. These include metrics for memory bandwidth, L1 cache instruction and data behavior, branch prediction, data locality, address translation, flushes and read-claim machines. While not an exhaustive list, these metrics do cover several common areas of concern.
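As an illustration of the CPI definition and the idea of a CPI stack, the following minimal Python sketch computes CPI from a cycle count and an instruction count, then splits it into per-category contributions. The category names and counter values are hypothetical placeholders chosen for this example; they are not actual POWER7 PMU event names, and the sketch assumes the per-category cycle counts add up to the total cycle count.

```python
from typing import Dict


def cycles_per_instruction(cycles: int, instructions: int) -> float:
    """Average number of clock cycles needed to complete one instruction."""
    if instructions <= 0:
        raise ValueError("instructions must be a positive count")
    return cycles / instructions


def cpi_stack(component_cycles: Dict[str, int], instructions: int) -> Dict[str, float]:
    """Split CPI into one contribution per cycle category.

    Each contribution is that category's cycles divided by the completed
    instructions, so the contributions sum to the overall CPI when the
    categories partition the total cycle count.
    """
    return {
        name: cycles_per_instruction(cycles, instructions)
        for name, cycles in component_cycles.items()
    }


if __name__ == "__main__":
    # Hypothetical counter values, for illustration only.
    counters = {
        "completion_cycles": 1000,
        "front_end_stall_cycles": 500,
        "back_end_stall_cycles": 1500,
    }
    instructions_completed = 2000

    total_cycles = sum(counters.values())
    print("CPI:", cycles_per_instruction(total_cycles, instructions_completed))
    for name, share in cpi_stack(counters, instructions_completed).items():
        print(f"  {name}: {share}")
```

In this sketch a category with a large contribution indicates where most of the cycles per instruction are spent, which mirrors the question the CPI stack is meant to answer.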