13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual



USING PERFORMANCE MONITORING EVENTS

In the event descriptions in Table B-1, the term “bogus” refers to instructions or micro-ops that must be cancelled because they are on a path taken from a mispredicted branch. The terms “retired” and “non-bogus” refer to instructions or μops along the path that results in committed architectural state changes as required by the program execution. Instructions and μops are either bogus or non-bogus, but not both.

B.1.1.2 Bus Ratio

Bus Ratio is the ratio of the processor clock to the bus clock. In the Bus Utilization metric, it is the bus_ratio.

B.1.1.3 Replay

In order to maximize performance for the common case, the Intel NetBurst microarchitecture sometimes aggressively schedules μops for execution before all the conditions for correct execution are guaranteed to be satisfied. If these conditions are not all satisfied, the μops must be re-issued. This mechanism is called replay.

Some occurrences of replays are caused by cache misses, dependence violations (for example, store forwarding problems), and unforeseen resource constraints. In normal operation, some number of replays is common and unavoidable. An excessive number of replays indicates a performance problem.

B.1.1.4 Assist

When the hardware needs the assistance of microcode to deal with some event, the machine takes an assist. One example of such a situation is an underflow condition in the input operands of a floating-point operation. The hardware must internally modify the format of the operands in order to perform the computation. Assists clear the entire machine of μops before they begin to accumulate, and are costly. The assist mechanism on the Pentium 4 processor is similar in principle to that on the Pentium II processors, which also have an assist event.

B.1.1.5 Tagging

Tagging is a means of marking μops to be counted at retirement. See Appendix A of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B, for the description of tagging mechanisms.

The same event can happen more than once per μop. The tagging mechanisms allow a μop to be tagged once during its lifetime. The retired suffix is used for metrics that increment a count once per μop, rather than once per event. For example, a μop may encounter a cache miss more than once during its lifetime, but the misses retired metric (for example, 1st-Level Cache Misses Retired) will increment only once for that μop.
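
The difference between a per-event count and a "retired" (per-μop) count can be illustrated with a small software model. This is only a sketch of the counting semantics described above, not of the hardware mechanism: the `Uop` class, its fields, and both counting functions are hypothetical names introduced for illustration. The model also assumes that a bogus μop, being cancelled, never reaches retirement and so never contributes to a retired count.

```python
from dataclasses import dataclass


@dataclass
class Uop:
    """Hypothetical software model of a μop (not a hardware structure)."""
    name: str
    miss_events: int     # times this μop encountered a cache miss
    bogus: bool = False  # True if on a mispredicted path (assumed never to retire)


def count_miss_events(uops):
    """Per-event count: every miss occurrence increments the counter."""
    return sum(u.miss_events for u in uops)


def count_misses_retired(uops):
    """Per-μop count: a μop is tagged at most once and counted at retirement."""
    count = 0
    for u in uops:
        tagged = u.miss_events > 0  # tag is set by the first miss; repeats add nothing
        if tagged and not u.bogus:  # only non-bogus μops retire in this model
            count += 1
    return count


# Illustrative data only; the values are made up for this example.
uops = [
    Uop("load_a", miss_events=3),
    Uop("load_b", miss_events=1),
    Uop("add", miss_events=0),
    Uop("load_c", miss_events=2, bogus=True),
]

events = count_miss_events(uops)      # 3 + 1 + 0 + 2 = 6
retired = count_misses_retired(uops)  # load_a and load_b only = 2
```

In this model, `load_a` misses three times but contributes exactly one to the retired count, which mirrors the 1st-Level Cache Misses Retired example in the text.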
