13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual



USING PERFORMANCE MONITORING EVENTS

want to do cycle-by-cycle or type-by-type analysis should be aware that this event is known to be inaccurate for “UC Reads Chunk Underway” and “Write WC partial underway” metrics. Relative changes to the average of all BSQ latencies should be viewed as an indication that overall memory performance has changed. That memory performance change may or may not be reflected in the measured FSB latencies.

For Pentium 4 and Intel Xeon Processor implementations with an integrated 3rd-level cache, BSQ entries are allocated for all 2nd-level writebacks (replaced lines), not just those that become bus accesses (i.e., are also 3rd-level misses). This can decrease the average measured BSQ latencies for workloads that frequently thrash (miss or prefetch a lot into) the 2nd-level cache but hit in the 3rd-level cache. This effect may be less of a factor for workloads that miss all on-chip caches, since all BSQ entries due to such references will become bus transactions.

B.3 PERFORMANCE METRICS AND TAGGING MECHANISMS

A number of metrics require more tags to be specified in addition to programming a counting event. For example, the metric Split Loads Retired requires specifying a split_load_retired tag in addition to programming the replay_event to count at retirement. This section describes three sets of tags that are used in conjunction with three at-retirement counting events: front_end_event, replay_event, and execution_event. Please refer to Appendix A of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B for the description of the at-retirement events.

B.3.1 Tags for replay_event

Table B-8 provides a list of the tags that are used by various metrics in Tables B-1 through B-7. These tags enable you to mark μops at an earlier stage of execution and count the μops at retirement using the replay_event. These tags require at least two MSRs (see Table B-8, column 2 and column 3) to tag the μops so they can be detected at retirement. Some tags require an additional MSR (see Table B-8, column 4) to select the event types for these tagged μops. The event names referenced in column 4 are those from the Pentium 4 processor performance monitoring events (Section B.2).
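The column layout described above (two MSR settings that tag the μops, plus an optional third setting that selects the event type) can be modeled as a small record. The following Python sketch is illustrative only: the names ReplayTag, TAG_MSR_A, TAG_MSR_B, EVENT_SELECT, and example_tag, as well as the bit positions and the event name, are hypothetical placeholders and are not taken from Table B-8. Consult the table for the actual MSRs and bit assignments.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class ReplayTag:
    """One row of a tag table shaped like Table B-8 (placeholder values only)."""
    name: str
    msr_a_bits: Tuple[int, ...]             # column 2: bits set in the first tagging MSR
    msr_b_bits: Tuple[int, ...]             # column 3: bits set in the second tagging MSR
    additional_event: Optional[str] = None  # column 4: extra event selection, if any


def bits_to_mask(bits: Tuple[int, ...]) -> int:
    """Combine individual bit positions into a single register mask."""
    mask = 0
    for bit in bits:
        mask |= 1 << bit
    return mask


def programming_plan(tag: ReplayTag) -> Dict[str, Union[int, str]]:
    """List what must be programmed before replay_event can count this tag.

    The two tagging MSRs are always present; the event selection appears
    only for tags that need it.
    """
    plan: Dict[str, Union[int, str]] = {
        "TAG_MSR_A": bits_to_mask(tag.msr_a_bits),  # hypothetical register name
        "TAG_MSR_B": bits_to_mask(tag.msr_b_bits),  # hypothetical register name
    }
    if tag.additional_event is not None:
        plan["EVENT_SELECT"] = tag.additional_event
    return plan


# Hypothetical rows: one tag that needs only the two MSRs, one that also
# needs an event selection. Bit positions are invented for illustration.
example_tag = ReplayTag("example_tag", msr_a_bits=(0, 24), msr_b_bits=(0,))
example_tag_with_event = ReplayTag(
    "example_tag_with_event",
    msr_a_bits=(2, 24),
    msr_b_bits=(0,),
    additional_event="example_event",
)

if __name__ == "__main__":
    for tag in (example_tag, example_tag_with_event):
        print(tag.name, programming_plan(tag))
```

Keeping the optional third setting as a separate field mirrors the distinction the text draws between the MSRs that every tag needs and the additional MSR that only some tags need.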
