13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual

INTEL® 64 AND IA-32 PROCESSOR ARCHITECTURES

Store buffers improve performance by allowing the processor to continue executing instructions without having to wait until a write to memory and/or cache is complete. Writes are generally not on the critical path for dependence chains, so it is often beneficial to delay writes for more efficient use of memory-access bus cycles.

2.2.4.6 Store Forwarding

Loads can be moved before stores that occurred earlier in the program if they are not predicted to load from the same linear address. If they do read from the same linear address, they have to wait for the store data to become available. However, with store forwarding, they do not have to wait for the store to write to the memory hierarchy and retire. The data from the store can be forwarded directly to the load, as long as the following conditions are met:

• Sequence: Data to be forwarded to the load has been generated by a programmatically earlier store which has already executed.
• Size: Bytes loaded must be a subset of the bytes stored (the subset may be the full set, that is, the same bytes).
• Alignment: The store cannot wrap around a cache line boundary, and the linear address of the load must be the same as that of the store.

2.3 INTEL® PENTIUM® M PROCESSOR MICROARCHITECTURE

Like the Intel NetBurst microarchitecture, the pipeline of the Intel Pentium M processor microarchitecture contains three sections:

• in-order issue front end
• out-of-order superscalar execution core
• in-order retirement unit

Intel Pentium M processor microarchitecture supports a high-speed system bus (up to 533 MHz) with 64-byte line size. Most coding recommendations that apply to the Intel NetBurst microarchitecture also apply to the Intel Pentium M processor.

The Intel Pentium M processor microarchitecture is designed for lower power consumption. There are other specific areas of the Pentium M processor microarchitecture that differ from the Intel NetBurst microarchitecture. They are described next. A block diagram of the Intel Pentium M processor is shown in Figure 2-6.
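
As a rough illustration of the store-forwarding conditions listed in Section 2.2.4.6 (sequence, size, alignment), the following C sketch expresses them as a simple predicate. It is a software model of the rules as worded above, not a description of how the hardware implements them. The names `mem_access`, `crosses_cache_line`, and `can_forward` are hypothetical, and the 64-byte value of `CACHE_LINE_BYTES` is an assumption that may need adjusting for a given processor.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size in bytes; adjust for the target processor. */
#define CACHE_LINE_BYTES 64u

/* Hypothetical record of one memory access (not a hardware structure). */
typedef struct {
    uintptr_t addr;  /* linear address of the first byte accessed */
    size_t    size;  /* number of bytes accessed */
} mem_access;

/* Returns true if the access touches more than one cache line. */
static bool crosses_cache_line(mem_access a)
{
    if (a.size == 0)
        return false;
    return (a.addr / CACHE_LINE_BYTES) !=
           ((a.addr + a.size - 1) / CACHE_LINE_BYTES);
}

/*
 * Models the three conditions as stated in the text:
 *   Sequence:  the earlier store has already executed.
 *   Alignment: the store does not wrap around a cache line boundary,
 *              and the load starts at the same linear address.
 *   Size:      the loaded bytes are a subset of the stored bytes; with
 *              equal start addresses this reduces to
 *              load.size <= store.size.
 */
static bool can_forward(mem_access store, mem_access load,
                        bool store_executed)
{
    if (!store_executed)
        return false;                       /* Sequence */
    if (crosses_cache_line(store))
        return false;                       /* Alignment: store wraps */
    if (load.addr != store.addr)
        return false;                       /* Alignment: same address */
    return load.size > 0 && load.size <= store.size;  /* Size */
}
```

Under this model, an 8-byte store followed by a 4-byte load from the same address satisfies all three conditions, while a 4-byte load from the upper half of that store does not, because its linear address differs from that of the store.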
