Intel® 64 and IA-32 Architectures Optimization Reference Manual


GENERAL OPTIMIZATION GUIDELINES

The following rules help satisfy size and alignment restrictions for store forwarding:

Assembly/Compiler Coding Rule 47. (H impact, M generality) A load that forwards from a store must have the same address start point and therefore the same alignment as the store data.

Assembly/Compiler Coding Rule 48. (H impact, M generality) The data of a load which is forwarded from a store must be completely contained within the store data.

A load that forwards from a store must wait for the store's data to be written to the store buffer before proceeding, but other, unrelated loads need not wait.

Assembly/Compiler Coding Rule 49. (H impact, ML generality) If it is necessary to extract a non-aligned portion of stored data, read out the smallest aligned portion that completely contains the data and shift/mask the data as necessary. This is better than incurring the penalties of a failed store-forward.

Assembly/Compiler Coding Rule 50. (MH impact, ML generality) Avoid several small loads after large stores to the same area of memory by using a single large read and register copies as needed.

Example 3-29 depicts several store-forwarding situations in which small loads follow large stores. The first three load operations illustrate the situations described in Rule 50. However, the last load operation gets data from store-forwarding without problem.

Example 3-29. Situations Showing Small Loads After Large Store

    mov [EBP], 'abcd'
    mov AL, [EBP]        ; Not blocked - same alignment
    mov BL, [EBP + 1]    ; Blocked
    mov CL, [EBP + 2]    ; Blocked
    mov DL, [EBP + 3]    ; Blocked
    mov AL, [EBP]        ; Not blocked - same alignment
                         ; n.b. passes older blocked loads

Example 3-30 illustrates a store-forwarding situation in which a large load follows several small stores. The data needed by the load operation cannot be forwarded because not all of the data that needs to be forwarded is contained in the store buffer. Avoid large loads after small stores to the same area of memory.
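
The approach recommended by Rules 49 and 50 can be sketched in C: after a 32-bit store, read the data back with a single load of the same size and address, then obtain the individual bytes with shifts in registers instead of issuing four byte loads. This is an illustrative sketch only, not code from the manual. The helper names `load_u32`, `byte_of`, and `store_then_split` are hypothetical, and the instructions actually emitted depend on the compiler and its options. An optimizing compiler may keep the value in a register and omit the load entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 32 bits from p in one access.
 * memcpy of a fixed 4-byte size is commonly compiled to a single
 * 32-bit load, but that is a compiler decision, not a guarantee. */
static uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: extract byte `index` (0 = least significant)
 * from a value that is already in a register, using a shift. */
static uint8_t byte_of(uint32_t v, unsigned index)
{
    assert(index < 4);
    return (uint8_t)(v >> (8u * index));
}

/* Store a dword, read it back once at the same address and size
 * (the case Rule 47 and Rule 48 allow to forward), then split it
 * without touching memory again. */
static void store_then_split(uint32_t *slot, uint32_t value, uint8_t out[4])
{
    *slot = value;                    /* large store */
    uint32_t v = load_u32(slot);      /* one large read */
    for (unsigned i = 0; i < 4; ++i)
        out[i] = byte_of(v, i);       /* register-only extraction */
}
```

Calling `store_then_split(&slot, 0x64636261u, out)` leaves the four bytes of the value in `out`, least significant byte first. Because the extraction works on the loaded value rather than on memory addresses, the result does not depend on byte order.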
