13.07.2015

Intel® 64 and IA-32 Architectures Optimization Reference Manual


GENERAL OPTIMIZATION GUIDELINES

As an additional example, consider the cases in Example 3-33.

Example 3-33. Large and Small Load Stalls

    ; A. Large load stall
    mov mem, eax        ; Store dword to address "MEM"
    mov mem + 4, ebx    ; Store dword to address "MEM + 4"
    fld mem             ; Load qword at address "MEM", stalls

    ; B. Small load stall
    fstp mem            ; Store qword to address "MEM"
    mov bx, mem+2       ; Load word at address "MEM + 2", stalls
    mov cx, mem+4       ; Load word at address "MEM + 4", stalls

In the first case (A), there is a large load after a series of small stores to the same area of memory (beginning at memory address MEM). The large load will stall.

The FLD must wait for the stores to write to memory before it can access all the data it requires. This stall can also occur with other data types (for example, when bytes or words are stored and then words or doublewords are read from the same area of memory).

In the second case (B), there is a series of small loads after a large store to the same area of memory (beginning at memory address MEM). The small loads will stall.

The word loads must wait for the quadword store to write to memory before they can access the data they require. This stall can also occur with other data types (for example, when doublewords or words are stored and then words or bytes are read from the same area of memory). This can be avoided by moving the store as far from the loads as possible.

Store forwarding restrictions for processors based on Intel Core microarchitecture are listed in Table 3-1.

Table 3-1. Store Forwarding Restrictions of Processors Based on Intel Core Microarchitecture

    Store            Width of       Load Alignment       Width of       Store Forwarding
    Alignment        Store (bits)   (byte)               Load (bits)    Restriction
    ---------------  -------------  -------------------  -------------  ----------------
    To Natural size  16             word aligned         8, 16          not stalled
    To Natural size  16             not word aligned     8              stalled
    To Natural size  32             dword aligned        8, 32          not stalled
    To Natural size  32             not dword aligned    8              stalled
    To Natural size  32             word aligned         16             not stalled
    To Natural size  32             not word aligned     16             stalled
    To Natural size  64             qword aligned        8, 16, 64      not stalled
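The rows of Table 3-1 shown above can be encoded as a small lookup, which may help when checking a particular store/load pair by hand. The following C code is only a sketch under stated assumptions and is not part of the manual: the names fwd_result, fwd_row, table_3_1 and lookup_forwarding are hypothetical, "Load Alignment" is read here as the alignment of the load address, and the caller is assumed to have already ensured that the store is aligned to its natural size and that the load reads bytes written by that store. Combinations that do not appear in the rows above are reported as not listed instead of being guessed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: what the rows shown in Table 3-1 say. */
typedef enum { FWD_NOT_STALLED, FWD_STALLED, FWD_NOT_LISTED } fwd_result;

/* One row of the table (store assumed aligned to its natural size). */
typedef struct {
    unsigned   store_bits;      /* width of store in bits               */
    unsigned   align_bytes;     /* 2 = word, 4 = dword, 8 = qword       */
    bool       load_aligned;    /* row applies to aligned loads or not  */
    unsigned   load_widths[3];  /* load widths in bits, 0 = unused slot */
    fwd_result result;
} fwd_row;

/* Only the seven rows visible in the excerpt are encoded. */
static const fwd_row table_3_1[] = {
    {16, 2, true,  {8, 16, 0},  FWD_NOT_STALLED},
    {16, 2, false, {8, 0, 0},   FWD_STALLED},
    {32, 4, true,  {8, 32, 0},  FWD_NOT_STALLED},
    {32, 4, false, {8, 0, 0},   FWD_STALLED},
    {32, 2, true,  {16, 0, 0},  FWD_NOT_STALLED},
    {32, 2, false, {16, 0, 0},  FWD_STALLED},
    {64, 8, true,  {8, 16, 64}, FWD_NOT_STALLED},
};

/* Assumption: load_addr is the address of the load, and the load
 * overlaps a naturally aligned store of store_bits bits. */
fwd_result lookup_forwarding(unsigned store_bits, uintptr_t load_addr,
                             unsigned load_bits)
{
    for (size_t i = 0; i < sizeof table_3_1 / sizeof table_3_1[0]; i++) {
        const fwd_row *r = &table_3_1[i];
        if (r->store_bits != store_bits)
            continue;
        bool aligned = (load_addr % r->align_bytes) == 0;
        if (aligned != r->load_aligned)
            continue;
        for (size_t j = 0; j < 3; j++) {
            if (r->load_widths[j] != 0 && r->load_widths[j] == load_bits)
                return r->result;
        }
    }
    return FWD_NOT_LISTED;  /* combination not covered by the rows above */
}
```

For instance, under these assumptions a 16-bit load at a word-aligned address after a 32-bit store matches the "word aligned, 16" row, while an 8-bit load at an odd address after the same store matches the "not dword aligned, 8" row. A linear scan over a constant array is used because the table is tiny and the row order mirrors the printed table, which makes the encoding easy to check against it.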
