13.07.2015

Intel® 64 and IA-32 Architectures Optimization Reference Manual


GENERAL OPTIMIZATION GUIDELINES

  • MOVSS/SD between registers

Using these instructions creates a dependency chain between the unmodified part of the register and the modified part of the register. This dependency chain can cause performance loss.

Example 3-20 illustrates the use of MOVZX to avoid a partial register stall when packing three byte values into a register.

Follow these recommendations to avoid stalls from partial updates to XMM registers:

  • Avoid using instructions which update only part of the XMM register.
  • If a 64-bit load is needed, use the MOVSD or MOVQ instruction.
  • If two 64-bit loads are required to the same register from non-contiguous locations, use MOVSD/MOVHPD instead of MOVLPD/MOVHPD.
  • When copying the XMM register, use one of the following instructions for a full register copy, even if you only want to copy some of the source register data:
      MOVAPS
      MOVAPD
      MOVDQA

Example 3-20. Avoiding Partial Register Stalls in SIMD Code

Using movlpd for memory transactions and movsd between register copies (causing partial register stall):

    mov     edx, x
    mov     ecx, count
    movlpd  xmm3, _1_
    movlpd  xmm2, _1pt5_
    align 16
    lp:
    movlpd  xmm0, [edx]
    addsd   xmm0, xmm3
    movsd   xmm1, xmm2
    subsd   xmm1, [edx]
    mulsd   xmm0, xmm1
    movsd   [edx], xmm0
    add     edx, 8
    dec     ecx
    jnz     lp

Using movsd for memory and movapd between register copies (avoiding delay):

    mov     edx, x
    mov     ecx, count
    movsd   xmm3, _1_
    movsd   xmm2, _1pt5_
    align 16
    lp:
    movsd   xmm0, [edx]
    addsd   xmm0, xmm3
    movapd  xmm1, xmm2
    subsd   xmm1, [edx]
    mulsd   xmm0, xmm1
    movsd   [edx], xmm0
    add     edx, 8
    dec     ecx
    jnz     lp
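The difference between the two listings in Example 3-20 comes down to which instructions write the whole XMM register and which merge into its old contents. The following is a minimal C sketch of those architectural semantics, not SIMD code and not taken from the manual. The type xmm_t and every function name are hypothetical, chosen only to mirror the instruction mnemonics. The function update_array is a scalar reference for what the loop appears to compute, under the assumption that the constants _1_ and _1pt5_ hold 1.0 and 1.5, which the source does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one XMM register viewed as two 64-bit doubles. */
typedef struct { double lo, hi; } xmm_t;

/* MOVSD xmm, m64: loads the low half and zeroes the high half.
   The whole register is written, so there is no dependency on its old value. */
xmm_t movsd_load(const double *m) { xmm_t r = { *m, 0.0 }; return r; }

/* MOVLPD xmm, m64: loads the low half and keeps the old high half.
   The result depends on the previous register contents (partial update). */
xmm_t movlpd_load(xmm_t old, const double *m) { old.lo = *m; return old; }

/* MOVHPD xmm, m64: loads the high half and keeps the old low half. */
xmm_t movhpd_load(xmm_t old, const double *m) { old.hi = *m; return old; }

/* MOVSD xmm, xmm: copies only the low half; the destination's
   high half is preserved (partial update). */
xmm_t movsd_reg(xmm_t dst, xmm_t src) { dst.lo = src.lo; return dst; }

/* MOVAPD xmm, xmm: copies all 128 bits; the old destination is not an input. */
xmm_t movapd_reg(xmm_t src) { return src; }

/* Scalar reference for the loop body in Example 3-20:
   x[i] = (x[i] + 1.0) * (1.5 - x[i]).
   The values 1.0 and 1.5 are assumptions inferred from the names
   _1_ and _1pt5_. Unlike the assembly loop (dec/jnz), this version
   does nothing when count is 0. */
void update_array(double *x, size_t count) {
    const double one = 1.0, one_pt5 = 1.5;
    assert(x != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        double sum  = x[i] + one;      /* addsd xmm0, xmm3  */
        double diff = one_pt5 - x[i];  /* subsd xmm1, [edx] */
        x[i] = sum * diff;             /* mulsd, then store */
    }
}
```

In this model the functions that take an old or dst argument are the partial updates the recommendations warn about: their result cannot be produced until the previous register value is available. movsd_load and movapd_reg take no such argument, which is why the second listing substitutes them.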
