13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual



OPTIMIZING FOR SIMD INTEGER APPLICATIONS

Example 5-4. Zero Extend 16-bit Values into 32 Bits Using Unsigned Unpack Instructions

    ; Input:
    ;    XMM0    8 16-bit values in source
    ;    XMM7    0 (a local variable can be used
    ;            instead of the register XMM7 if
    ;            desired)
    ; Output:
    ;    XMM0    four zero-extended 32-bit
    ;            doublewords from four low-end
    ;            words
    ;    XMM1    four zero-extended 32-bit
    ;            doublewords from four high-end
    ;            words

    movdqa    xmm1, xmm0    ; copy source
    punpcklwd xmm0, xmm7    ; unpack the 4 low-end words
                            ; into 4 32-bit doublewords
    punpckhwd xmm1, xmm7    ; unpack the 4 high-end words
                            ; into 4 32-bit doublewords

5.4.2 Signed Unpack

Signed numbers should be sign-extended when unpacking values. This is similar to the zero-extend shown above, except that the PSRAD instruction (packed shift right arithmetic) is used to sign extend the values.

Example 5-5 assumes the source is a packed-word (16-bit) data type.

Example 5-5. Signed Unpack Code

    ; Input:
    ;    XMM0    source value
    ; Output:
    ;    XMM0    four sign-extended 32-bit doublewords
    ;            from four low-end words
    ;    XMM1    four sign-extended 32-bit doublewords
    ;            from four high-end words
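The two unpack patterns described above can also be sketched in C with SSE2 intrinsics, which makes them easy to check against scalar expectations. This is an illustrative sketch, not code from the manual: the function names `zero_extend_u16x8` and `sign_extend_i16x8` are hypothetical, and unaligned loads and stores are used for convenience. The signed variant assumes one plausible way of combining the unpack with PSRAD: each word is unpacked with itself so that it occupies the upper half of a doubleword, and an arithmetic right shift by 16 then replicates the sign bit.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Zero-extend eight unsigned 16-bit words into two groups of four
 * 32-bit doublewords by interleaving with a zero vector
 * (PUNPCKLWD / PUNPCKHWD against zero, as in Example 5-4). */
static inline void zero_extend_u16x8(const uint16_t src[8],
                                     uint32_t lo[4], uint32_t hi[4])
{
    __m128i v    = _mm_loadu_si128((const __m128i *)src);
    __m128i zero = _mm_setzero_si128();   /* plays the role of XMM7 */

    _mm_storeu_si128((__m128i *)lo, _mm_unpacklo_epi16(v, zero));
    _mm_storeu_si128((__m128i *)hi, _mm_unpackhi_epi16(v, zero));
}

/* Sign-extend eight signed 16-bit words into two groups of four
 * 32-bit doublewords. Assumption: unpack each word with itself so it
 * sits in bits 31..16 of its doubleword, then shift right
 * arithmetically by 16 (PSRAD) to propagate the sign bit. */
static inline void sign_extend_i16x8(const int16_t src[8],
                                     int32_t lo[4], int32_t hi[4])
{
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    __m128i l = _mm_srai_epi32(_mm_unpacklo_epi16(v, v), 16);
    __m128i h = _mm_srai_epi32(_mm_unpackhi_epi16(v, v), 16);

    _mm_storeu_si128((__m128i *)lo, l);
    _mm_storeu_si128((__m128i *)hi, h);
}
```

In both helpers, `lo` receives the doublewords built from the four low-end words and `hi` those built from the four high-end words, matching the XMM0 and XMM1 outputs described in the listings.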
