13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual

OPTIMIZING FOR SIMD INTEGER APPLICATIONS

Two signed doublewords are used as source operands and the result is interleaved signed words. The sequence in Example 5-6 can be extended in SSE2 to interleave eight signed words using XMM registers.

[Figure 5-2. Interleaved Pack with Saturation]

Example 5-6. Interleaved Pack with Saturation Code

; Input:
;     MM0     signed source1 value
;     MM1     signed source2 value
; Output:
;     MM0     the first and third words contain the
;             signed-saturated doublewords from MM0,
;             the second and fourth words contain
;             signed-saturated doublewords from MM1
;
packssdw  mm0, mm0    ; pack and sign saturate
packssdw  mm1, mm1    ; pack and sign saturate
punpcklwd mm0, mm1    ; interleave the low-end 16-bit
                      ; values of the operands

Pack instructions always assume that source operands are signed numbers. Whether the result in the destination register is signed or unsigned is determined by the pack instruction that performs the operation. For example, PACKSSDW packs the two signed 32-bit values from each of two sources into four saturated 16-bit signed values in a destination register. PACKUSWB, on the other hand, packs the four signed 16-bit values from each of two sources into eight saturated eight-bit unsigned values in the destination.
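The saturation and interleave behavior described above can be checked against a scalar model. The following is a minimal sketch in portable C, not code from the manual: the helper names (sat_s32_to_s16, sat_s16_to_u8, interleaved_pack_sat) are hypothetical, and the arrays stand in for the 64-bit MMX registers, with index 0 as the least significant element.

```c
#include <assert.h>
#include <stdint.h>

/* Per-element behavior of PACKSSDW: signed 32-bit -> signed 16-bit,
   clamped to [-32768, 32767]. Hypothetical helper name. */
static int16_t sat_s32_to_s16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Per-element behavior of PACKUSWB: the source is still read as a
   signed 16-bit value, but the result is unsigned 8-bit, clamped to
   [0, 255]. Hypothetical helper name. */
static uint8_t sat_s16_to_u8(int16_t v)
{
    if (v > UINT8_MAX) return UINT8_MAX;
    if (v < 0) return 0;
    return (uint8_t)v;
}

/* Scalar model of the three-instruction sequence in Example 5-6.
   a[] models the two doublewords in MM0, b[] those in MM1 (index 0 is
   the low doubleword). out[] models the four words left in MM0:
   words 1 and 3 come from a, words 2 and 4 come from b. */
static void interleaved_pack_sat(const int32_t a[2], const int32_t b[2],
                                 int16_t out[4])
{
    assert(a && b && out);
    for (int i = 0; i < 2; i++) {
        out[2 * i]     = sat_s32_to_s16(a[i]); /* packssdw mm0, mm0 */
        out[2 * i + 1] = sat_s32_to_s16(b[i]); /* packssdw mm1, mm1 */
    }                                          /* punpcklwd mm0, mm1 */
}
```

With a = {70000, -5} and b = {-70000, 123}, this model yields out = {32767, -32768, -5, 123}: the out-of-range doublewords saturate and the in-range ones pass through unchanged. Likewise, sat_s16_to_u8 maps 300 to 255 and -1 to 0, which shows how a signed source can still yield an unsigned result.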
