13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual



OPTIMIZING FOR SIMD INTEGER APPLICATIONS

5.4.10 Packed Shuffle Word for 128-bit Registers

The PSHUFLW/PSHUFHW instruction performs a full shuffle of any source word field within the low/high 64 bits to any result word field in the low/high 64 bits, using an 8-bit immediate operand; other high/low 64 bits are passed through from the source operand.

PSHUFD performs a full shuffle of any double-word field within the 128-bit source to any double-word field in the 128-bit result, using an 8-bit immediate operand.

No more than 3 instructions, using PSHUFLW/PSHUFHW/PSHUFD, are required to implement many common data shuffling operations. Broadcast, Swap, and Reverse are illustrated in Example 5-14, Example 5-15, and Example 5-16.

Example 5-14. Broadcast Code, Using 2 Instructions

/* Goal: Broadcast the value from word 5 to all words */
/* Instruction          Result */
                        | 7| 6| 5| 4| 3| 2| 1| 0|
    PSHUFHW (3,2,1,1)   | 7| 6| 5| 5| 3| 2| 1| 0|
    PSHUFD  (2,2,2,2)   | 5| 5| 5| 5| 5| 5| 5| 5|

Example 5-15. Swap Code, Using 3 Instructions

/* Goal: Swap the values in word 6 and word 1 */
/* Instruction          Result */
                        | 7| 6| 5| 4| 3| 2| 1| 0|
    PSHUFD  (3,0,1,2)   | 7| 6| 1| 0| 3| 2| 5| 4|
    PSHUFHW (3,1,2,0)   | 7| 1| 6| 0| 3| 2| 5| 4|
    PSHUFD  (3,0,1,2)   | 7| 1| 5| 4| 3| 2| 6| 0|
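The broadcast and swap sequences of Example 5-14 and Example 5-15 can also be sketched in C with SSE2 intrinsics. This is a minimal illustration, not code from the manual. The function names (broadcast_word5, swap_word6_word1, and the _u16 wrappers) are hypothetical. It assumes a compiler that provides <emmintrin.h> and a target with SSE2 enabled (the default on x86-64). The _MM_SHUFFLE(a,b,c,d) macro builds the 8-bit immediate in the same high-to-low field order as the tables in those examples.

```c
#include <emmintrin.h>  /* SSE2: _mm_shufflehi_epi16, _mm_shuffle_epi32 */
#include <stdint.h>

/* Hypothetical helper: broadcast word 5 to all eight words (Example 5-14). */
static __m128i broadcast_word5(__m128i v)
{
    /* PSHUFHW (3,2,1,1): word 5 is copied into word 4; low 64 bits pass through. */
    v = _mm_shufflehi_epi16(v, _MM_SHUFFLE(3, 2, 1, 1));
    /* PSHUFD (2,2,2,2): doubleword 2 (words 5 and 4) is replicated to all four doublewords. */
    return _mm_shuffle_epi32(v, _MM_SHUFFLE(2, 2, 2, 2));
}

/* Hypothetical helper: swap word 6 and word 1 (Example 5-15). */
static __m128i swap_word6_word1(__m128i v)
{
    /* PSHUFD (3,0,1,2): exchange doublewords 2 and 0, so words 1 and 0 move to the high half. */
    v = _mm_shuffle_epi32(v, _MM_SHUFFLE(3, 0, 1, 2));
    /* PSHUFHW (3,1,2,0): exchange the two middle words of the high half. */
    v = _mm_shufflehi_epi16(v, _MM_SHUFFLE(3, 1, 2, 0));
    /* PSHUFD (3,0,1,2): exchange doublewords 2 and 0 again to restore the layout. */
    return _mm_shuffle_epi32(v, _MM_SHUFFLE(3, 0, 1, 2));
}

/* Array wrappers: element i of the array is word i of the register. */
void broadcast_word5_u16(const uint16_t in[8], uint16_t out[8])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, broadcast_word5(v));
}

void swap_word6_word1_u16(const uint16_t in[8], uint16_t out[8])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, swap_word6_word1(v));
}
```

Calling either wrapper on an array whose element i holds the value i reproduces the last row of the corresponding example, read from word 0 upward. Unaligned loads and stores are used so that the input arrays need no special alignment.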
