13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual


OPTIMIZING FOR SIMD INTEGER APPLICATIONS

[Figure 5-9. PSADBW Instruction Example. The source operand (MM/m64) holds bytes X8 through X1 and the destination register (MM) holds bytes Y8 through Y1. Each byte pair is subtracted into a temporary value, T8 through T1. The result register (MM) holds zeros in its upper three words and T1+T2+T3+T4+T5+T6+T7+T8 in its lowest word.]

The subtraction operation presented above is an absolute difference. That is, T = ABS(X-Y). Byte values are stored in temporary space, all values are summed together, and the result is written to the lower word of the destination register.

5.6.10 Packed Average (Byte/Word)

The PAVGB and PAVGW instructions add the unsigned data elements of the source operand to the unsigned data elements of the destination register, along with a carry-in. The results of the addition are then independently shifted to the right by one bit position. The high order bits of each element are filled with the carry bits of the corresponding sum.

The destination operand is an SIMD register. The source operand can either be an SIMD register or a memory operand.

The PAVGB instruction operates on packed unsigned bytes and the PAVGW instruction operates on packed unsigned words.
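The sum of absolute differences and the packed average described above can be illustrated with a scalar reference model in C. This is a minimal sketch for checking expected results, not an optimized implementation and not code from the manual. The function names (avg_u8, avg_u16, pavgb_model, psadbw_model) are hypothetical, the model assumes the 64-bit (8-byte) operand width shown in Figure 5-9, and it takes the carry-in to be 1, so each average rounds up: (a + b + 1) >> 1.

```c
#include <stddef.h>
#include <stdint.h>

/* One PAVGB lane: add both bytes plus a carry-in of 1 in a wider type,
   then shift right by one. The ninth (carry) bit of the sum becomes the
   high-order bit of the 8-bit result. */
static uint8_t avg_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(((unsigned)a + (unsigned)b + 1u) >> 1);
}

/* One PAVGW lane: the same operation on unsigned 16-bit words. */
static uint16_t avg_u16(uint16_t a, uint16_t b) {
    return (uint16_t)(((uint32_t)a + (uint32_t)b + 1u) >> 1);
}

/* Model of PAVGB on an 8-byte operand: dst[i] = avg(dst[i], src[i]).
   Each lane is computed independently of the others. */
static void pavgb_model(uint8_t dst[8], const uint8_t src[8]) {
    for (size_t i = 0; i < 8; i++) {
        dst[i] = avg_u8(dst[i], src[i]);
    }
}

/* Model of PSADBW on 8-byte operands: T[i] = ABS(x[i] - y[i]), and the
   sum of all T[i] is what lands in the low word of the destination.
   The largest possible sum is 8 * 255 = 2040, so 16 bits suffice. */
static uint16_t psadbw_model(const uint8_t x[8], const uint8_t y[8]) {
    unsigned sum = 0;
    for (size_t i = 0; i < 8; i++) {
        int d = (int)x[i] - (int)y[i];
        sum += (unsigned)(d < 0 ? -d : d);
    }
    return (uint16_t)sum;
}
```

For example, avg_u8(1, 2) yields 2 rather than 1 because of the carry-in, and avg_u8(255, 255) yields 255 without overflow because the sum is formed in a wider type before the shift. In real code the corresponding SSE2 intrinsics are _mm_avg_epu8, _mm_avg_epu16, and _mm_sad_epu8 from emmintrin.h; a model like this one is mainly useful for validating their output.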
