13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual

OPTIMIZING FOR SIMD INTEGER APPLICATIONS

5.6.3 Absolute Value

Example 5-20 shows an MMX code sequence to compute |X|, where X is signed. This example assumes signed words to be the operands.

With SSSE3, this sequence of three instructions can be replaced by the PABSW instruction. Additionally, SSSE3 provides a 128-bit version using XMM registers and supports byte, word, and doubleword granularity.

Example 5-20. Computing Absolute Value

    ; Input:
    ;   MM0  signed source operand
    ; Output:
    ;   MM1  ABS(MM0)
    pxor   mm1, mm1   ; set mm1 to all zeros
    psubw  mm1, mm0   ; make each mm1 word contain the
                      ; negative of each mm0 word
    pmaxsw mm1, mm0   ; mm1 will contain only the positive
                      ; (larger) values - the absolute value

NOTE
The absolute value of the most negative number (that is, 8000H for 16-bit) cannot be represented using positive numbers. This algorithm will return the original value for the absolute value (8000H).

5.6.4 Pixel Format Conversion

SSSE3 provides the PSHUFB instruction to carry out byte manipulation within a 16-byte range. PSHUFB can replace a set of up to 12 other instructions, including SHIFT, OR, AND, and MOV.

Use PSHUFB if the alternative code uses 5 or more instructions. Example 5-21 shows the basic form of conversion of color pixel formats.

Example 5-21. Basic C Implementation of RGBA to BGRA Conversion

Standard C Code:

    struct RGBA{BYTE r,g,b,a;};
    struct BGRA{BYTE b,g,r,a;};
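The two pixel layouts differ only in the positions of the red and blue bytes, so the conversion is a fixed byte permutation. The following is a minimal sketch of that idea, not the manual's own listing: it shows a per-pixel scalar loop and a portable C model of a 128-bit PSHUFB that permutes four pixels at once. The function names, the shuffle control table, and the use of uint8_t in place of BYTE are assumptions made for illustration. The model runs as ordinary scalar code and only mimics the byte-selection rule of the instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pixel layouts from the text; uint8_t stands in for BYTE (assumption). */
struct RGBA { uint8_t r, g, b, a; };
struct BGRA { uint8_t b, g, r, a; };

/* The byte-level view below relies on 4-byte pixels with no padding. */
_Static_assert(sizeof(struct RGBA) == 4, "RGBA must be 4 bytes");
_Static_assert(sizeof(struct BGRA) == 4, "BGRA must be 4 bytes");

/* Scalar baseline (hypothetical name): copy each channel by name,
 * one pixel per iteration. */
void rgba_to_bgra_scalar(const struct RGBA *src, struct BGRA *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i].b = src[i].b;
        dst[i].g = src[i].g;
        dst[i].r = src[i].r;
        dst[i].a = src[i].a;
    }
}

/* Portable model of a 128-bit PSHUFB (hypothetical helper):
 * if bit 7 of a control byte is set, the result byte is zero;
 * otherwise the low 4 bits select a source byte. */
void pshufb_model(const uint8_t src[16], const uint8_t mask[16],
                  uint8_t dst[16])
{
    assert(src != dst); /* the model does not support in-place use */
    for (int i = 0; i < 16; i++)
        dst[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0F];
}

/* Control bytes that swap bytes 0 and 2 within each 4-byte pixel. */
static const uint8_t RGBA_TO_BGRA_MASK[16] = {
     2,  1,  0,  3,
     6,  5,  4,  7,
    10,  9,  8, 11,
    14, 13, 12, 15
};

/* Convert exactly four pixels (16 bytes) with a single shuffle. */
void rgba_to_bgra_shuffle4(const struct RGBA src[4], struct BGRA dst[4])
{
    pshufb_model((const uint8_t *)src, RGBA_TO_BGRA_MASK, (uint8_t *)dst);
}
```

Calling rgba_to_bgra_shuffle4 on four pixels should give the same result as rgba_to_bgra_scalar with n set to 4. In an SSSE3 build, the model would presumably be replaced by one PSHUFB (for example through the _mm_shuffle_epi8 intrinsic) using the same 16 control bytes.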
