13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual


OPTIMIZING FOR SIMD INTEGER APPLICATIONS

Example 5-22. Color Pixel Format Conversion Using SSE2 (Contd.)

    //repeats for another 3*16 bytes
    …
    add esi, 64
    add edi, 64
    sub ecx, 1
    jnz convert16Pixs

Example 5-23. Color Pixel Format Conversion Using SSSE3

    ; Optimized for SSSE3
    mov esi, src
    mov edi, dest
    mov ecx, iterations
    movdqa xmm0, _shufb
    // xmm0 = [15,12,13,14,11,8,9,10,7,4,5,6,3,0,1,2]
    mov eax, remainder
    convert16Pixs: // 16 pixels, 64 byte per iteration
    movdqa xmm1, [esi]
    // xmm1 = [r3g3b3a3,r2g2b2a2,r1g1b1a1,r0g0b0a0]
    movdqa xmm2, [esi+16]
    pshufb xmm1, xmm0
    // xmm1 = [b3g3r3a3,b2g2r2a2,b1g1r1a1,b0g0r0a0]
    movdqa [edi], xmm1
    //repeats for another 3*16 bytes
    …
    add esi, 64
    add edi, 64
    sub ecx, 1
    jnz convert16Pixs

5.6.5 Endian Conversion

The PSHUFB instruction can also be used to reverse byte ordering within a doubleword. It is more efficient than traditional techniques, such as BSWAP.
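To make the shuffle controls concrete, the following is a minimal sketch in portable C that models the PSHUFB byte-selection rule in scalar code. It is not SIMD code and is not taken from the manual. The helper names (`pshufb_model`, `swap_rb_pixels`, `bswap32x4`) are hypothetical. The `SWAP_RB_CTL` table assumes the `_shufb` constant in Example 5-23 is listed from byte 15 down to byte 0, and `BSWAP32_CTL` is an assumed control pattern derived from the instruction's definition for the doubleword byte reversal described in 5.6.5. On hardware with SSSE3, the same tables could be passed to the `_mm_shuffle_epi8` intrinsic from `<tmmintrin.h>`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of the 128-bit PSHUFB rule: if bit 7 of a control byte is
   set, the result byte is zero; otherwise the low 4 bits of the control
   byte select which source byte is copied. */
static void pshufb_model(uint8_t dst[16], const uint8_t src[16],
                         const uint8_t ctl[16])
{
    uint8_t tmp[16]; /* temporary so that dst may alias src */
    for (int i = 0; i < 16; i++)
        tmp[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x0F];
    memcpy(dst, tmp, 16);
}

/* Control bytes in memory order (byte 0 first).
   Assumption: this is Example 5-23's _shufb constant read low byte first.
   It swaps bytes 0 and 2 of every 4-byte pixel. */
static const uint8_t SWAP_RB_CTL[16] =
    { 2, 1, 0, 3,  6, 5, 4, 7,  10, 9, 8, 11,  14, 13, 12, 15 };

/* Assumption: control pattern for section 5.6.5, reversing the four bytes
   inside each doubleword. */
static const uint8_t BSWAP32_CTL[16] =
    { 3, 2, 1, 0,  7, 6, 5, 4,  11, 10, 9, 8,  15, 14, 13, 12 };

/* Hypothetical helper: swap the first and third byte of each 4-byte pixel,
   four pixels (16 bytes) per step. Leftover pixels are the caller's job,
   mirroring the separate "remainder" value in the manual's listing. */
static void swap_rb_pixels(uint8_t *px, size_t n_pixels)
{
    assert(n_pixels % 4 == 0);
    for (size_t i = 0; i < n_pixels; i += 4)
        pshufb_model(px + 4 * i, px + 4 * i, SWAP_RB_CTL);
}

/* Hypothetical helper: reverse the byte order of four 32-bit words. */
static void bswap32x4(uint32_t words[4])
{
    uint8_t bytes[16];
    memcpy(bytes, words, sizeof bytes);
    pshufb_model(bytes, bytes, BSWAP32_CTL);
    memcpy(words, bytes, sizeof bytes);
}
```

Calling `bswap32x4` on a word holding `0x11223344` leaves `0x44332211` in its place, independent of host byte order, because the model reverses the bytes as they sit in memory. The two tables differ only in which bytes of each doubleword they exchange, which is why one instruction covers both the pixel format conversion and the endian conversion.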
