13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual



OPTIMIZING FOR SIMD INTEGER APPLICATIONS

Example 5-26 shows how to clip signed words to an arbitrary range; the code for clipping unsigned bytes is similar.

Example 5-26. Clipping to a Signed Range of Words [High, Low]

    ; Input:
    ;   MM0   signed source operands
    ; Output:
    ;   MM0   signed words clipped to the signed
    ;         range [high, low]
    pminsw  mm0, packed_high
    pmaxsw  mm0, packed_low

Example 5-27. Clipping to an Arbitrary Signed Range [High, Low]

    ; Input:
    ;   MM0   signed source operands
    ; Output:
    ;   MM0   signed operands clipped to the signed
    ;         range [high, low]
    paddw   mm0, packed_min     ; add with no saturation
                                ; 0x8000 to convert to unsigned
    paddusw mm0, (packed_usmax - high_us)
                                ; in effect this clips to high
    psubusw mm0, (packed_usmax - high_us + low_us)
                                ; in effect this clips to low
    paddw   mm0, packed_low     ; undo the previous two offsets

The code above first converts the values to unsigned numbers and then clips them to an unsigned range. The last instruction converts the data back to signed data and places it within the signed range.

Conversion to unsigned data is required for correct results when (High - Low) < 0x8000. If (High - Low) >= 0x8000, simplify the algorithm as in Example 5-28.
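To see why the four-instruction sequence of Example 5-27 yields a clamp, it can help to model a single 16-bit lane in portable C. The sketch below is an illustration only, not code from the manual: the helper names (`paddw`, `paddusw`, `psubusw`, `clip_word`, `clip_minmax`, `clip_matches_reference`) are hypothetical scalar stand-ins for one lane of the corresponding MMX instructions, and `high_us` / `low_us` are assumed to mean the bounds with the 0x8000 bias added, which is how the listing appears to use them.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar models of ONE 16-bit lane of each MMX instruction (hypothetical helpers). */
static uint16_t paddw(uint16_t a, uint16_t b)            /* wraparound add */
{
    return (uint16_t)(a + b);
}

static uint16_t paddusw(uint16_t a, uint16_t b)          /* unsigned saturating add */
{
    uint32_t s = (uint32_t)a + b;
    return (uint16_t)(s > 0xFFFFu ? 0xFFFFu : s);
}

static uint16_t psubusw(uint16_t a, uint16_t b)          /* unsigned saturating subtract */
{
    return (uint16_t)(a > b ? a - b : 0);
}

/* Reinterpret a 16-bit pattern as a signed word without relying on
   implementation-defined narrowing. */
static int16_t to_signed(uint16_t v)
{
    return (int16_t)(v >= 0x8000u ? (int32_t)v - 65536 : (int32_t)v);
}

/* Lane model of Example 5-27. Assumes low <= high. */
static int16_t clip_word(int16_t x, int16_t low, int16_t high)
{
    assert(low <= high);

    const uint16_t packed_min = 0x8000u;                 /* bias: signed -> unsigned */
    const uint16_t usmax      = 0xFFFFu;                 /* packed_usmax */
    const uint16_t high_us    = paddw((uint16_t)high, packed_min);
    const uint16_t low_us     = paddw((uint16_t)low,  packed_min);

    uint16_t v = paddw((uint16_t)x, packed_min);         /* convert to unsigned order */
    v = paddusw(v, (uint16_t)(usmax - high_us));         /* saturates at the top: clips to high */
    v = psubusw(v, (uint16_t)(usmax - high_us + low_us));/* saturates at zero: clips to low */
    v = paddw(v, (uint16_t)low);                         /* undo the previous two offsets */
    return to_signed(v);
}

/* Lane model of Example 5-26: pminsw followed by pmaxsw. */
static int16_t clip_minmax(int16_t x, int16_t low, int16_t high)
{
    int16_t t = x < high ? x : high;                     /* pminsw mm0, packed_high */
    return t > low ? t : low;                            /* pmaxsw mm0, packed_low  */
}

/* Returns 1 if both models agree for every possible 16-bit input. */
static int clip_matches_reference(int16_t low, int16_t high)
{
    for (int x = INT16_MIN; x <= INT16_MAX; ++x) {
        if (clip_word((int16_t)x, low, high) != clip_minmax((int16_t)x, low, high))
            return 0;
    }
    return 1;
}
```

Calling `clip_matches_reference(low, high)` for a chosen pair of bounds compares the saturating-arithmetic sequence against the min/max form over all 65,536 inputs. After the saturating add and subtract, the lane holds the distance of the clamped value above `low`, which is why the final wraparound add of `low` restores a signed result inside the range.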
