13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual


OPTIMIZING FOR SIMD INTEGER APPLICATIONS

values. In order for software to operate correctly, the x87 floating-point stack should be emptied when starting a series of x87 floating-point calculations after operating on the MMX registers.

Using EMMS clears all valid bits, effectively emptying the x87 floating-point stack and making it ready for new x87 floating-point operations. The EMMS instruction ensures a clean transition between using operations on the MMX registers and using operations on the x87 floating-point stack. On the Pentium 4 processor, there is a finite overhead for using the EMMS instruction.

Failure to use the EMMS instruction (or the _MM_EMPTY() intrinsic) between operations on the MMX registers and x87 floating-point registers may lead to unexpected results.

NOTE
Failure to reset the tag word for FP instructions after using an MMX instruction can result in faulty execution or poor performance.

5.2.2 Guidelines for Using EMMS Instruction

When developing code with both x87 floating-point and 64-bit SIMD integer instructions, follow these steps:

1. Always call the EMMS instruction at the end of 64-bit SIMD integer code when the code transitions to x87 floating-point code.
2. Insert the EMMS instruction at the end of all 64-bit SIMD integer code segments to avoid an x87 floating-point stack overflow exception when an x87 floating-point instruction is executed.

When writing an application that uses both floating-point and 64-bit SIMD integer instructions, use the following guidelines to help you determine when to use EMMS:

• If the next instruction is x87 FP: Use _MM_EMPTY() after a 64-bit SIMD integer instruction if the next instruction is an x87 FP instruction; for example, before doing calculations on floats, doubles or long doubles.
• Don't empty when already empty: If the next instruction uses an MMX register, _MM_EMPTY() incurs a cost with no benefit.
• Group instructions: Try to partition regions that use x87 FP instructions from those that use 64-bit SIMD integer instructions. This eliminates the need for an EMMS instruction within the body of a critical loop.
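The guidelines above can be illustrated with a minimal C sketch. It is not taken from the manual. It assumes an x86 compiler that provides <mmintrin.h>, where the intrinsic is spelled _mm_empty(), and the helper names add_i16_mmx and mean_i16 are hypothetical. The integer loop is kept separate from the floating-point code, and a single EMMS is issued once after the loop rather than inside it. The floating-point part uses long double because many x86-64 toolchains compile float and double arithmetic to SSE rather than x87; whether x87 instructions are actually generated depends on the compiler and target.

```c
#include <mmintrin.h>  /* MMX intrinsics: __m64, _mm_add_pi16, _mm_empty */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: dst[i] = a[i] + b[i] (wrapping 16-bit add),
 * four lanes per iteration. Assumes n is a multiple of 4. */
static void add_i16_mmx(int16_t *dst, const int16_t *a,
                        const int16_t *b, size_t n)
{
    for (size_t i = 0; i < n; i += 4) {
        __m64 va, vb, vr;
        memcpy(&va, a + i, sizeof va);   /* load 4 x int16 */
        memcpy(&vb, b + i, sizeof vb);
        vr = _mm_add_pi16(va, vb);       /* packed 16-bit add */
        memcpy(dst + i, &vr, sizeof vr); /* store 4 x int16 */
    }
    /* One EMMS at the end of the 64-bit SIMD integer region,
     * not inside the loop body. */
    _mm_empty();
}

/* Hypothetical helper: floating-point code that runs only after the
 * MMX region has been closed with _mm_empty(). */
static long double mean_i16(const int16_t *v, size_t n)
{
    long double sum = 0.0L;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (long double)n;
}
```

Calling add_i16_mmx first and mean_i16 afterwards follows the "group instructions" guideline: the EMMS cost is paid once per region instead of once per iteration.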
