13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual

GENERAL OPTIMIZATION GUIDELINES

transcendental performance with these techniques by choosing the desired numeric precision and the size of the look-up table, and by taking advantage of the parallelism of the SSE and the SSE2 instructions.

3.8.2 Floating-point Modes and Exceptions

When working with floating-point numbers, high-speed microprocessors frequently must deal with situations that need special handling in hardware or code.

3.8.2.1 Floating-point Exceptions

The most frequent cause of performance degradation is the use of masked floating-point exception conditions such as:

• arithmetic overflow
• arithmetic underflow
• denormalized operand

Refer to Chapter 4 of Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 1, for definitions of overflow, underflow and denormal exceptions.

Denormalized floating-point numbers impact performance in two ways:

• directly, when they are used as operands
• indirectly, when they are produced as a result of an underflow situation

If a floating-point application never underflows, the denormals can only come from floating-point constants.

User/Source Coding Rule 19. (H impact, ML generality) Denormalized floating-point constants should be avoided as much as possible.

Denormal and arithmetic underflow exceptions can occur during the execution of x87 instructions or SSE/SSE2/SSE3 instructions. Processors based on Intel NetBurst microarchitecture handle these exceptions more efficiently when executing SSE/SSE2/SSE3 instructions and when speed is more important than complying with the IEEE standard. The following paragraphs give recommendations on how to optimize your code to reduce performance degradations related to floating-point exceptions.

3.8.2.2 Dealing with floating-point exceptions in x87 FPU code

Every special situation listed in Section 3.8.2.1, "Floating-point Exceptions," is costly in terms of performance. For that reason, x87 FPU code should be written to avoid these situations.
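The two sources of denormals described in Section 3.8.2.1 (constants and underflowing results) can be checked for in portable C. The following is a minimal sketch, not taken from the manual: it assumes a C99 compiler with default IEEE 754 behavior (no flush-to-zero mode, no fast-math options), and the helper names is_denormal, product_is_denormal and checked_constant are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: true when x is a denormalized (subnormal) float,
   that is, nonzero but smaller in magnitude than FLT_MIN. */
static bool is_denormal(float x)
{
    return fpclassify(x) == FP_SUBNORMAL;
}

/* Hypothetical helper: true when a * b underflows into the denormal range
   (the "indirect" case). The volatile store forces the product to be
   rounded to a single-precision value before it is classified. */
static bool product_is_denormal(float a, float b)
{
    volatile float r = a * b;
    return is_denormal(r);
}

/* Hypothetical debug-time guard in the spirit of Coding Rule 19:
   aborts (when assertions are enabled) if a constant is denormal,
   otherwise returns the constant unchanged. */
static float checked_constant(float c)
{
    assert(!is_denormal(c));
    return c;
}
```

As a usage example under the same assumptions, is_denormal(1.0e-40f) is true because 1.0e-40 lies below FLT_MIN, and product_is_denormal(FLT_MIN, 0.5f) is true because halving the smallest normalized float underflows into the denormal range. Wrapping suspect constants in checked_constant during debug builds is one possible way to catch denormal constants early; it is a design choice of this sketch, not a recommendation made by the manual.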
