12.07.2015 Views

PGI User's Guide


Local and Global Optimization using -O

Level-one optimization specifies local optimization (-O1). The compiler performs scheduling of basic blocks as well as register allocation. Local optimization is a good choice when the code is very irregular, such as code that contains many short statements containing IF statements and does not contain loops (DO or DO WHILE statements). Although this case rarely occurs, for certain types of code, this optimization level may perform better than level-two (-O2).

The PGI compilers perform many different types of local optimizations, including but not limited to:

- Algebraic identity removal
- Constant folding
- Common subexpression elimination
- Local register optimization
- Peephole optimizations
- Redundant load and store elimination
- Strength reductions

Level-two optimization (-O2 or -O) specifies global optimization. The -fast option generally will specify global optimization; however, the -fast switch varies from release to release, depending on a reasonable selection of switches for any one particular release. The -O or -O2 level performs all level-one local optimizations as well as global optimizations. Control flow analysis is applied and global registers are allocated for all functions and subroutines. Loop regions are given special consideration. This optimization level is a good choice when the program contains loops, the loops are short, and the structure of the code is regular.

The PGI compilers perform many different types of global optimizations, including but not limited to:

- Branch to branch elimination
- Constant propagation
- Copy propagation
- Dead store elimination
- Global register allocation
- Invariant code motion
- Induction variable elimination

You can explicitly select the optimization level on the command line.
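To make a few of the transformations named above concrete, here is a minimal sketch in C that applies them by hand. It is only an illustration of what constant folding, common subexpression elimination, strength reduction, and invariant code motion mean; it is not output from the PGI compilers, and the function names sum_naive and sum_transformed are hypothetical.

```c
/* Naive form: roughly what a programmer might write. */
long sum_naive(const long *a, int n, long x, long y)
{
    long total = 0;
    for (int i = 0; i < n; i++) {
        long scale = 4 * 60;     /* constant expression */
        long base  = x * y + 1;  /* does not depend on i */
        total += a[i] * scale + (x * y) + base + i * 8;
    }
    return total;
}

/* Same computation with the transformations applied by hand. */
long sum_transformed(const long *a, int n, long x, long y)
{
    const long scale = 240;      /* constant folding: 4 * 60 */
    const long xy    = x * y;    /* common subexpression: x * y computed once */
    const long base  = xy + 1;   /* invariant code motion: hoisted out of the loop */
    long total = 0;
    long i8 = 0;                 /* strength reduction: i * 8 becomes a running add */
    for (int i = 0; i < n; i++) {
        total += a[i] * scale + xy + base + i8;
        i8 += 8;
    }
    return total;
}
```

Both functions return the same value for any input; the second simply does less work per loop iteration. An optimizing compiler may perform some or all of these rewrites automatically, depending on the optimization level in effect.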
The following command line, for example, specifies level-two optimization, which results in global optimization:

    $ pgfortran -O2 prog.f

Specifying -O on the command line without a level designation is equivalent to -O2. The default optimization level changes depending on which options you select on the command line. For example, when you select the -g debugging option, the default optimization level is set to level-zero (-O0). However, if you need to debug optimized code, you can use the -gopt option to generate debug information without perturbing optimization. For a description of the default levels, refer to "Default Optimization Levels," on page 44.

As noted previously, the -fast option includes -O2 on all x86 and x64 targets. If you want to override the default for -fast with -O3 while maintaining all other elements of -fast, simply compile as follows:

    $ pgfortran -fast -O3 prog.f

Scalar SSE Code Generation

For all processors prior to Intel Pentium 4 and AMD Opteron/Athlon64, for example Intel Pentium III and AMD AthlonXP/MP processors, scalar floating-point arithmetic as generated by the PGI Workstation compilers is performed using x87 floating-point stack instructions. With the advent of SSE/SSE2 instructions on Intel
