
Compiler Usage Guidelines for 64-Bit Operating Systems on AMD64 ...


Compiler Usage Guidelines for AMD64 Platforms

32035 Rev. 3.22 November 2007

-fno-rtti. Using this switch instructs the C++ compiler to discard C++ run-time type information (RTTI). This may improve performance. However, C++ features requiring RTTI (exceptions, dynamic cast, etc.) will not be supported.

-ansi-alias. Try this switch if the program strictly conforms to the ISO C99 standard. If the program adheres to the standard, this switch allows the compiler to perform aggressive optimizations.

3.4 PathScale Compilers (64-Bit) for Linux®

PathScale provides C, C++, and Fortran compilers for AMD64 architecture-based systems running the Linux operating system. The current version (as of August 2007) is 3.0. All options described in this section apply to this version.

3.4.1 Invocation Commands

The following commands invoke specific compilers:<br />

pathcc invokes the QLogic PathScale C compiler.<br />

pathCC invokes the QLogic PathScale C++ compiler.<br />

pathf95 invokes the QLogic PathScale Fortran compiler.<br />
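As a rough illustration of how the three drivers divide the work, the sketch below picks a driver from a source file's extension. The function name `driver_for` and the extension-to-language mapping are assumptions made for this example (a common convention, not something the guide specifies); only the driver names come from the list above.

```shell
#!/bin/sh
# Hypothetical helper: choose a PathScale driver from a file extension.
# The extension mapping is an assumed convention, not taken from the guide.
driver_for() {
    case "$1" in
        *.c)                        echo pathcc ;;   # C
        *.cc|*.cpp|*.cxx|*.C)       echo pathCC ;;   # C++
        *.f|*.f90|*.f95|*.F|*.F90)  echo pathf95 ;;  # Fortran
        *)  echo "unknown source type: $1" >&2
            return 1 ;;
    esac
}

driver_for main.c      # prints: pathcc
driver_for solver.cpp  # prints: pathCC
driver_for mesh.f90    # prints: pathf95
```

A build script could then call something like `"$(driver_for "$file")" -c "$file"`, again assuming the drivers are on the PATH.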

3.4.2 Generic Performance Switches

The -O3 and -OPT:Ofast switches are recommended as the first step of optimization. For further tuning, experiment with the switches in the next section.

3.4.3 Other Switches<br />

In addition to the -O3 and -OPT:Ofast switches, the following list of switches may improve the performance of the program. It is worth experimenting with these switches.

Profile Guided Optimization. The 64-bit QLogic PathScale compiler allows profile guided optimization. Use the following steps for profile guided optimization with 64-bit PathScale compilers for Linux.

1. Compile the program with the -fb_create fbdata switch.<br />

2. Run the executable produced in Step 1. It will generate several files with profile information.

3. Recompile the program with the -fb_opt fbdata switch.<br />
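The three steps above can be sketched as a small dry-run script. The file names `prog.c` and `prog`, the helper names `run` and `pgo_steps`, and the `PGO_RUN` variable are hypothetical and chosen only for illustration; the `-fb_create fbdata` and `-fb_opt fbdata` switches are the ones named in the steps. By default the script only prints the commands, so it can be read and tried without the PathScale compilers installed.

```shell
#!/bin/sh
# Dry-run sketch of the three PGO steps. Commands are printed, and are
# executed only when PGO_RUN=1 (which needs the PathScale compilers on PATH).
run() {
    printf '%s\n' "$*"
    if [ "${PGO_RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Usage: pgo_steps COMPILER SOURCE EXECUTABLE   (all names are placeholders)
pgo_steps() {
    compiler=$1 src=$2 exe=$3
    run "$compiler" -fb_create fbdata -o "$exe" "$src" &&  # Step 1: instrumented build
    run "./$exe" &&                                        # Step 2: run to collect profile data
    run "$compiler" -fb_opt fbdata -o "$exe" "$src"        # Step 3: rebuild using the profile
}

pgo_steps pathcc prog.c prog
```

The training run in Step 2 should use input that resembles real workloads, since the recompilation in Step 3 is guided by whatever that run recorded.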

Inter-Procedure Optimization. Use the switch -ipa to enable inter-procedure optimization.

-Ofast. For aggressive optimization, use the -Ofast switch. This is shorthand for the switches -O3, -OPT:Ofast, -ipa, -ffast-math, and -fno-math-errno.
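To make the shorthand concrete, the sketch below rewrites a command line so that -Ofast is replaced by the switches listed above. The helper name `expand_ofast` and the file names `app` and `main.c` are hypothetical; the script only prints text and does not invoke a compiler.

```shell
#!/bin/sh
# Hypothetical helper: spell out -Ofast using the expansion given in the text.
expand_ofast() {
    out=""
    for arg in "$@"; do
        case "$arg" in
            -Ofast) out="$out -O3 -OPT:Ofast -ipa -ffast-math -fno-math-errno" ;;
            *)      out="$out $arg" ;;
        esac
    done
    printf '%s\n' "${out# }"   # drop the leading space added by the loop
}

expand_ofast pathcc -Ofast -o app main.c
# prints: pathcc -O3 -OPT:Ofast -ipa -ffast-math -fno-math-errno -o app main.c
```

Writing the switches out this way can help when one of them (for example -ffast-math) needs to be dropped for a program that depends on strict floating-point behavior.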

Linking with ACML. The AMD Core Math Library (ACML) includes BLAS, LAPACK and FFT routines that are optimized for AMD Athlon 64 and AMD Opteron processors. If the program

26 Performance-Centric Compiler Switches Chapter 3
