26.11.2012 Views

Compiler Usage Guidelines for 64-Bit Operating Systems on AMD64 ...


Compiler Usage Guidelines for AMD64 Platforms

The GCC 4.0 and later version compilers can perform loop vectorization by using the -ftree-vectorize flag.
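As an illustration of the kind of loop this flag targets, here is a minimal sketch. The function name saxpy, the file name in the build line, and the use of C99 restrict are assumptions made for the example and do not come from the guideline itself. Whether a given compiler release actually vectorizes the loop depends on the version and target.

```c
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i].
 * The restrict qualifiers promise that x and y do not overlap, which
 * makes the loop an easier candidate for the vectorizer.
 *
 * Possible build line (file name is an assumption):
 *   gcc -std=c99 -O2 -ftree-vectorize -c saxpy.c
 */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Simple counted loops over contiguous arrays with no cross-iteration dependencies, like this one, are typically the easiest cases for a loop vectorizer.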

3.2.4 Other Switches

32035 Rev. 3.22 November 2007<br />

In addition to the switches mentioned in Table 3 on page 23, the following switches may also improve program performance. It is worth experimenting with them.

-march=k8. For the FSF GCC 4.2.0, SuSE 4.2.0, and Red Hat 4.2.0 compilers, using this switch may give you a performance advantage in some cases.

-march=amdfam10. For applications to be executed on AMD Family 10h processor-based platforms, this switch results in better performance.

Note: The amdfam10 option is not available on all GCC compiler releases. See your compiler documentation for further information.

Profile Guided Optimization. The 64-bit GCC compiler also allows profile guided optimization. Table 4 shows the profile guided optimization switches for the different GCC compilers.

Table 4. Profile Guided Optimization for 64-Bit GCC Compilers for Linux®

Compiler Version: SuSE GCC 4.2.0 and Red Hat gcc-ssa (for C/C++ and Fortran)
Optimization Switches:
Step 1. Compile the program with -fprofile-arcs.
Step 2. Run the executable produced in Step 1. Running the executable generates several files with profile information (*.da).
Step 3. Recompile the program with -fbranch-probabilities.

Compiler Version: FSF GCC 4.2.0 and Red Hat GCC 4.2.0
Optimization Switches:
Step 1. Compile the program with -fprofile-generate.
Step 2. Run the executable produced in Step 1. Running the executable generates several files with profile information (*.da).
Step 3. Recompile the program with -fprofile-use.
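The three steps can be sketched against a small piece of code. The function below and the file and program names in the comment are hypothetical, chosen only to show where each switch would appear. The build lines use the -fprofile-generate and -fprofile-use pair, and the -fprofile-arcs and -fbranch-probabilities pair would follow the same pattern.

```c
#include <stddef.h>

/* Hypothetical hot function with a data-dependent branch. A profile from
 * a representative run tells the compiler which side of the branch is
 * common, so it can optimize for that case.
 *
 * Sketch of the three steps (file and program names are assumptions):
 *   Step 1: gcc -O2 -fprofile-generate count.c main.c -o app
 *   Step 2: ./app            (the run writes the profile data files)
 *   Step 3: gcc -O2 -fprofile-use count.c main.c -o app
 */
size_t count_negative(const int *v, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        if (v[i] < 0) {       /* taken rarely or often, depending on input */
            ++count;
        }
    }
    return count;
}
```

The profile is only as useful as the training run in Step 2, so the input used there should resemble the production workload.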

-Bsymbolic. Starting with GCC 4.1, the gcc compiler no longer requires the -Bsymbolic switch. GCC 4.1 and later versions offer -combine and -fwhole-program, which should be used together. They require that makefiles be changed to use a single command to compile and link all files of an application, which slows down builds, so they should only be used for non-debug builds. Unfortunately, these options may fail to compile some files.

-minline-all-stringops. When using the GCC 3.4 compiler on Red Hat Enterprise Linux 4, experiment with the switch -minline-all-stringops. This switch is not recommended for GCC 3.4 on SuSE Linux Enterprise Server.
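To make the experiment concrete, the sketch below shows the kind of code this switch can affect: a memory copy whose length is only known at run time. The function name copy_record and the file name in the build line are assumptions for illustration. Whether inlining helps or hurts depends on the compiler release and on typical copy lengths, which is why measuring is advisable.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record copy with a run-time length. A call such as
 * memcpy() here is the kind of string operation that
 * -minline-all-stringops asks GCC to expand inline rather than
 * dispatching to the library routine.
 *
 * Possible build line (file name is an assumption):
 *   gcc -O2 -minline-all-stringops -c copy.c
 */
size_t copy_record(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);
    if (len + 1 > dst_size) {
        return 0;               /* does not fit; nothing is copied */
    }
    memcpy(dst, src, len + 1);  /* include the terminating NUL */
    return len;
}
```

Building the same file with and without the switch and timing a representative workload is one way to run the experiment.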

Linking with ACML. The AMD Core Math Library (ACML) includes BLAS, LAPACK and FFT routines that are optimized for AMD Athlon 64, AMD Opteron, and AMD Family 10h processors. If the program uses these routines, using ACML in place of generic C/Fortran

24 Performance-Centric Compiler Switches Chapter 3
