Compiler Usage Guidelines for 64-Bit Operating Systems on AMD64 ...
Compiler Usage Guidelines for AMD64 Platforms
The GCC 4.0 and later version compilers can perform loop vectorization by using the
-ftree-vectorize flag.
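As a hedged illustration, the loop below is the kind of code the vectorizer can typically handle: a countable loop over contiguous arrays with no dependence between iterations. The function name saxpy and the use of restrict are assumptions chosen for this sketch, not taken from this guide, and whether the loop is actually vectorized depends on the compiler release and the target.

```c
#include <stddef.h>

/* Illustrative kernel (hypothetical name): y[i] = a * x[i] + y[i].
 * Build, for example, with:
 *   gcc -std=c99 -O2 -ftree-vectorize -c saxpy.c
 * The restrict qualifiers (C99) promise the compiler that x and y do not
 * overlap, which removes one common obstacle to vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Comparing the generated assembly with and without the flag is a simple way to check whether the loop was vectorized.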
3.2.4 Other Switches
32035 Rev. 3.22 November 2007
In addition to the switches mentioned in Table 3 on page 23, the following switches may also
improve program performance. It is worth experimenting with these switches.
-march=k8. For the FSF GCC 4.2.0, SuSE 4.2.0, and Red Hat 4.2.0 compilers, using this switch may
give you a performance advantage in some cases.
-march=amdfam10. For applications to be executed on AMD Family 10h processor-based
platforms, this switch results in better performance.
Note: The amdfam10 option is not available on all GCC compiler releases. See your compiler
documentation for further information.
Profile Guided Optimization. The 64-bit GCC compiler also allows profile guided optimization.
Table 4 shows the profile guided optimization switches for the different GCC compilers.
Table 4. Profile Guided Optimization for 64-Bit GCC Compilers for Linux®

Compiler Version             Optimization Switches
---------------------------  -----------------------------------------------------------
SuSE GCC 4.2.0 and           Step 1. Compile the program with -fprofile-arcs.
Red Hat gcc-ssa              Step 2. Run the executable produced in Step 1. Running the
(for C/C++ and Fortran)              executable generates several files with profile
                                     information (*.da).
                             Step 3. Recompile the program with -fbranch-probabilities.

FSF GCC 4.2.0 and            Step 1. Compile the program with -fprofile-generate.
Red Hat GCC 4.2.0            Step 2. Run the executable produced in Step 1. Running the
                                     executable generates several files with profile
                                     information (*.da).
                             Step 3. Recompile the program with -fprofile-use.
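The three steps in Table 4 can be sketched against a small function whose branch is heavily skewed, which is the kind of behavior a profile captures. The function name sum_clamped, the file name app.c, and the assumed workload (mostly non-negative inputs) are illustrative assumptions and do not come from this guide.

```c
#include <stddef.h>

/* Hypothetical hot function with a skewed branch: in the assumed workload
 * most inputs are non-negative.
 *
 * Workflow from Table 4 for FSF GCC 4.2.0 and Red Hat GCC 4.2.0
 * (file and program names are placeholders):
 *   Step 1: gcc -std=c99 -O2 -fprofile-generate app.c -o app
 *   Step 2: ./app        (run on representative input)
 *   Step 3: gcc -std=c99 -O2 -fprofile-use app.c -o app
 * For SuSE GCC 4.2.0 and Red Hat gcc-ssa, Table 4 lists -fprofile-arcs in
 * Step 1 and -fbranch-probabilities in Step 3 instead. */
long sum_clamped(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        if (v[i] < 0) {
            /* Rare path in the assumed workload: negative values count as 0. */
            continue;
        }
        total += v[i];   /* Common path. */
    }
    return total;
}
```

The training run in Step 2 should resemble production input; a profile gathered on unrepresentative data can steer the optimizer toward the wrong paths.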
-Bsymbolic. Starting with GCC 4.1, the gcc compiler no longer requires the -Bsymbolic switch. GCC
4.1 and later versions offer -combine and -fwhole-program, which should be used together but require
that makefiles be changed to use a single command to compile and link all files of an application,
which slows down builds. These options should therefore only be used for non-debug builds.
Unfortunately, they may fail to compile some files.
-minline-all-stringops. When using the GCC 3.4 compiler on Red Hat Enterprise Linux 4,
experiment with the switch -minline-all-stringops. This switch is not recommended for GCC 3.4 on
SuSE Linux Enterprise Server.
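A minimal sketch of the kind of code this switch is aimed at follows, assuming the program's time is dominated by short string and memory operations. The helper names copy_record and name_length are hypothetical; the GCC documentation describes the switch as enabling more inlining of such operations at the cost of code size, so both builds should be measured.

```c
#include <string.h>

/* Hypothetical helper: copies a short, variable-length record.
 * The destination alignment is unknown at compile time, so GCC may emit a
 * call to the library memcpy here by default. Building with, for example,
 *   gcc -O2 -minline-all-stringops -c copy.c
 * asks GCC to expand such string operations inline instead. */
void copy_record(char *dst, const char *src, size_t len)
{
    memcpy(dst, src, len);
}

/* strlen on short strings is another operation the switch can affect. */
size_t name_length(const char *name)
{
    return strlen(name);
}
```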
Linking with ACML. The AMD Core Math Library (ACML) includes BLAS, LAPACK and FFT
routines that are optimized for AMD Athlon 64, AMD Opteron, and AMD Family 10h
processors. If the program uses these routines, using ACML in place of generic C/Fortran
24 Performance-Centric Compiler Switches Chapter 3