26.11.2012 Views

Compiler Usage Guidelines for 64-Bit Operating Systems on AMD64 ...


Compiler Usage Guidelines for AMD64 Platforms

32035 Rev. 3.22 November 2007

The -O3 switch turns on several general optimizations.

Table 6. Recommended Option Switches for 32-Bit GCC Compilers for Linux®

SuSE GCC 4.2.0 (for C/C++ and Fortran) and Red Hat gcc-ssa (for C/C++ and Fortran)

The -ffast-math switch lets the compiler use a faster floating-point model that relaxes strict IEEE 754 conformance.
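As a rough illustration of what a relaxed floating-point model can change, the sketch below adds the same three doubles in two groupings. Under strict IEEE 754 evaluation the two groupings give different answers for the values shown in the note that follows. With -ffast-math the compiler is allowed to reassociate such expressions, so that difference is no longer guaranteed. The function names and test values are illustrative and do not come from the guide.

```c
/* Adds left to right: (a + b) + c. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

/* Adds right to left: a + (b + c). */
double sum_right(double a, double b, double c)
{
    return a + (b + c);
}
```

Without -ffast-math, sum_left(1e20, -1e20, 1.0) yields 1.0 while sum_right(1e20, -1e20, 1.0) yields 0.0, because 1.0 is lost when it is added to -1e20 first. Code that depends on this kind of ordering should be checked carefully before the switch is enabled.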

The -fomit-frame-pointer switch omits the frame pointer, which frees a register and improves performance. Do not use this switch if you need to unwind the stack using the frame pointer.

Using -malign-double results in better alignment, and hence faster code, on the AMD Athlon 64 and AMD Opteron processors.
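One way to see what this switch affects is to look at where a double lands inside a structure. The sketch below uses a hypothetical struct and helper names that are not from the guide. On 32-bit x86 Linux, GCC by default places a double structure member on a 4-byte boundary, so the offset reported here is typically 4; with -malign-double it is typically 8. Because the switch changes structure layout, objects built with it are generally not binary-compatible with objects built without it when they share such structures.

```c
#include <stddef.h>

/* Hypothetical record: one char followed by one double. */
struct sample {
    char   tag;
    double value;
};

/* Byte offset of the double member inside struct sample. */
size_t value_offset(void)
{
    return offsetof(struct sample, value);
}

/* Total size of the structure, including padding. */
size_t sample_size(void)
{
    return sizeof(struct sample);
}
```

Comparing the two values from a build with the switch and a build without it shows the extra padding that buys the stricter alignment.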

Using -mfpmath=sse causes the compiler to generate SSE/SSE2 instructions instead of the default x87 instructions.
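A build can report which floating-point evaluation model it was compiled for through the C99 macro FLT_EVAL_METHOD from <float.h>. The sketch below is an illustration, not something from the guide. It assumes the compiler is in C99 mode or later, and the helper names are made up. A 32-bit x87 build typically reports 2 (intermediate results held in long double), while a build that uses SSE arithmetic for both float and double typically reports 0.

```c
#include <float.h>

/* Returns the C99 FLT_EVAL_METHOD value chosen for this build. */
int fp_eval_method(void)
{
    return (int)FLT_EVAL_METHOD;
}

/* Maps the value to the meaning given by the C99 standard. */
const char *fp_eval_description(void)
{
    switch (fp_eval_method()) {
    case 0:  return "operations evaluated in the operand type";
    case 1:  return "float and double evaluated as double";
    case 2:  return "operations evaluated as long double";
    default: return "indeterminate or implementation-defined";
    }
}
```

Printing fp_eval_description() from builds made with and without -mfpmath=sse is a quick check that the switch took effect for the target in use.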

Because the default for the 32-bit gcc compiler is -march=i386, using -march=k8 causes it to generate high-performance code for the AMD Athlon 64 and AMD Opteron processors, while using -march=amdfam10 causes it to generate high-performance code for AMD Family 10h processors.

The GCC 4.2.0 compiler can perform loop vectorization when the -ftree-vectorize flag is used.
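The kind of loop a vectorizer usually handles well is a simple counted loop over arrays with no dependence between iterations. The following sketch is an assumed example, not one from the guide. It is written for C99 mode, and the restrict qualifiers state that the two arrays do not overlap, which removes one common obstacle to vectorization. Whether a given compiler version actually vectorizes it also depends on the target switches in effect.

```c
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i] for i in [0, n).
 * Each iteration is independent of the others, and restrict
 * promises that x and y refer to non-overlapping storage. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Loops that carry a value from one iteration to the next, or that call functions the compiler cannot see through, are much less likely to benefit from the flag.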

-O3 -march=k8 -ffast-math -fomit-frame-pointer -malign-double -mfpmath=sse

FSF GCC 4.2.0 Red Hat GCC 3.4.1 -O3 -march=k8 -ffast-math -fomit-frame-pointer -malign-double -mfpmath=sse -fpeel-loops -ftracer -funswitch-loops -funit-at-a-time

SuSE GCC 4.2.0 -O3 -march=k8 -ffast-math -fomit-frame-pointer -malign-double -mfpmath=sse -fpeel-loops

FSF GCC 4.2.0 (for C/C++ and Fortran) -O3 -march=k8 -ffast-math -fomit-frame-pointer -malign-double -mfpmath=sse -fpeel-loops -ftracer -funswitch-loops -ftree-vectorize

3.8.4 Other Switches

In addition to the switches mentioned in Table 6, “Recommended Option Switches for 32-Bit GCC Compilers for Linux®,” on page 31, the following switches may also improve the performance of the program. It is worth experimenting with these switches.

Profile Guided Optimization. The 32-bit GCC compiler supports profile guided optimization. Table 7 shows the profile guided optimization switches for the three GCC compilers.
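For readers new to the technique, the sketch below shows the general shape of a profile guided build around a small function with a data-dependent branch. The per-compiler switches are the ones listed in Table 7. The commands in the comment use -fprofile-generate and -fprofile-use, which are GCC's general-purpose profile options in the 4.x series, and they are an assumption here rather than a transcription of that table. The file name pgo_demo.c, the input file name, and the function are hypothetical.

```c
#include <stddef.h>

/*
 * Assumed three-step workflow (switch names are an assumption, see text):
 *   1. gcc -O3 -fprofile-generate pgo_demo.c -o pgo_demo   instrumented build
 *   2. ./pgo_demo < representative_input                   training run, writes .gcda data
 *   3. gcc -O3 -fprofile-use pgo_demo.c -o pgo_demo        rebuild using the profile
 */

/* Counts the elements of v[0..n) that are smaller than limit. */
size_t count_below(const int *v, size_t n, int limit)
{
    size_t i, hits = 0;

    for (i = 0; i < n; ++i) {
        if (v[i] < limit)   /* branch whose bias the profile records */
            ++hits;
    }
    return hits;
}
```

The training run should use input that resembles production data, since the compiler optimizes for the behavior it observed.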

Table 7. Profile Guided Optimization for 32-Bit GCC Compilers for Linux®

Compiler Version Optimization Switches

32 Performance-Centric Compiler Switches Chapter 3
