
Compiler Usage Guidelines for 64-Bit Operating Systems on AMD64 ...


Compiler Usage Guidelines for AMD64 Platforms

32035 Rev. 3.22 November 2007

3. Recompile the program with the -fb_opt fbdata switch.

Inter-Procedure Optimization. Use the -ipa switch to enable inter-procedure optimization.

-Ofast. For aggressive optimization, use the -Ofast switch. It is shorthand for the switches -O3, -OPT:Ofast, -ipa, and -fno-math-errno.
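To make the shorthand concrete, here is a minimal Python sketch that expands -Ofast into the four switches listed above. The helper name expand_ofast and the sample arguments (-o app) are hypothetical and are not part of the PathScale tools; the expansion list is taken only from the sentence above.

```python
# Expansion stated in the text: -Ofast is shorthand for these switches.
OFAST_EXPANSION = ["-O3", "-OPT:Ofast", "-ipa", "-fno-math-errno"]


def expand_ofast(switches):
    """Return a copy of `switches` with every -Ofast replaced by its expansion.

    Hypothetical helper for illustration only; the compiler performs this
    expansion itself.
    """
    expanded = []
    for switch in switches:
        if switch == "-Ofast":
            expanded.extend(OFAST_EXPANSION)
        else:
            expanded.append(switch)
    return expanded


# "-o app" is an assumed output option used only as sample input.
print(expand_ofast(["-Ofast", "-o", "app"]))
```

Writing the expansion out this way can help when one of the implied switches, such as -ipa, has to be dropped for a particular build.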

Linking with ACML.<br />

The AMD Core Math Library (ACML) includes BLAS, LAPACK and FFT routines that are optimized for AMD Athlon 64, AMD Opteron and AMD Family 10h processors. If the program uses these routines, using ACML in place of a generic C/Fortran implementation may greatly improve performance. For additional details on how to install and use this library, see http://developer.amd.com/assets/acml_userguide.pdf.

Refer to the PathScale EKOPath Compiler Suite User Guide, Version 2.1, for more options and suggestions for tuning your application performance.

3.11 Intel Compilers (32-Bit) for Microsoft® Windows®

The 32-bit Intel compilers can be installed and run on 32-bit Microsoft Windows on AMD Athlon 64, AMD Opteron and AMD Family 10h processors.

3.11.1 Invocation Commands

The following commands invoke specific compilers:

icl invokes the 32-bit Intel C/C++ compilers.

ifort invokes the 32-bit Intel Fortran compilers, versions 9.1 and 10.0.

3.11.2 Generic Performance Switches

The -QxW -Qipo -O3 switches are recommended for Intel compiler version 10.0.

The -QxW switch instructs the compiler to optimize for the Pentium 4 processor (including SSE2 instructions).

The -Qipo switch enables interprocedural analysis (across multiple source files).

The -O3 switch optimizes for speed and includes several aggressive optimizations.
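As an illustration of how the recommended switches combine with the invocation commands, the following Python sketch assembles a command line without running it. The function name build_command and the source file main.c are hypothetical; only the driver names (icl, ifort) and the three switches come from the text above.

```python
# Switches recommended in the text for Intel compiler version 10.0.
RECOMMENDED_SWITCHES = ["-QxW", "-Qipo", "-O3"]

# Driver names taken from the invocation commands section.
KNOWN_DRIVERS = ("icl", "ifort")


def build_command(driver, sources, switches=RECOMMENDED_SWITCHES):
    """Return the argument list for one compiler invocation.

    Hypothetical helper: it only assembles the list and never starts
    the compiler.
    """
    if driver not in KNOWN_DRIVERS:
        raise ValueError(f"unknown compiler driver: {driver}")
    return [driver, *switches, *sources]


# main.c is an assumed file name used only for illustration.
command = build_command("icl", ["main.c"])
print(" ".join(command))
```

The resulting list could be handed to a build script or typed directly at a Windows command prompt in which the Intel compiler environment is already set up.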

3.11.3 Other Switches

In addition to the switches mentioned above, other switches may benefit the program. It is worth experimenting with these switches.

Profile Guided Optimization. Intel compilers allow profile guided optimization. Use the following steps for profile guided optimization with Intel compilers.

1. Compile the program with the -Qprof_gen switch. The -Qipo or -Qip switch is ignored by the compiler if used with -Qprof_gen.

36 Performance-Centric Compiler Switches Chapter 3
