Compiler Usage Guidelines for 64-Bit Operating Systems on AMD64 ...
Compiler Usage Guidelines for AMD64 Platforms
32035 Rev. 3.22 November 2007
3. Recompile the program with the -fb_opt fbdata switch.
Inter-Procedure Optimization. Use the -ipa switch to enable inter-procedure optimization.
-Ofast. For aggressive optimization, use the -Ofast switch. This is shorthand for the switches
-O3, -OPT:Ofast, -ipa, and -fno-math-errno.
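The -Ofast expansion described above can be sketched as a small Python helper that rewrites a command line into its long form. This is only an illustration: the pathcc driver name, the file names, and the helper name expand_ofast are assumptions, not taken from this guide, and the script only prints the command rather than running a compiler.

```python
# Expansion of -Ofast as stated in the text above.
OFAST_EQUIVALENT = ["-O3", "-OPT:Ofast", "-ipa", "-fno-math-errno"]


def expand_ofast(args):
    """Return a copy of args with each -Ofast replaced by the switches it stands for."""
    expanded = []
    for arg in args:
        if arg == "-Ofast":
            expanded.extend(OFAST_EQUIVALENT)
        else:
            expanded.append(arg)
    return expanded


# Hypothetical command line; "pathcc", "prog.c" and "prog" are illustrative assumptions.
command = ["pathcc", "-Ofast", "prog.c", "-o", "prog"]
print(" ".join(expand_ofast(command)))
# prints: pathcc -O3 -OPT:Ofast -ipa -fno-math-errno prog.c -o prog
```

Writing out the long form this way makes it easier to drop a single component, for example -ipa, when experimenting with individual switches.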
Linking with ACML.
The AMD Core Math Library (ACML) includes BLAS, LAPACK and FFT routines that are
optimized for AMD Athlon 64, AMD Opteron and AMD Family 10h processors. If the program
uses these routines, using ACML in place of generic C/Fortran implementations can greatly improve
performance. For additional details on how to install and use this library, see
http://developer.amd.com/assets/acml_userguide.pdf.
Refer to the PathScale EKOPath Compiler Suite User Guide, Version 2.1, for more options and
suggestions for tuning your application performance.
3.11 Intel Compilers (32-Bit) for Microsoft® Windows®
The 32-bit Intel compilers can be installed and run on 32-bit Microsoft Windows on
AMD Athlon 64, AMD Opteron and AMD Family 10h processors.
3.11.1 Invocation Commands
The following commands invoke specific compilers:
icl invokes the 32-bit Intel C/C++ compilers.
ifort invokes the 32-bit Intel Fortran compilers, versions 9.1 and 10.0.
3.11.2 Generic Performance Switches
The -QxW -Qipo -O3 switches are recommended for Intel compiler version 10.0.
The -QxW switch instructs the compiler to optimize for the Pentium 4 processor (including SSE2
instructions).
The -Qipo switch enables interprocedural (across multiple source files) analysis.
The -O3 switch optimizes for speed and includes several aggressive optimizations.
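As an illustration of the recommended switch set, the following Python sketch assembles an icl command line from a list of source files. The helper name build_compile_command and the file names are hypothetical; the script prints the command instead of invoking the compiler, so it runs on a machine without the Intel tools installed.

```python
# Switches recommended in this section for Intel compiler version 10.0.
RECOMMENDED_SWITCHES = ["-QxW", "-Qipo", "-O3"]


def build_compile_command(sources, driver="icl"):
    """Build an argument list: driver, recommended switches, then source files."""
    if not sources:
        raise ValueError("at least one source file is required")
    return [driver, *RECOMMENDED_SWITCHES, *sources]


# Hypothetical source files, used only for illustration.
command = build_compile_command(["main.c", "util.c"])
print(" ".join(command))
# prints: icl -QxW -Qipo -O3 main.c util.c
```

Because -Qipo analyzes across source files, the sketch passes all sources to a single invocation. Passing driver="ifort" gives the corresponding Fortran command line.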
3.11.3 Other Switches
In addition to the switches mentioned above, the following switches may improve the performance of the program. It is worth experimenting with these switches.
Profile Guided Optimization. Intel compilers allow profile guided optimization. Use the following
steps for profile guided optimization with Intel compilers.
1. Compile the program with the -Qprof_gen switch. The -Qipo or -Qip switch is ignored by the
compiler if used with -Qprof_gen.