XL Fortran Enterprise Edition for AIX : User's Guide - IBM


• Reducing the costs of memory access through the effective use of caches and translation look-aside buffers.

• Overlapping computation and memory access through effective utilization of the data prefetching capabilities provided by the hardware.

• Improving the utilization of processor resources through reordering and balancing the usage of instructions with complementary resource requirements.

-qhot=vector is the default when -qhot is specified. Compiling with -qhot=vector transforms some loops to exploit optimized versions of functions rather than the standard versions. The optimized functions reside in a built-in library that includes functions and operations such as reciprocal, square root, and so on. The optimized versions make different trade-offs with respect to precision versus performance. Usage of -qstrict implies -qhot=novector.
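As a rough illustration, the following sketch shows the kind of loop that -qhot=vector might treat as a candidate: its body is built only from a square root and a reciprocal. The program name, array names, and size are hypothetical, and whether the compiler actually substitutes an optimized routine for this particular loop is an assumption, not something the text above guarantees.

```fortran
program vector_candidate
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 1000        ! hypothetical problem size
  real(dp) :: x(n), y(n)
  integer :: i

  ! Fill the input with positive values so that the square root and the
  ! reciprocal are both well defined.
  do i = 1, n
    x(i) = real(i, dp)
  end do

  ! A loop whose body is a reciprocal of a square root. Loops of this
  ! shape are the sort that -qhot=vector may rewrite (assumption).
  do i = 1, n
    y(i) = 1.0_dp / sqrt(x(i))
  end do

  ! Print two elements so that builds with different options can be compared.
  print *, 'y(1) =', y(1), ' y(n) =', y(n)
end program vector_candidate
```

Building the program once with -qhot and once with -qhot=novector, then comparing the printed values, is one way to check whether any difference in the low-order digits matters for your application.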

Getting the Most out of -qhot<br />

Try using -qhot along with -O3 for all of your code. (The compiler assumes at least -O2 level for -qhot.) It is designed to have a neutral effect when no opportunities for transformation exist.

• If you encounter unacceptably long compile times (this can happen with complex loop nests) or if your performance degrades with the use of -qhot, try using -qhot=novector, or -qstrict or -qcompact along with -qhot.

• If necessary, deactivate -qhot selectively, allowing it to improve some of your code.

Optimizing Loops and Array Language<br />

The -qhot option performs the following transformations to improve the performance of loops, array language, and memory management:

• Scalar replacement, loop blocking, distribution, fusion, interchange, reversal, skewing, and unrolling

• Reducing generation of temporary arrays

The -qhot option requires at least level 2 of -O. The -C option inhibits these transformations.
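To make one of these transformations concrete, the sketch below writes out loop interchange by hand. It shows the same computation with both loop orders and checks that the results match. All names and the array size are hypothetical, and the sketch only illustrates the idea of the transformation; it does not show what the compiler generates for any specific program.

```fortran
program interchange_sketch
  implicit none
  integer, parameter :: n = 200         ! hypothetical array extent
  real :: a(n, n), b(n, n), c(n, n)
  integer :: i, j

  call random_number(b)

  ! Original nest: the inner loop varies the second subscript. Fortran
  ! stores arrays in column-major order, so consecutive inner iterations
  ! touch elements that are n reals apart in memory.
  do i = 1, n
    do j = 1, n
      a(i, j) = 2.0 * b(i, j)
    end do
  end do

  ! Interchanged nest: the inner loop now varies the first subscript,
  ! so consecutive inner iterations touch adjacent elements (stride 1).
  do j = 1, n
    do i = 1, n
      c(i, j) = 2.0 * b(i, j)
    end do
  end do

  ! Each element is computed independently, so both orders must agree.
  if (any(a /= c)) stop 1
  print *, 'both loop orders give the same result'
end program interchange_sketch
```

The interchange is legal here because no iteration depends on another. For nests with dependences between iterations, reordering may not preserve the result.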

If you have SMP hardware, you can enable automatic parallelization of loops by specifying the -qsmp option. This optimization applies to explicitly coded DO loops as well as DO loops that are generated by the compiler for array language (WHERE, FORALL, array assignment, and so on). The compiler can parallelize only loops that are independent (each iteration can be computed independently of any other iteration). One case where the compiler will not automatically parallelize loops is where the loops contain I/O, because doing so could lead to unexpected results. In this case, by using the PARALLEL DO or work-sharing DO directive, you can advise the compiler that such a loop can be safely parallelized. However, the type of I/O must be one of the following:

• Direct-access I/O where each iteration writes to or reads from a different record

• Sequential I/O where each iteration writes to or reads from a different unit

• Stream-access I/O where each iteration uses the POS= specifier to write to, or read from, a different part of the file

• Stream-access I/O where each iteration writes to, or reads from, a different unit

For more details, refer to the description of the PARALLEL DO or work-sharing DO directive in the XL Fortran Enterprise Edition for AIX Language Reference.
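The first permitted case, direct-access I/O with one record per iteration, can be sketched as follows. This is a minimal example under stated assumptions: the directive is written in its OpenMP spelling (!$OMP PARALLEL DO), and the unit number, file name, and record contents are hypothetical. If the directive is not enabled at compile time, it is treated as a comment and the loop runs serially with the same result.

```fortran
program parallel_direct_io
  implicit none
  integer, parameter :: n = 8           ! hypothetical number of records
  integer :: i, reclen
  real :: value, check

  ! Ask the compiler for a record length that holds one default real,
  ! which avoids hard-coding a processor-dependent RECL unit.
  inquire(iolength=reclen) value

  ! Unit 10 and the file name are placeholders for this sketch.
  open(unit=10, file='records.dat', access='direct', form='unformatted', &
       recl=reclen, status='replace', action='readwrite')

  ! Each iteration writes a different record, so no iteration depends
  ! on another. The directive tells the compiler the loop is safe.
  !$OMP PARALLEL DO PRIVATE(value)
  do i = 1, n
    value = real(i) * 10.0
    write(10, rec=i) value
  end do
  !$OMP END PARALLEL DO

  ! Read the records back serially and confirm each one.
  do i = 1, n
    read(10, rec=i) check
    if (check /= real(i) * 10.0) stop 1
  end do

  close(10, status='delete')
  print *, 'all records verified'
end program parallel_direct_io
```

If two iterations wrote to the same record, the final contents would depend on the order in which the threads ran, which is why the text above restricts each iteration to its own record.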

You can use the -qhot and -qsmp options on:

• Programs with performance bottlenecks that are caused by loops and structured memory accesses
