XL Fortran Enterprise Edition for AIX : User's Guide - IBM

• -qcache=auto

Optimization level -O5 includes:
• Everything from -O4
• -qipa=level=2

If -O5 is specified on the compile step, it should also be specified on the link step.

Although the -qipa option is not strictly another optimization level, it extends the optimizations across procedures (even if the procedures are in different files). It enhances the effectiveness of the optimizations that are done by other optimization options, particularly -O (at any level) and -Q. Because it can also increase compile time substantially, you may want to use it primarily for tuning applications that are already debugged and ready to be used.

If your application contains a mixture of Fortran and C or C++ code compiled with IBM XL C/C++ compilers, you can achieve additional optimization by compiling and linking all your code with the -O5 option.

Optimizing for a Target Machine or Class of Machines

Target machine options instruct the compiler to generate code for optimal execution on a given processor or architecture family. By default, the compiler generates code that runs on all supported systems, but perhaps suboptimally on a given system. By selecting appropriate target machine options, you can optimize your application to suit the broadest possible selection of target processors, a range of processors within a given family, or a specific processor. The following compiler options control optimizations affecting individual aspects of the target machine.

Target machine options

Option    Behavior
-q32      Generates code for a 32-bit addressing model (32-bit execution mode).
-q64      Generates code for a 64-bit addressing model (64-bit execution mode).
-qarch    Selects a family of processor architectures, or a specific architecture, for which instruction code should be generated.
-qtune    Biases optimization toward execution on a given processor, without implying anything about the instruction set architecture to use as a target.
-qcache   Defines a specific cache or memory geometry. The defaults are set through -qtune.

Selecting a predefined optimization level sets default values for these individual options.

Related Information: See “-qarch Option” on page 127, “-qtune Option” on page 251, “-qcache Option” on page 137, and “Compiling for Specific Architectures” on page 39.

Getting the Most out of Target Machine Options

Try to specify with -qarch the smallest family of machines possible that will be expected to run your code reasonably well.
• -qarch=auto generates code that may take advantage of instructions available only on the compiling machine (or similar machines).
• To get sqrt optimization, you need to specify -qarch=ppc64grsq or another -qarch option that supports the square root instruction set.
• Specifying a -qarch option that is not compatible with your hardware may cause undefined behavior, even if your program appears to work, because the compiler may emit instructions that are not available on that hardware.

Try to specify with -qtune the machine where performance should be best. If you are not sure, let the compiler determine how best to tune for a given -qarch setting.

Before using the -qcache option, look at the options section of the listing (using -qlist) to see if the current settings are satisfactory. The settings appear in the listing itself when the -qlistopt option is specified. Modification of cache geometry may be useful in cases where the systems have configurable L2 or L3 cache options or where the execution mode reduces the effective size of a shared level of cache (for example, two-core-per-chip SMP execution on POWER4).

If you decide to use -qcache, use -qhot or -qsmp along with it.

Optimizing Floating-Point Calculations

Special compiler options exist for handling floating-point calculations efficiently. By default, the compiler makes a trade-off to violate certain IEEE floating-point rules in order to improve performance. For example, multiply-add instructions are generated by default because they are faster and produce a more precise result than separate multiply and add instructions.

Floating-point exceptions, such as overflow or division by zero, are masked by default. If you need to catch these exceptions, you have the choice of enabling hardware trapping of these exceptions or using software-based checking. The option -qflttrap enables software-based checking. On the POWER4, POWER5, or PowerPC 970 processor, hardware trapping is recommended.
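As an illustration of the target-machine and exception-checking advice above, the following dry-run sketch prints two candidate compile commands rather than invoking the compiler, so it can be inspected on a machine without XL Fortran. The source file name prog.f, the output names, the pwr4 and pwr5 suboption values, and the particular -qflttrap suboption list are assumptions chosen for the example and do not come from this section; check the option descriptions for the values that suit your hardware.

```shell
#!/bin/sh
# Dry-run sketch: prints compile commands instead of executing them.
# Assumed for illustration only: prog.f, the output names, the pwr4/pwr5
# values, and the -qflttrap suboption list.

FC=xlf95      # compiler invocation command
SRC=prog.f    # hypothetical source file

# Restrict generated instructions to one architecture (-qarch) and
# bias scheduling toward the machine where speed matters most (-qtune).
target_cmd() {
    echo "$FC -O3 -qarch=pwr4 -qtune=pwr5 $SRC -o prog"
}

# Same target options, plus detection of overflow, divide-by-zero, and
# invalid-operation exceptions, with a trap handler installed by -qsigtrap.
trap_cmd() {
    echo "$FC -O3 -qarch=pwr4 -qtune=pwr5 -qflttrap=overflow:zerodivide:invalid:enable -qsigtrap $SRC -o prog_trap"
}

target_cmd
trap_cmd
```

Building a trapping and a non-trapping executable side by side is one way to measure what exception checking costs for a given program before deciding which build to ship.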
Options for handling floating-point calculations

Option      Description
-qfloat     Provides precise control over the handling of floating-point calculations.
-qflttrap   Enables software checking of IEEE floating-point exceptions. This technique is sometimes more efficient than hardware checking because checks can be executed less frequently.

To understand the performance considerations for floating-point calculations with different combinations of compiler options, see “Maximizing Floating-Point Performance” on page 295 and “Minimizing the Performance Impact of Floating-Point Exception Trapping” on page 302.

High-order Transformations (-qhot)

High-order transformations are optimizations that specifically improve the performance of loops and array language. Optimization techniques can include interchange, fusion, and unrolling of loops, and reducing the generation of temporary arrays. The goals of these optimizations include: