LF95 Linux User's Guide - Lahey Computer Systems


Chapter 5 Multi-Processing (PRO version only)

• Overhead for initiating and managing threads on secondary processors.
• Lack of large arrays and loops operating on them.
• I/O intensive rather than computationally intensive programs.
• Potential for incorrect results.
• Other unparallelizable loops.

These impediments are discussed in the sections below.

Overhead

Time is spent whenever your program starts up or shuts down a thread (a separate stream of execution) on a secondary processor. This time can outweigh the time gained by running part of the code on a secondary processor if the work to be done on that processor is not significant.

Lack of Large Arrays

If your program does not spend the bulk of its time in computationally intensive loops then there is not adequate work to divide among the processors. Your program will likely run at least as fast without parallelization. For example, if half of your program’s time is spent in parallelizable loops then the best time savings you can expect by parallelization on two processors is 25%. If your program takes two minutes to run serially, and half of its time is spent in parallelizable loops, then the theoretically optimal parallel run time is one minute and thirty seconds.

I/O Intensive Programs

If your program spends much of its time reading or writing files or waiting for user input then any speed increase due to parallelization will likely be dwarfed by the time spent doing I/O. Your program will likely not show a significant performance improvement.

Potential for Incorrect Results

Certain loops can be analyzed sufficiently to be parallelized by the compiler without input from the programmer. However, many loops have data dependencies that would prevent automatic parallelization because of the potential for incorrect results. For that reason, LF64PRO includes optimization control lines (see “Optimization Control Line” on page 94) and OpenMP directives (see “OpenMP” on page 106), with which the programmer can provide the information necessary for the compiler to parallelize otherwise unparallelizable loops.

Other Unparallelizable Loops

Some loops cannot be parallelized for other reasons discussed later in this chapter. Sometimes recoding a loop to move a statement or group of statements outside the loop will allow that loop to be parallelized.

88 Lahey/Fujitsu Linux64 Fortran User’s Guide
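The two-processor estimate given under “Lack of Large Arrays” can be reproduced with a short calculation. The sketch below is not taken from the guide: it assumes the usual best-case model (commonly called Amdahl's law), in which the serial portion of the run time is unchanged and the parallelizable portion is divided evenly among the processors. The program name and all variable names are illustrative only.

```fortran
! Sketch only: best-case parallel run time under the assumption that the
! serial part is unchanged and the parallelizable part is split evenly.
! All names and values here are illustrative, not part of LF64PRO.
program amdahl_estimate
  implicit none
  real    :: serial_seconds      ! run time of the unparallelized program
  real    :: parallel_fraction   ! share of that time spent in parallelizable loops
  integer :: processors          ! number of processors sharing the loops
  real    :: parallel_seconds    ! best-case run time after parallelization
  real    :: savings_percent     ! best-case reduction in run time

  serial_seconds    = 120.0      ! two minutes, as in the example in the text
  parallel_fraction = 0.5        ! half of the time is in parallelizable loops
  processors        = 2

  ! Serial portion plus the parallelizable portion divided among processors.
  parallel_seconds = serial_seconds * ((1.0 - parallel_fraction) &
                     + parallel_fraction / real(processors))
  savings_percent  = 100.0 * (1.0 - parallel_seconds / serial_seconds)

  print '(a, f6.1, a)', 'Best-case parallel time: ', parallel_seconds, ' s'
  print '(a, f6.1, a)', 'Best-case time savings:  ', savings_percent, ' %'
end program amdahl_estimate
```

With the values shown, the program reports 90.0 seconds and a 25.0% savings, matching the figures in the text. The model ignores the thread start-up and shut-down cost described under “Overhead”, so measured times will generally be somewhat longer than this estimate.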
