PGPROF User's Guide - The Portland Group
Using PGPROF
Table 2 MPI Profiling Options
This MPI distribution...   Requires compiling and linking with these options...
MPICH1                     Deprecated. -Mprof=mpich1,{func|lines|time}
MPICH2                     Deprecated. -Mprof=mpich2,{func|lines|time}
MPICH v3                   -Mprof=mpich,{func|lines|time}
MVAPICH1                   Deprecated. -Mprof=mvapich1,{func|lines|time}
MVAPICH2                   Use MVAPICH2 compiler wrappers:
                           -profile={profcc|proffer}
                           -Mprof={func|lines|time}
MS-MPI                     -Mprof=msmpi,{func|lines}
Open MPI                   Use Open MPI compiler wrappers:
                           -Mprof={func|lines|time}
SGI MPI                    -Mprof=sgimpi,{func|lines|time}
For more details about how to compile an MPI program for profiling, refer to the 'Using MPI'
section of the PGI Compiler User's Guide.
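As a rough illustration of how the rows of Table 2 translate into an option string, the following Python sketch builds the -Mprof option for a given distribution and profiling mode. The function name and dictionary are hypothetical helpers written for this example, not part of any PGI tool; only the option spellings come from the table.

```python
# Hypothetical lookup built from Table 2. A value of None means the table
# lists no MPI sub-option because the distribution's own compiler wrappers
# are used instead.
MPROF_SUBOPTION = {
    "MPICH1": "mpich1",      # deprecated
    "MPICH2": "mpich2",      # deprecated
    "MPICH v3": "mpich",
    "MVAPICH1": "mvapich1",  # deprecated
    "MVAPICH2": None,        # wrappers also take -profile=...; not built here
    "MS-MPI": "msmpi",
    "Open MPI": None,
    "SGI MPI": "sgimpi",
}


def mprof_option(distribution, mode):
    """Return the -Mprof option string for one row of Table 2."""
    # Table 2 lists only func and lines for MS-MPI.
    allowed = ("func", "lines") if distribution == "MS-MPI" else ("func", "lines", "time")
    if mode not in allowed:
        raise ValueError(f"{mode!r} is not listed for {distribution}")
    sub = MPROF_SUBOPTION[distribution]
    return f"-Mprof={sub},{mode}" if sub else f"-Mprof={mode}"


print(mprof_option("MPICH v3", "lines"))  # -Mprof=mpich,lines
print(mprof_option("Open MPI", "time"))   # -Mprof=time
```

The same option is passed at both compile and link time, as the table heading states.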
Once you have built an instrumented version of your MPI application, running it as you normally
would produces the MPI profile data.
On successful program termination, one profile data file is created for each MPI process.
The master profile data file is named pgprof.out. The other files have names similar to
pgprof.out, but they are numbered.
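A quick sanity check after a run is to confirm that the number of profile data files matches the number of MPI processes. The Python sketch below does this over a list of file names. It assumes the numbered files match the pattern pgprof*.out (for example pgprof1.out); this guide does not spell out the exact naming, so adjust the pattern to what your run actually produces. The function name is hypothetical.

```python
from fnmatch import fnmatchcase

# Assumption: numbered profile files match "pgprof*.out". The guide only
# says the names are "similar to pgprof.out, but they are numbered".
PROFILE_PATTERN = "pgprof*.out"


def check_profile_files(names, expected_processes):
    """Return the profile data files found in names, sorted lexically.

    Raises RuntimeError when the count differs from the number of MPI
    processes, which suggests that some process did not terminate normally.
    """
    found = sorted(n for n in names if fnmatchcase(n, PROFILE_PATTERN))
    if len(found) != expected_processes:
        raise RuntimeError(
            f"expected {expected_processes} profile files, found {len(found)}"
        )
    return found


# Hypothetical directory listing from a two-process run.
listing = ["a.out", "pgprof1.out", "pgprof.out", "input.dat"]
print(check_profile_files(listing, 2))
```

In practice the listing would come from something like os.listdir() on the directory where the job ran.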
PGPROF MPI profiling collects counts of the number of messages and bytes sent and received.
You can then use this information to analyze a program's message passing behavior.
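One simple way to use these counts is to divide bytes by messages to get an average message size per process. A process that sends many very small messages may be paying more in per-message overhead than in data transfer. The sketch below assumes you have read the counts off the profiler display by hand; the numbers and names are hypothetical.

```python
def average_message_size(byte_count, message_count):
    """Average bytes per message; 0.0 when no messages were counted."""
    return byte_count / message_count if message_count else 0.0


# Hypothetical (messages sent, bytes sent) pairs for three MPI processes.
sends = [(4, 4096), (4000, 32000), (0, 0)]

for rank, (messages, nbytes) in enumerate(sends):
    size = average_message_size(nbytes, messages)
    print(f"process {rank}: {messages} messages, {size:.1f} bytes/message")
```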
Analyzing the Performance of MPI Programs
Figure 9 illustrates an MPI profile.
This sample shows an example MPI profile with maximum times and counts in the Statistics
Table, and per-process measurements in the Parallelism tab. The Parallelism tab for MPI
programs is used in the same way that it is used for multi-threaded programs, as described in
Analyzing the Performance of Multi-Threaded Programs.
You can use the message send and receive counts and the byte counts to identify potential
communication bottlenecks, and the process-specific data to find load imbalances.
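To make "load imbalance" concrete, a common measure is the ratio of the slowest process's time to the average time across processes: a value near 1.0 means the work is evenly spread, and larger values mean some processes wait on others. The Python sketch below applies this to per-process times copied by hand from the Parallelism tab. The metric, function name, and sample values are illustrative assumptions, not something PGPROF computes for you.

```python
from statistics import mean


def load_imbalance(per_process_times):
    """Ratio of the maximum per-process time to the mean (1.0 = balanced)."""
    avg = mean(per_process_times)
    return max(per_process_times) / avg if avg else 1.0


def slowest_process(per_process_times):
    """Index of the process with the largest time."""
    return max(range(len(per_process_times)), key=per_process_times.__getitem__)


# Hypothetical times in seconds for one routine on four MPI processes.
times = [4.0, 4.0, 4.0, 8.0]

print(load_imbalance(times))   # 1.6
print(slowest_process(times))  # 3
```

Here process 3 takes twice as long as the others, so the routine's elapsed time is set by that one process.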