27.12.2014 Views

QLogic OFED+ Host Software User Guide, Rev. B


E–Integration with a Batch Queuing System

Using SLURM for Batch Queuing

The sort | uniq -c component determines the number of times each unique line was printed. The awk command converts the result into the mpihosts file format used by mpirun. Each line consists of a node name, a colon, and the number of processes to start on that node.

NOTE:
This is one of two formats that the file can use. See “Console I/O in MPI Programs” on page 4-18 for more information.
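The counting and formatting steps described above can be sketched as follows. This is an illustration only, not the pipeline from batch_mpirun itself: the helper name hosts_to_mpihosts and the sample node names are assumptions, and the input is assumed to be one node name per line, repeated once for each process to start on that node.

```shell
# Hypothetical helper (not part of the QLogic software): read node names on
# stdin, one per line, and print "node:count" lines in the mpihosts format.
hosts_to_mpihosts() {
    # sort groups identical names so that uniq -c can count them;
    # uniq -c prints "<count> <name>", which awk reorders to "<name>:<count>".
    sort | uniq -c | awk '{ printf "%s:%s\n", $2, $1 }'
}

# Example input: node1 is listed twice and node2 once.
printf 'node1\nnode2\nnode1\n' | hosts_to_mpihosts
```

With the sample input shown, node1 is assigned two processes and node2 one.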

Simple Process Management

At this point, the script has enough information to be able to run an MPI program. The next step is to start the program when the batch system is ready, and notify the batch system when the job completes. This is done in the final part of batch_mpirun, for example:

mpirun -np $np -m $mpihosts_file "$mpi_prog" "$@"
exit_code=$?
scancel ${SLURM_JOBID}
rm -f $mpihosts_file
exit $exit_code
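The same run, notify, and clean-up sequence can also be wrapped in a function, which makes it easier to see why the exit status must be saved immediately after mpirun returns. The following is a minimal sketch under stated assumptions: the function name run_job and the MPIRUN and SCANCEL override variables are hypothetical names introduced here so that the launcher and the cancel command can be substituted; they are not part of batch_mpirun.

```shell
# Sketch only: run_job, MPIRUN, and SCANCEL are hypothetical names.
# Usage: run_job <mpihosts_file> <np> <mpi_prog> [program arguments...]
run_job() {
    hosts_file=$1
    np=$2
    shift 2

    # Start the MPI program; "$@" keeps arguments containing spaces intact.
    "${MPIRUN:-mpirun}" -np "$np" -m "$hosts_file" "$@"
    status=$?    # save now, because every later command overwrites $?

    # Notify the batch system that the job is finished (only under SLURM).
    if [ -n "${SLURM_JOBID:-}" ]; then
        "${SCANCEL:-scancel}" "$SLURM_JOBID"
    fi

    # Remove the temporary mpihosts file and report the program's status.
    rm -f "$hosts_file"
    return "$status"
}
```

A calling script would end with something like run_job "$mpihosts_file" "$np" "$mpi_prog" "$@" followed by exit $?, so that the batch system sees the exit status of the MPI program rather than that of rm.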

Clean Termination of MPI Processes

The InfiniPath software normally ensures clean termination of all MPI programs when a job ends, but in some rare circumstances an MPI process may remain alive, and potentially interfere with future MPI jobs. To avoid this problem, run a script before and after each batch job that kills all unwanted processes. QLogic does not provide such a script, but it is useful to know how to find out which processes on a node are using the QLogic interconnect. The easiest way to do this is with the fuser command, which is normally installed in /sbin.

Run these commands as a root user to ensure that all processes are reported.

# /sbin/fuser -v /dev/ipath<br />

/dev/ipath: 22648m 22651m<br />

In this example, processes 22648 and 22651 are using the QLogic interconnect. It is also possible to use this command (as a root user):

# lsof /dev/ipath<br />

This command displays a list of processes using InfiniPath. To report all processes, including stats programs, ipath_sma, diags, and others, run fuser with a wildcard:

# /sbin/fuser -v /dev/ipath*<br />

lsof can also take the same form:

# lsof /dev/ipath*<br />
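A site-specific clean-up script needs bare process IDs rather than the annotated list shown in the fuser example earlier in this section. The sketch below extracts them from a line in that format. It is an illustration only: the helper name pids_from_fuser_line is hypothetical, and the input is assumed to look like the sample output above (a device name, a colon, then PIDs that carry a trailing access-type letter such as m for a memory-mapped file).

```shell
# Hypothetical helper (not provided by QLogic): read lines such as
# "/dev/ipath: 22648m 22651m" on stdin and print one bare PID per line.
pids_from_fuser_line() {
    sed -e 's/^[^:]*://' |                  # drop the "device:" prefix
        tr -s ' ' '\n' |                    # put each entry on its own line
        sed -e 's/[^0-9]//g' -e '/^$/d'     # strip access letters, drop blanks
}

# Example using the sample output from this section.
printf '/dev/ipath: 22648m 22651m\n' | pids_from_fuser_line
```

Review the resulting list before acting on it. When the wildcard form is used, the list can include ipath_sma and other system processes that should normally be left running.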

E-4 D000046-005 B
