QLogic OFED+ Host Software User Guide, Rev. B
E–Integration with a Batch Queuing System
Using SLURM for Batch Queuing
The sort | uniq -c component counts the number of times each unique
line was printed. The awk command converts the result into the mpihosts file
format used by mpirun. Each line consists of a node name, a colon, and the
number of processes to start on that node.
NOTE:
This is one of two formats that the file can use. See “Console I/O in MPI
Programs” on page 4-18 for more information.
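The counting and formatting steps described above can be sketched as a small shell function. This is only an illustration, not the batch_mpirun code itself: the function name hosts_to_mpihosts and the sample node names are hypothetical, and the host list is fed in with printf instead of coming from the batch system.

```shell
# Hypothetical helper: read one host name per line on stdin (one line per
# process slot) and print "node:count" lines in the mpihosts format.
hosts_to_mpihosts() {
    # sort groups identical names, uniq -c prefixes each name with its count,
    # and awk swaps the two fields and joins them with a colon.
    sort | uniq -c | awk '{ print $2 ":" $1 }'
}

# Sample input standing in for the per-process host list (assumed names).
printf 'node02\nnode01\nnode02\nnode01\nnode03\n' | hosts_to_mpihosts
```

In a real script, the output would be redirected to the file that is later passed to mpirun with the -m option.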
Simple Process Management
At this point, the script has enough information to run an MPI program.
The next step is to start the program when the batch system is ready and to notify
the batch system when the job completes. The final part of batch_mpirun does
this, for example:
mpirun -np $np -m $mpihosts_file "$mpi_prog" "$@"
# Save the exit status of mpirun before any other command overwrites it
exit_code=$?
scancel ${SLURM_JOBID}
rm -f $mpihosts_file
exit $exit_code
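The same run, record, and clean-up pattern can be tried without an MPI installation. The sketch below is an assumption-laden stand-in, not part of batch_mpirun: run_and_cleanup is a hypothetical name, an ordinary command takes the place of mpirun, and scancel is called only when SLURM_JOBID is set and scancel is actually installed.

```shell
# Hypothetical helper: run a command, then clean up and return its status.
# $1 is the temporary hosts file; the remaining arguments are the command
# to run (in batch_mpirun this would be the mpirun invocation).
run_and_cleanup() {
    hosts_file=$1
    shift
    status=0
    "$@" || status=$?    # capture the status immediately after the command
    # Notify SLURM only when running inside a SLURM job with scancel present.
    if [ -n "${SLURM_JOBID:-}" ] && command -v scancel >/dev/null 2>&1; then
        scancel "$SLURM_JOBID"
    fi
    rm -f "$hosts_file"
    return "$status"
}

# Usage with a stand-in command that exits with status 3.
tmp_hosts=$(mktemp)
run_and_cleanup "$tmp_hosts" sh -c 'exit 3' || echo "job exited with status $?"
```

Saving the status in a variable matters because every later command, including rm, overwrites $?.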
Clean Termination of MPI Processes
The InfiniPath software normally ensures clean termination of all MPI programs
when a job ends, but in rare circumstances an MPI process may remain
alive and interfere with future MPI jobs. To avoid this problem, run a
script before and after each batch job that kills all unwanted processes. QLogic
does not provide such a script, but it is useful to know how to find out which
processes on a node are using the QLogic interconnect. The easiest way to do
this is with the fuser command, which is normally installed in /sbin.
Run these commands as a root user to ensure that all processes are reported.
# /sbin/fuser -v /dev/ipath
/dev/ipath: 22648m 22651m
In this example, processes 22648 and 22651 are using the QLogic interconnect. It
is also possible to use this command (as a root user):
# lsof /dev/ipath
This command displays a list of processes using InfiniPath. To list all
processes, including stats programs, ipath_sma, diags, and others, run
fuser with a wildcard:
# /sbin/fuser -v /dev/ipath*
lsof can also take the same form:
# lsof /dev/ipath*
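A cleanup script needs bare process IDs rather than the annotated form shown in the fuser example. The following sketch extracts them from a captured line; fuser_pids is a hypothetical name, the input is the sample line from the example above rather than live fuser output, and nothing is killed.

```shell
# Hypothetical helper: read fuser-style lines ("/dev/ipath: 22648m 22651m")
# on stdin and print one bare PID per line.
fuser_pids() {
    # Drop the "file:" prefix, put each entry on its own line, then strip
    # the trailing access-type letters (such as "m") from each PID.
    sed 's/^[^:]*://' |
        tr -s ' \t' '\n\n' |
        sed -n 's/^\([0-9][0-9]*\)[a-zA-Z]*$/\1/p'
}

# Sample input copied from the fuser example above.
printf '/dev/ipath: 22648m 22651m\n' | fuser_pids
```

A site-specific script could review the resulting list before deciding which processes are safe to terminate, since the wildcard form also reports legitimate users of the device such as ipath_sma.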
E-4 D000046-005 B