QLogic OFED+ Host Software User Guide, Rev. B
F–Troubleshooting<br />
<strong>QLogic</strong> MPI Troubleshooting<br />
See “Compiler Cannot Find Include, Module, or Library Files” on page F-14,<br />
“Compiling on Development Nodes” on page F-15, and “Specifying the Run-time<br />
Library Path” on page F-15 for additional information.<br />
Process Limitation with ssh<br />
MPI jobs that use more than eight processes per node may encounter an ssh<br />
throttling mechanism that limits the number of concurrent per-node connections<br />
to 10. If this happens, mpirun prints a message similar to the following:<br />
$ mpirun -m tmp -np 11 ~/mpi/mpiworld/mpiworld<br />
ssh_exchange_identification: Connection closed by remote host<br />
MPIRUN: Node program(s) exitted during connection setup<br />
If you encounter a message like this, you or your system administrator should<br />
increase the value of MaxStartups in the sshd configuration file (typically<br />
/etc/ssh/sshd_config).<br />
NOTE:<br />
This limitation applies only if -distributed=off is specified. By default,<br />
with -distributed=on, you will not normally have this problem.<br />
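The MaxStartups change described above can be sketched as follows. This is a minimal illustration, not a QLogic-supplied procedure: set_max_startups is a hypothetical helper, the value 32 is an arbitrary example, and GNU sed is assumed (for the -i and -E options). The demonstration edits a scratch file rather than the live /etc/ssh/sshd_config.

```shell
# Hypothetical helper (not part of OFED+ or OpenSSH): set MaxStartups in
# an sshd configuration file, replacing an existing or commented-out entry.
set_max_startups() {
    conf="$1"    # path to the sshd configuration file
    value="$2"   # new MaxStartups value
    if grep -Eq '^[[:space:]]*#?[[:space:]]*MaxStartups[[:space:]]' "$conf"; then
        # An entry exists (possibly commented out): rewrite it in place.
        sed -i -E "s|^[[:space:]]*#?[[:space:]]*MaxStartups[[:space:]].*|MaxStartups $value|" "$conf"
    else
        # No entry yet: append one.
        printf 'MaxStartups %s\n' "$value" >> "$conf"
    fi
}

# Demonstrate on a scratch file so the live configuration is untouched.
demo_conf=$(mktemp)
printf '#MaxStartups 10\n' > "$demo_conf"
set_max_startups "$demo_conf" 32
grep MaxStartups "$demo_conf"
```

When the real configuration file is edited, sshd must reload its configuration before the new value takes effect; the command for doing so depends on the distribution.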
Number of Processes Exceeds ulimit for Number of Open Files<br />
When the number of processes is scaled up beyond the number of open files<br />
allowed by ulimit, mpirun prints a message. The ulimit for the number of open<br />
files is typically 1024 on both Red Hat and SLES systems. The message looks<br />
similar to the following:<br />
MPIRUN.up001: Warning: ulimit for the number of open files is only<br />
1024, but this mpirun request requires at least <br />
open files (sockets). The shell ulimit for open files needs to be<br />
increased.<br />
This is due to limit:<br />
descriptors 1024<br />
The ulimit can be increased; <strong>QLogic</strong> recommends a value<br />
approximately 20 percent higher than the number of CPUs. For example, with<br />
2048 CPUs, the ulimit can be increased to 2500:<br />
ulimit -n 2500<br />
The ulimit needs to be increased only on the host where mpirun is started,<br />
unless the mode of operation allows mpirun to be started from any node.<br />
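The 20 percent rule above can be worked through in the shell. This is a sketch only: recommended_nofile is a hypothetical helper, and the CPU count of 2048 is taken from the example in the text. The exact result for 2048 CPUs is 2458, which the guide rounds up to 2500.

```shell
# Hypothetical helper: CPU count plus 20 percent, rounded up to an integer.
recommended_nofile() {
    cpus="$1"
    echo $(( (cpus * 12 + 9) / 10 ))
}

total_cpus=2048                          # example value from the text
needed=$(recommended_nofile "$total_cpus")
current=$(ulimit -n)                     # current limit for open files

echo "recommended: $needed, current limit: $current"
if [ "$current" != "unlimited" ] && [ "$current" -lt "$needed" ]; then
    # Only print the suggestion; the limit must be raised in the shell
    # that will start mpirun.
    echo "suggested command: ulimit -n $needed"
fi
```

A user without root privileges cannot raise the limit above the hard limit set by the administrator, so ulimit -n may fail if the requested value exceeds that hard limit.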