QLogic OFED+ Host Software User Guide, Rev. B
3–TrueScale Cluster Setup and Administration<br />
If you are using the Installer tool, you can set the OpenSM default behavior at the<br />
time of installation.<br />
OpenSM only needs to be enabled on the node that acts as the subnet manager,<br />
so use the chkconfig command (as a root user) to enable it on the node where it<br />
will be run:<br />
chkconfig opensmd on<br />
The command to disable it on reboot is:<br />
chkconfig opensmd off<br />
You can start opensmd without rebooting your machine by typing:<br />
/etc/init.d/opensmd start<br />
You can stop opensmd again by typing:<br />
/etc/init.d/opensmd stop<br />
If you want to pass any arguments to the OpenSM program, modify the following<br />
file, and add the arguments to the OPTIONS variable:<br />
/etc/init.d/opensmd<br />
For example:<br />
Use the UPDN algorithm instead of the Min Hop algorithm.<br />
OPTIONS="-R updn"<br />
For more information on OpenSM, see the OpenSM man pages, or look on the<br />
OpenFabrics web site.<br />
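The OPTIONS edit described above can also be made non-interactively. A minimal sketch, assuming the init script defines an OPTIONS variable as shown:<br />

```shell
# Rewrite the OPTIONS line in the OpenSM init script to select the UPDN
# routing engine, keeping a .bak copy of the original, then restart.
# (Run as root; path as given in the text above.)
sed -i.bak 's/^OPTIONS=.*/OPTIONS="-R updn"/' /etc/init.d/opensmd
/etc/init.d/opensmd restart
```

The `-R updn` argument is passed through to the opensm program itself; any other opensm option documented in its man page can be set the same way.<br />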
<strong>QLogic</strong> Distributed Subnet Administration<br />
As InfiniBand clusters are scaled into the petaflop range and beyond, a more<br />
efficient method for handling queries to the Fabric Manager is required. The<br />
problem is that while the Fabric Manager can configure and operate that many<br />
nodes, under certain conditions it can become overloaded with queries from<br />
those same nodes.<br />
For example, consider an InfiniBand fabric consisting of 1,000 nodes, each with 4<br />
processors. When a large MPI job is started across the entire fabric, each process<br />
needs to collect InfiniBand path records for every other node in the fabric, and<br />
every process queries the subnet manager for these path records at roughly the<br />
same time. This amounts to a total of nearly 4 million path queries just to start<br />
the job!<br />
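That figure can be checked with a quick calculation, using the node and processor counts from the example above:<br />

```shell
# 1,000 nodes x 4 processors = 4,000 MPI processes; each process queries
# path records to the 999 other nodes in the fabric.
nodes=1000
procs_per_node=4
echo $(( nodes * procs_per_node * (nodes - 1) ))   # 3996000, i.e. ~4 million
```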
In the past, MPI implementations have sidestepped this problem by hand-crafting<br />
path records themselves, but this approach cannot be used when advanced fabric<br />
management techniques such as virtual fabrics and mesh/torus configurations are<br />
in use. In such cases, only the subnet manager itself has enough information to<br />
correctly build a path record between two nodes.<br />
D000046-005 B 3-13