QLogic OFED+ Host Software User Guide, Rev. B
3–TrueScale Cluster Setup and Administration
Performance Settings and Management Tips
On Intel Harpertown CPUs, it may be beneficial to add a
pcie_coalesce=1 parameter to this line.
On AMD CPUs (PCIe Gen1), no ib_qib parameter changes are
recommended.
Alternatively, these PCIe parameters can also be set in the BIOS on some
systems.
• Use a PCIe Max Payload size of 256 bytes, where available, with the
QLE7340, QLE7342, QLE7240, and QLE7280. The QLE7240 and QLE7280
adapters can support 128, 256, or 512 bytes. This value is typically set by
the BIOS to the largest size supported by both the PCIe card and the PCIe
root complex.
• Make sure that write combining is enabled. The x86 Page Attribute Table
(PAT) mechanism, which allocates Write Combining (WC) mappings for the
PIO buffers, has been added and is now the default. If PAT is unavailable or
PAT initialization fails, the code generates a message in the log and falls
back to the MTRR mechanism. See Appendix H Write Combining for more
information.
• Check the PCIe bus width. If a slot has a smaller electrical width than
its mechanical width, performance may be lower than expected. Use this
command to check the PCIe bus width:
$ ipath_control -iv
This command also shows the link speed.
• Experiment with non-default CPU affinity while running
single-process-per-node latency or bandwidth benchmarks. Latency
may be slightly lower when using CPUs (cores) other than the default. On
some chipsets, bandwidth may be higher when run from a non-default CPU
or core. See “Performance Tuning” on page 4-23 for more information on
using taskset with QLogic MPI. With another MPI, consult its documentation
to see how to force a benchmark to run with a CPU affinity other than the
default. With OFED micro benchmarks, such as those in the qperf or
perftest suites, taskset can be used to set CPU affinity.
• Turn C-states off to improve MPI latency (ping-pong) benchmarks on
Nehalem systems. In the BIOS, look for advanced CPU settings, and set the
C-State parameter to "disable."
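
The PCIe Max Payload and bus-width checks in the list above can also be cross-checked with the general-purpose lspci utility. This is a swapped-in alternative to ipath_control, not a method this guide prescribes. The following is a minimal sketch, assuming that lspci -vv prints the usual DevCap, DevCtl, LnkCap, and LnkSta lines for the adapter; the pcie_summary function name and the 07:00.0 slot address are hypothetical.

```shell
#!/bin/sh
# Summarize PCIe Max Payload and link width from `lspci -vv` text on stdin.
# Assumption: the input covers a single device (select it with `lspci -s`).
#
# Example (hypothetical slot address; root is usually needed to read
# the capability fields):
#   lspci -vv -s 07:00.0 | pcie_summary
pcie_summary() {
    awk '
        # Remember which register block the current line belongs to.
        /DevCap:/ { section = "cap" }
        /DevCtl:/ { section = "ctl" }

        # MaxPayload appears under DevCap (supported) and DevCtl (in use).
        match($0, /MaxPayload [0-9]+/) {
            value = substr($0, RSTART + 11, RLENGTH - 11)
            if (section == "cap") payload_cap = value
            if (section == "ctl") payload_ctl = value
        }

        # LnkCap is the width the adapter can use; LnkSta is what was negotiated.
        /LnkCap:/ && match($0, /Width x[0-9]+/) {
            width_cap = substr($0, RSTART + 7, RLENGTH - 7)
        }
        /LnkSta:/ && match($0, /Width x[0-9]+/) {
            width_sta = substr($0, RSTART + 7, RLENGTH - 7)
        }

        END {
            printf "MaxPayload: supported=%s active=%s\n", payload_cap, payload_ctl
            printf "LinkWidth: capable=x%s negotiated=x%s\n", width_cap, width_sta
            if (width_sta + 0 < width_cap + 0)
                print "WARNING: link is narrower than the adapter supports"
        }
    '
}
```

A negotiated width below the capable width suggests a slot whose electrical width is smaller than its mechanical width. An active Max Payload below the supported value usually reflects the limit of the root complex or a BIOS setting.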
D000046-005 B 3-27