
QLogic OFED+ Host Software User Guide, Rev. B


3–TrueScale Cluster Setup and Administration

Performance Settings and Management Tips

On Intel Harpertown CPUs, it may be beneficial to add a pcie_coalesce=1 parameter to this line.

On AMD CPUs (PCIe Gen1), no ib_qib parameter changes are recommended.

Alternatively, these PCIe parameters can be set in the BIOS on some systems.

• Use a PCIe Max Payload size of 256 bytes, where available, with the QLE7340, QLE7342, QLE7240, and QLE7280. The QLE7240 and QLE7280 adapters can support 128, 256, or 512 bytes. This value is typically set by the BIOS as the minimum value supported by both the PCIe card and the PCIe root complex.

• Make sure that write combining is enabled. The x86 Page Attribute Table (PAT) mechanism that allocates Write Combining (WC) mappings for the PIO buffers has been added and is now the default. If PAT is unavailable or PAT initialization fails for some reason, the code will generate a message in the log and fall back to the MTRR mechanism. See Appendix H, Write Combining, for more information.

• Check the PCIe bus width. If slots have a smaller electrical width than mechanical width, lower-than-expected performance may occur. Use this command to check the PCIe bus width:

$ ipath_control -iv

This command also shows the link speed.

• Experiment with non-default CPU affinity while running single-process-per-node latency or bandwidth benchmarks. Latency may be slightly lower when using CPUs (cores) other than the default. On some chipsets, bandwidth may be higher when run from a non-default CPU or core. See “Performance Tuning” on page 4-23 for more information on using taskset with QLogic MPI. With another MPI, look at its documentation to see how to force a benchmark to run with a CPU affinity different from the default. With OFED micro benchmarks, such as those from the qperf or perftest suites, taskset will work for setting CPU affinity.
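The affinity experiment described in the last item can be scripted. The following is a minimal sketch, not a QLogic-supplied tool: the function name sweep_cpus and the TASKSET override variable are illustrative assumptions, and the only external command it relies on is the standard taskset utility with its -c (CPU list) option. It runs the same command once per listed CPU so the results can be compared.

```shell
#!/bin/sh
# sweep_cpus CPULIST CMD [ARGS...]
# Hypothetical helper: runs CMD once for each CPU in the comma-separated
# CPULIST, pinning every run to that single CPU with "taskset -c".
# TASKSET is an illustrative override (for example, for a dry run);
# by default the real taskset command is used.
sweep_cpus() {
    cpulist=$1
    shift
    for cpu in $(printf '%s\n' "$cpulist" | tr ',' ' '); do
        echo "== cpu $cpu =="
        # Stop at the first failing run so a broken benchmark is noticed.
        ${TASKSET:-taskset} -c "$cpu" "$@" || return 1
    done
}

# Example (the server name is a placeholder, not from this guide):
#   sweep_cpus 0,1,2,3 ib_send_lat <server>
```

Each run prints a header naming the CPU it was pinned to, followed by the benchmark's own output. Which CPU gives the best result depends on the chipset, so the list of CPUs to try is left to the caller.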

Turn C-state Off to improve MPI latency (ping-pong) benchmarks on Nehalem systems. In the BIOS, look for advanced CPU settings, and set the C-State parameter to "disable."
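After changing the BIOS setting, it can be useful to see which idle states the running kernel still exposes. The sketch below assumes a Linux kernel that provides the cpuidle sysfs interface under /sys/devices/system/cpu/cpuN/cpuidle; older kernels may not have this directory, and the function name list_idle_states is illustrative. The states listed are those offered by the kernel's idle driver, which may not track the BIOS setting exactly, so treat the output as a hint rather than proof.

```shell
#!/bin/sh
# list_idle_states [CPU_DIR]
# Hypothetical helper: prints "stateN: NAME" for every idle state that
# cpuidle exposes for one CPU. CPU_DIR defaults to cpu0's sysfs directory
# (an assumption; pass another directory to inspect a different CPU).
list_idle_states() {
    cpu_dir=${1:-/sys/devices/system/cpu/cpu0}
    if [ ! -d "$cpu_dir/cpuidle" ]; then
        echo "no cpuidle directory under $cpu_dir" >&2
        return 1
    fi
    for state in "$cpu_dir"/cpuidle/state*; do
        # Skip the unexpanded glob and any entry without a readable name.
        [ -r "$state/name" ] || continue
        printf '%s: %s\n' "${state##*/}" "$(cat "$state/name")"
    done
}

# Example:
#   list_idle_states
```

If only shallow states are listed after the change, deeper C-states are probably no longer in use. If deeper states still appear, check the idle driver's documentation before concluding that the BIOS setting had no effect.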

