QLogic OFED+ Host Software User Guide, Rev. B
F–Troubleshooting<br />
Performance Issues<br />
In large cluster configurations, switches may be attached to other switches to<br />
supply the necessary inter-node connectivity. Problems with these inter-switch (or<br />
intermediate) links are sometimes more difficult to diagnose than failure of the<br />
final link between a switch and a node. The failure of an intermediate link may<br />
allow some traffic to pass through the fabric while other traffic is blocked or<br />
degraded.<br />
If you notice this behavior in a multi-layer fabric, check that all switch cable<br />
connections are correct. Statistics for managed switches are available on a<br />
per-port basis, and may help with debugging. See your switch vendor for more<br />
information.<br />
QLogic recommends using FastFabric to help diagnose this problem. If<br />
FastFabric is not installed in the fabric, there are two diagnostic tools, ibhosts<br />
and ibtracert, that may also be helpful. The tool ibhosts lists all the<br />
InfiniBand nodes that the subnet manager recognizes. To check the InfiniBand<br />
path between two nodes, use the ibtracert command.<br />
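As a rough illustration of how the two tools fit together, the sketch below wraps them in a small helper. The function name check_path and the example LIDs are hypothetical; the sketch assumes ibhosts and ibtracert are installed and on the PATH, and that ibtracert is given source and destination LIDs (its default addressing mode).

```shell
# check_path SRC_LID DST_LID  (hypothetical helper, not part of any QLogic tool)
# Lists the hosts the subnet manager recognizes, then traces the path
# between the two LIDs. Typically needs to be run as root.
check_path() {
    src_lid=$1
    dst_lid=$2
    if ! command -v ibhosts >/dev/null 2>&1 ||
       ! command -v ibtracert >/dev/null 2>&1; then
        echo "ibhosts/ibtracert not found; install the InfiniBand diagnostic tools" >&2
        return 1
    fi
    ibhosts || return 1                 # confirm both nodes appear in the host list
    ibtracert "$src_lid" "$dst_lid"     # show each switch hop along the path
}

# Example (LIDs 1 and 2 are placeholders; use LIDs from your own fabric):
#   check_path 1 2
```

If a node is missing from the ibhosts listing, or the trace stops at an intermediate switch, the cabling at that hop is a reasonable place to start checking.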
Performance Issues<br />
The following sections discuss known performance issues.<br />
Large Message Receive Side Bandwidth Varies with<br />
Socket Affinity on Opteron Systems<br />
On Opteron systems, when using the QLE7240 or QLE7280 in DDR mode, there<br />
is a receive side bandwidth bottleneck for CPUs that are not adjacent to the PCI<br />
Express root complex. This may cause performance to vary. The bottleneck is<br />
most obvious when using SendDMA with large messages on the farthest sockets.<br />
The best case for SendDMA is when both sender and receiver are on the closest<br />
sockets. Overall performance for PIO (and smaller messages) is better than with<br />
SendDMA.<br />
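One way to experiment with socket placement is to find the NUMA node closest to the adapter and bind the process there. The following is a minimal sketch, not a QLogic-documented procedure: it assumes a kernel that exposes the adapter under /sys/class/infiniband with a device/numa_node file, and the helper name hca_numa_node and the device name in the example are hypothetical.

```shell
# hca_numa_node DEVICE [SYSFS_ROOT]  (hypothetical helper)
# Prints the NUMA node the adapter is attached to, or -1 when the
# kernel does not report one.
hca_numa_node() {
    dev=$1
    root=${2:-/sys/class/infiniband}
    f="$root/$dev/device/numa_node"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "-1"
    fi
}

# Example (the device name qib0 and ./my_app are placeholders):
#   node=$(hca_numa_node qib0)
#   [ "$node" -ge 0 ] && numactl --cpunodebind="$node" --membind="$node" ./my_app
```

Running the same benchmark bound to each socket in turn should show whether the bandwidth variation described above applies to a given system.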
Erratic Performance<br />
Applications that use interrupts sometimes show erratic performance. An<br />
example is inconsistent SDP latency when running a program such as netperf.<br />
This may be seen on AMD-based systems using the QLE7240 or QLE7280<br />
adapters. If this happens, check whether the irqbalance program is running.<br />
This program is a Linux daemon that distributes interrupts across processors.<br />
However, it may interfere with existing interrupt request (IRQ) affinity settings,<br />
introducing timing anomalies. After stopping this process (as a root user), bind<br />
the IRQ to a CPU for more consistent performance. First, stop irqbalance:<br />
# /sbin/chkconfig irqbalance off<br />
# /etc/init.d/irqbalance stop<br />
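Once irqbalance is stopped, binding an IRQ to a CPU on Linux generally means writing a hexadecimal CPU mask to /proc/irq/N/smp_affinity. The sketch below illustrates that general mechanism under stated assumptions: the helper names cpu_mask and bind_irq are hypothetical, as are the IRQ and CPU numbers in the example. The adapter's IRQ number can usually be found in /proc/interrupts.

```shell
# cpu_mask CPU  (hypothetical helper)
# Prints the hexadecimal affinity mask with only bit CPU set.
# Intended for CPU numbers below 32; larger systems use
# comma-separated 32-bit groups, which this sketch does not produce.
cpu_mask() {
    printf '%x\n' $((1 << $1))
}

# bind_irq IRQ CPU  (hypothetical helper; must be run as root)
# Writes the mask for CPU to the kernel's affinity file for IRQ.
bind_irq() {
    cpu_mask "$2" > "/proc/irq/$1/smp_affinity"
}

# Example (IRQ 24 and CPU 2 are placeholders):
#   bind_irq 24 2
#   cat /proc/irq/24/smp_affinity
```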
D000046-005 B F-9