
QLogic OFED+ Host Software User Guide, Rev. B



F–Troubleshooting


In large cluster configurations, switches may be attached to other switches to supply the necessary inter-node connectivity. Problems with these inter-switch (or intermediate) links are sometimes more difficult to diagnose than failure of the final link between a switch and a node. The failure of an intermediate link may allow some traffic to pass through the fabric while other traffic is blocked or degraded.

If you notice this behavior in a multi-layer fabric, check that all switch cable connections are correct. Statistics for managed switches are available on a per-port basis and may help with debugging. See your switch vendor for more information.

QLogic recommends using FastFabric to help diagnose this problem. If FastFabric is not installed in the fabric, two other diagnostic tools, ibhosts and ibtracert, may also be helpful. The ibhosts tool lists all the InfiniBand nodes that the subnet manager recognizes. To check the InfiniBand path between two nodes, use the ibtracert command.
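As a minimal sketch of such a check (the LID values below are placeholders, not from this guide; obtain real LIDs with ibstat or from the subnet manager):

```shell
# List every InfiniBand node the subnet manager recognizes
ibhosts

# Trace the fabric path between two nodes, addressed by LID.
# LIDs 1 and 9 are placeholders; find the actual LIDs of the
# endpoints with "ibstat" on each node.
ibtracert 1 9
```

The ibtracert output walks the route hop by hop, so a failing or missing intermediate link shows up as the point where the trace stops.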

Performance Issues

The following sections discuss known performance issues.

Large Message Receive Side Bandwidth Varies with Socket Affinity on Opteron Systems

On Opteron systems, when using the QLE7240 or QLE7280 in DDR mode, there is a receive-side bandwidth bottleneck for CPUs that are not adjacent to the PCI Express root complex. This may cause performance to vary. The bottleneck is most obvious when using SendDMA with large messages on the farthest sockets. The best case for SendDMA is when both sender and receiver are on the sockets closest to the root complex. Overall performance for PIO (and smaller messages) is better than with SendDMA.
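One way to control this placement (a sketch, not a procedure from this guide) is to pin the bandwidth-sensitive process to the socket nearest the root complex with numactl; NUMA node 0 is an assumption here, so verify the actual topology first:

```shell
# Inspect the NUMA layout to identify which node's CPUs sit
# closest to the PCI Express root complex (cross-check with
# "lspci -tv" to see where the adapter attaches)
numactl --hardware

# Run the process on that node's CPUs and memory; "./osu_bw"
# is a placeholder for your bandwidth benchmark or application
numactl --cpunodebind=0 --membind=0 ./osu_bw
```

Comparing runs pinned to the nearest and farthest nodes makes the receive-side bottleneck described above directly measurable.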

Erratic Performance

Erratic performance is sometimes seen in applications that use interrupts; an example is inconsistent SDP latency when running a program such as netperf. This may be seen on AMD-based systems using the QLE7240 or QLE7280 adapters. If this happens, check whether the irqbalance program is running. This program is a Linux daemon that distributes interrupts across processors. However, it may interfere with prior interrupt request (IRQ) affinity settings, introducing timing anomalies. After stopping this process (as the root user), bind the IRQ to a CPU for more consistent performance. First, stop irqbalance:

# /sbin/chkconfig irqbalance off
# /etc/init.d/irqbalance stop
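The IRQ can then be pinned through the standard /proc interface. A sketch, with assumed values (the interrupt name and IRQ number below are placeholders; check /proc/interrupts for the adapter's actual entry):

```shell
# Locate the adapter's IRQ number; "ipath" is an assumed name
# for the QLogic driver's entry in /proc/interrupts
grep ipath /proc/interrupts

# Pin that IRQ (48 is a placeholder) to CPU 0 by writing a
# hexadecimal CPU mask to its smp_affinity file (run as root)
echo 1 > /proc/irq/48/smp_affinity
```

The mask is a bitmap of CPUs, so for example a mask of 2 would select CPU 1, and 4 would select CPU 2.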

D000046-005 B F-9
