09.03.2015 Views

VSAN-Troubleshooting-Reference-Manual


Diagnostics and Troubleshooting Reference Manual – Virtual SAN

Physical network switch configurations and flow control

There have been situations where misbehaving network switches have led to Virtual SAN network outages. Symptoms include hosts being unable to communicate and exceedingly high latency reported for virtual machine I/O in VSAN Observer. However, when latency is examined at the VSAN Disks layer, there is no latency, which immediately points to the latency being incurred at the network layer.
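
The reasoning above can be written down as a small triage rule. This is only a sketch: the function name, the labels it returns, and the 5 ms threshold are hypothetical choices made for illustration, and the two latency figures are assumed to be read by hand from the VSAN Observer graphs.

```python
def latency_source(vm_latency_ms, disk_latency_ms, threshold_ms=5.0):
    """Rough triage of where latency is being incurred.

    vm_latency_ms   -- latency seen for virtual machine I/O (read from VSAN Observer)
    disk_latency_ms -- latency seen at the VSAN Disks layer (read from VSAN Observer)
    threshold_ms    -- hypothetical cut-off for "high" latency; tune for your environment
    """
    if vm_latency_ms < threshold_ms:
        return "ok"
    if disk_latency_ms < threshold_ms:
        # High latency for VM I/O but none at the disks: look at the network.
        return "suspect-network"
    return "suspect-disk"


print(latency_source(vm_latency_ms=250.0, disk_latency_ms=1.0))
```

A "suspect-network" result is a hint about where to look next, not a diagnosis.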

In one case, the physical network switch in question was observed sending an excessive number of pause frames. Pause frames are a flow control mechanism designed to stop or reduce the transmission of packets for an interval of time. This behavior negatively impacted Virtual SAN network performance.
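
To give a sense of the "interval of time" involved, the sketch below converts the pause time carried in a pause frame into microseconds. It relies on the IEEE 802.3 definition of the pause time as a 16-bit count of quanta, each equal to 512 bit times; the function name and the link speeds used are illustrative assumptions, not values from the outage described here.

```python
PAUSE_QUANTUM_BITS = 512   # one pause quantum is 512 bit times
MAX_PAUSE_QUANTA = 0xFFFF  # the pause time field is 16 bits wide


def pause_duration_us(quanta, link_speed_bps):
    """Return how long one pause frame asks the sender to stay quiet, in microseconds."""
    if not 0 <= quanta <= MAX_PAUSE_QUANTA:
        raise ValueError("pause time must fit in 16 bits")
    return quanta * PAUSE_QUANTUM_BITS / link_speed_bps * 1e6


# Longest pause a single frame can request on 10 Gb/s and 1 Gb/s links.
for speed_bps in (10_000_000_000, 1_000_000_000):
    print(speed_bps, round(pause_duration_us(MAX_PAUSE_QUANTA, speed_bps), 1))
```

A single pause frame can hold off a 10 Gb/s sender for only a few milliseconds, so a lasting performance impact implies a sustained stream of pause frames.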

ethtool

The ethtool command on the ESXi host can be used to check flow control settings. Here is an example of its output:

~ # ethtool -a vmnic4
Pause parameters for vmnic4:
Autonegotiate: on
RX: off
TX: off

This output shows that auto-negotiate is set to on, which is recommended for ESXi host NICs, but that no flow control is enabled for this NIC's link (RX and TX are both off).
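
When the same check has to be repeated across many hosts or NICs, the output can be parsed rather than read by eye. The following is a minimal sketch that assumes the text of `ethtool -a` has already been captured into a string in the format shown above; the function names `parse_pause_params` and `flow_control_enabled` are hypothetical helpers, not part of ethtool or ESXi.

```python
def parse_pause_params(text):
    """Parse captured `ethtool -a` output into a dict of booleans (True means "on")."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "Pause parameters for <nic>:" banner.
        if not line or line.startswith("Pause parameters"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            params[key.strip().lower()] = value.strip().lower() == "on"
    return params


def flow_control_enabled(params):
    """True if pause frames are enabled in either direction."""
    return bool(params.get("rx") or params.get("tx"))


SAMPLE = """Pause parameters for vmnic4:
Autonegotiate: on
RX: off
TX: off
"""

params = parse_pause_params(SAMPLE)
print(params, flow_control_enabled(params))
```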

In the example outage discussed earlier, there were excessive numbers of pause frames in the RX field, with values in the millions. In this case, one troubleshooting step might be to disable the flow control mechanisms on the switch while further investigation into the root cause takes place.
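
A raw count in the millions is easier to judge as a rate. The sketch below compares two readings of a received pause frame counter taken some seconds apart. How the counter is obtained is left open, since counter names vary by NIC driver, and both function names and the 1000 frames-per-second limit are hypothetical values chosen for illustration only.

```python
def pause_frame_rate(count_before, count_after, interval_s):
    """Pause frames received per second between two counter readings."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    if count_after < count_before:
        raise ValueError("counter went backwards (reset or wrap)")
    return (count_after - count_before) / interval_s


def is_excessive(rate_per_s, limit_per_s=1000.0):
    """Compare against a hypothetical limit; pick one that suits your environment."""
    return rate_per_s > limit_per_s


# Illustrative readings taken 60 seconds apart (not data from the outage above).
rate = pause_frame_rate(1_000_000, 4_000_000, 60)
print(rate, is_excessive(rate))
```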

Physical network switch feature interoperability

There have been situations where certain features, when enabled together on a physical network switch, did not interoperate correctly. In one example, a customer attempted to use multicast with jumbo frames, and because the network switch could not handle both features at once, the whole of the Virtual SAN network was impacted. Note that many other physical switches handled this combination perfectly; this was an issue with one switch vendor only.

Verify that the physical network switch can support multiple network features enabled concurrently.

VMWARE STORAGE BU DOCUMENTATION / 122
