VSAN-Troubleshooting-Reference-Manual
Diagnostics and Troubleshooting Reference Manual – Virtual SAN<br />
What is I/O per second (IOPS)?<br />
IOPS measures the number of input/output operations per second that a storage<br />
system performs. An I/O operation is typically a read or a write of a given size. I/O<br />
sizes can vary from a few bytes to several megabytes.<br />
If high IOPS is observed in Virtual SAN, it does not necessarily mean that there is a<br />
problem. It could simply mean that the storage is being used to its maximum. For<br />
example, a disk clone operation may try to use all available IOPS to complete the<br />
operation in the least possible time. Similarly, a low IOPS value does not mean that<br />
there is an immediate problem either; it could simply be that the I/O sizes are very<br />
large, and larger I/Os mean fewer operations per second. Typically, larger numbers<br />
for IOPS are better.<br />
What is bandwidth?<br />
Bandwidth (or throughput) measures the data rate that a storage device is capable of,<br />
or to put it another way, how much data is being transferred per unit of time.<br />
IOPS and bandwidth are related. When small I/O sizes (e.g. 4KB) are involved, a disk<br />
device may hit the maximum IOPS ceiling before exhausting the available bandwidth<br />
provided by the device, or controller, or the underlying physical link. Conversely,<br />
for large I/O sizes (e.g. 1MB) the bandwidth may become a limiting factor before the<br />
maximum IOPS of a device or controller is reached. Larger numbers for bandwidth<br />
are better.<br />
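
The relationship between IOPS, I/O size and bandwidth described above can be sketched with a little arithmetic: throughput is roughly IOPS multiplied by I/O size. The following Python sketch is an illustration only. The function names and the device limits (20,000 IOPS and 500 MB/s) are assumptions made up for the example and do not come from this manual or from any Virtual SAN tool.

```python
def throughput_mb_per_s(iops: float, io_size_kb: float) -> float:
    """Data rate in MB/s implied by an IOPS figure at a fixed I/O size."""
    return iops * io_size_kb / 1024.0


def limiting_factor(io_size_kb: float, max_iops: float,
                    max_bandwidth_mb_s: float) -> str:
    """Report which ceiling a device reaches first at the given I/O size."""
    # IOPS that would be needed to saturate the bandwidth ceiling at this size.
    iops_to_fill_bandwidth = max_bandwidth_mb_s * 1024.0 / io_size_kb
    # If the device runs out of IOPS before it can fill the pipe, IOPS is the limit.
    return "iops" if iops_to_fill_bandwidth > max_iops else "bandwidth"


if __name__ == "__main__":
    # Hypothetical device limits, chosen only for illustration.
    MAX_IOPS = 20_000
    MAX_BANDWIDTH_MB_S = 500
    for size_kb in (4, 64, 1024):
        print(size_kb, "KB ->", limiting_factor(size_kb, MAX_IOPS, MAX_BANDWIDTH_MB_S))
```

With these assumed limits, 4KB I/Os reach the IOPS ceiling at about 78 MB/s, well short of the bandwidth limit, whereas 1MB I/Os saturate the bandwidth after only 500 operations per second.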
When troubleshooting storage performance, look at IOPS, I/O sizes, outstanding I/O,<br />
and latency to get a complete picture. As the size of your I/O increases, throughput<br />
and latency increase in an almost linear fashion, while at the same time the number<br />
of I/Os completed per second (IOPS) decreases.<br />
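
One way to see why these metrics need to be read together is Little's Law, a general queueing result (not something stated in this manual): in steady state, outstanding I/O is approximately IOPS multiplied by average latency. The sketch below is a minimal illustration of that relationship; the function names are hypothetical.

```python
def iops_from_outstanding(outstanding_io: float, latency_ms: float) -> float:
    """Little's Law rearranged: IOPS = outstanding I/Os / average latency."""
    # Latency is given in milliseconds, so scale by 1000 to get per-second.
    return outstanding_io * 1000.0 / latency_ms


def latency_ms_from_iops(outstanding_io: float, iops: float) -> float:
    """Average latency (ms) implied by a queue depth and an observed IOPS."""
    return outstanding_io * 1000.0 / iops


if __name__ == "__main__":
    # 32 outstanding I/Os at 2 ms average latency.
    print(iops_from_outstanding(32, 2.0))
    # The same queue depth observed at only 8,000 IOPS implies higher latency.
    print(latency_ms_from_iops(32, 8000))
```

Under this approximation, a workload with a fixed number of outstanding I/Os cannot raise its IOPS unless latency falls, which is why a low IOPS figure on its own says little without the queue depth and latency next to it.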
What is congestion?<br />
Congestion in Virtual SAN happens when the lower layers fail to keep up with the<br />
I/O rate of higher layers.<br />
For example, if virtual machines are performing a lot of write operations, the write<br />
buffers on the flash cache device can fill up. In hybrid configurations, these buffers<br />
have to be destaged to magnetic disks. However, destaging can only proceed at the rate<br />
that the magnetic disks can handle, which is typically much slower than flash device<br />
performance. This can cause Virtual SAN to artificially introduce latency in the<br />
virtual machines in order to slow down writes to the flash device so that write<br />
buffers can be freed up.<br />
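
The write-buffer example can be illustrated with a toy model: a buffer fills at the incoming write rate and drains at the destage rate, and once it is full the upper layer has to be slowed down. This is a deliberately simplified sketch and is not how Virtual SAN actually calculates congestion; the function names and all of the rates and sizes are assumptions chosen for illustration.

```python
from typing import List, Optional


def simulate_write_buffer(incoming_mb_s: float, destage_mb_s: float,
                          buffer_mb: float, seconds: int) -> List[float]:
    """Return the buffer fill level (MB) at the end of each second."""
    fill = 0.0
    levels = []
    for _ in range(seconds):
        # Net change is what arrives minus what is destaged in that second.
        fill += incoming_mb_s - destage_mb_s
        # The fill level cannot go below empty or above the buffer size.
        fill = min(max(fill, 0.0), buffer_mb)
        levels.append(fill)
    return levels


def seconds_until_full(incoming_mb_s: float, destage_mb_s: float,
                       buffer_mb: float) -> Optional[float]:
    """Time for an empty buffer to fill, or None if destaging keeps up."""
    net_fill_rate = incoming_mb_s - destage_mb_s
    if net_fill_rate <= 0:
        return None
    return buffer_mb / net_fill_rate


if __name__ == "__main__":
    # Hypothetical numbers: 400 MB/s of writes, 150 MB/s destage, 10 GB buffer.
    print(seconds_until_full(400, 150, 10_000))
    print(simulate_write_buffer(400, 150, 1_000, 5))
```

In this model, a sustained write rate above the destage rate always fills the buffer eventually; a larger buffer only delays the point at which writes have to be throttled.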
Other reasons for congestion could be related to faulty hardware, bad or<br />
misbehaving drivers/firmware or insufficient I/O controller queue depths.<br />
VMWARE STORAGE BU DOCUMENTATION / 207