09.03.2015 Views

VSAN-Troubleshooting-Reference-Manual


Diagnostics and <strong>Troubleshooting</strong> <strong>Reference</strong> <strong>Manual</strong> – Virtual SAN<br />

What is I/O per second (IOPS)?<br />

IOPS measures the number of input/output operations per second that a storage<br />

system performs. An I/O operation is typically a read or a write of a given size. I/O size<br />

can vary between a few bytes and several megabytes.<br />

If high IOPS is observed in Virtual SAN, it does not necessarily mean that we have a<br />

problem. It could simply mean that we are using the storage to its maximum. For<br />

example, a disk clone operation may try to use all available IOPS to complete the<br />

operation in the least possible time. Similarly, a low IOPS value does not mean that<br />

there is an immediate problem either; it could simply be that the I/O sizes are very<br />

large, and larger I/Os mean that fewer operations complete per second. Typically,<br />

larger numbers for IOPS are better.<br />

What is bandwidth?<br />

Bandwidth (or throughput) measures the data rate that a storage<br />

device is capable of, or to put it another way, how much data is transferred per unit of time.<br />

IOPS and bandwidth are related. When small I/O sizes (e.g. 4KB) are involved, a disk<br />

device may hit the maximum IOPS ceiling before exhausting the available bandwidth<br />

provided by the device, or controller, or the underlying physical link. Conversely,<br />

for large I/O sizes (e.g. 1MB) the bandwidth may become a limiting factor before the<br />

maximum IOPS of a device or controller is reached. Larger numbers for bandwidth<br />

are better.<br />
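
The relationship between IOPS, I/O size, and bandwidth described above can be sketched in a few lines of Python. This is only an illustration: the function names and the device ceilings (20,000 IOPS, 500 MB/s) are hypothetical values chosen for the example, not figures for any real device or for Virtual SAN.

```python
def throughput_mb_per_s(iops: float, io_size_kb: float) -> float:
    """Data rate implied by an IOPS figure at a given I/O size."""
    return iops * io_size_kb / 1024.0


def limiting_factor(io_size_kb: float, max_iops: float, max_mb_per_s: float) -> str:
    """Report which ceiling a device hits first at this I/O size."""
    # IOPS the device could sustain if bandwidth were the only limit.
    iops_at_full_bandwidth = max_mb_per_s * 1024.0 / io_size_kb
    return "iops" if max_iops < iops_at_full_bandwidth else "bandwidth"


# Hypothetical device: 20,000 IOPS ceiling, 500 MB/s bandwidth ceiling.
for size_kb in (4, 1024):
    print(size_kb, "KB ->", limiting_factor(size_kb, 20_000, 500))
```

With these assumed ceilings, 4KB I/O is limited by IOPS long before the bandwidth is exhausted, while 1MB I/O is limited by bandwidth.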

When troubleshooting storage performance, look at IOPS, I/O sizes, outstanding I/O,<br />

and latency to get a complete picture. Throughput and latency increase in an almost<br />

linear fashion as the size of your I/O increases, while at the same time the number<br />

of I/Os per second (IOPS) decreases.<br />
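
One way to see why these metrics have to be read together is Little's Law, a general queueing result that is not specific to Virtual SAN: at steady state, outstanding I/O equals IOPS multiplied by latency. The sketch below applies it under the assumption of a constant number of outstanding I/Os; the function name and the sample numbers are illustrative only.

```python
def iops_from_queue(outstanding_io: float, latency_ms: float) -> float:
    """Steady-state IOPS implied by Little's Law: OIO = IOPS * latency."""
    return outstanding_io * 1000.0 / latency_ms


# Assumed workload: 32 outstanding I/Os held constant.
# If latency rises from 2 ms to 8 ms, achievable IOPS falls by a factor of four.
print(iops_from_queue(32, 2))
print(iops_from_queue(32, 8))
```

A drop in IOPS on its own is therefore ambiguous: it may reflect higher latency, fewer outstanding I/Os, or simply larger I/O sizes.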

What is congestion?<br />

Congestion in Virtual SAN happens when the lower layers fail to keep up with the<br />

I/O rate of higher layers.<br />

For example, if virtual machines are performing a lot of write operations, the write<br />

buffers on the flash cache device can fill up. In hybrid configurations, these buffers<br />

have to be destaged to magnetic disks. However, this destaging can only proceed at<br />

the rate that the magnetic disks can handle, which is typically much slower than<br />

flash device performance. This can cause Virtual SAN to artificially introduce<br />

latency in the virtual machines in order to slow down writes to the flash device so<br />

that write buffers can be freed up.<br />
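
The write-buffer example above can be illustrated with a toy model. This is not Virtual SAN's actual congestion algorithm: the function, the 80% throttle threshold, and all the rates are assumptions made up for the sketch. The model represents the added latency as a reduced admission rate once the buffer is nearly full.

```python
def simulate_write_buffer(incoming_mb_s, destage_mb_s, buffer_mb, seconds,
                          throttle_at=0.8):
    """Return per-second (fill_mb, throttled) samples for a toy write buffer."""
    fill = 0.0
    samples = []
    for _ in range(seconds):
        # Hypothetical rule: start throttling when the buffer is 80% full.
        throttled = fill >= throttle_at * buffer_mb
        # When throttled, admit writes only as fast as they can be destaged.
        admitted = min(incoming_mb_s, destage_mb_s) if throttled else incoming_mb_s
        fill = min(buffer_mb, max(0.0, fill + admitted - destage_mb_s))
        samples.append((fill, throttled))
    return samples


# Assumed rates: VMs write 300 MB/s, magnetic disks destage 100 MB/s.
for fill, throttled in simulate_write_buffer(300, 100, 1000, 6):
    print(f"fill={fill:.0f} MB throttled={throttled}")
```

In this model the buffer fills at the difference between the incoming and destage rates until the threshold is reached, after which writes are held to the destage rate. When the incoming rate never exceeds the destage rate, the buffer stays empty and no throttling occurs.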

Other reasons for congestion could be related to faulty hardware, bad or<br />

misbehaving drivers/firmware or insufficient I/O controller queue depths.<br />

VMWARE STORAGE BU DOCUMENTATION / 207
