09.03.2015 Views

VSAN-Troubleshooting-Reference-Manual



Diagnostics and Troubleshooting Reference Manual – Virtual SAN

VM Accessibility: inaccessible vs. orphaned

A VM’s accessibility can also depend on which objects have been affected by a failure. These states are observed only if a VM’s objects have been set up to tolerate X failures in the cluster and the cluster suffers more than X failures.

If a VM’s state is reported as Inaccessible, at least one object of the VM is completely down (temporarily or permanently). Either there is no full mirror of the object (the failures have impacted both mirrors), or less than 50% of the components or votes are available (the failures have impacted a mirror and witnesses).
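
The accessibility rule described above can be illustrated with a small model. This is a minimal sketch, not Virtual SAN code: the `Component` class and `object_is_accessible` function are hypothetical names, and the sketch assumes that strictly more than 50% of the votes must be available, since the text does not say how exactly 50% is treated.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Component:
    """One component of an object (hypothetical model, not a Virtual SAN API)."""
    votes: int
    available: bool
    mirror: Optional[int] = None  # index of the mirror it belongs to; None for a witness


def object_is_accessible(components: List[Component]) -> bool:
    """True only if one mirror is fully intact AND more than 50% of votes are available."""
    mirrors = {c.mirror for c in components if c.mirror is not None}
    # A mirror counts as "full" when every component belonging to it is available.
    full_mirror = any(
        all(c.available for c in components if c.mirror == m) for m in mirrors
    )
    total_votes = sum(c.votes for c in components)
    live_votes = sum(c.votes for c in components if c.available)
    # Assumption: exactly 50% is treated as not enough.
    return full_mirror and 2 * live_votes > total_votes


# Two mirrors plus one witness, one vote each.
one_mirror_lost = [Component(1, True, 0), Component(1, False, 1), Component(1, True)]
mirror_and_witness_lost = [Component(1, True, 0), Component(1, False, 1), Component(1, False)]
print(object_is_accessible(one_mirror_lost))          # True
print(object_is_accessible(mirror_and_witness_lost))  # False
```

In the second case a full mirror still exists, but only one of three votes is available, so the object is treated as down.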

If a VM’s state is reported as Orphaned, it may mean that neither vCenter Server nor the ESXi host can monitor or track the VM, i.e. there is no read access to the VM’s “.vmx” file. From a Virtual SAN perspective, this could imply that the VM Home Namespace is currently down, meaning that there is no full mirror of the object, or that less than 50% of the components or votes that make up the VM Home Namespace are available. There are other reasons why a VM may be orphaned, but a problem with the VM Home Namespace could be one of them.
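
The relationship between object health and the reported VM state can be sketched as follows. This is a deliberate simplification under stated assumptions: `vm_state` is a hypothetical helper, it considers only the Virtual SAN causes described here, and a real VM may be orphaned for other reasons.

```python
from typing import Dict


def vm_state(home_namespace_ok: bool, other_objects_ok: Dict[str, bool]) -> str:
    """Map object accessibility to a likely VM state (illustrative only)."""
    if not home_namespace_ok:
        # The .vmx file lives in the VM Home Namespace, so the VM may appear orphaned.
        return "orphaned"
    if not all(other_objects_ok.values()):
        # At least one other object (for example a virtual disk) is down.
        return "inaccessible"
    return "accessible"


print(vm_state(True, {"disk-1": True, "disk-2": False}))  # inaccessible
print(vm_state(False, {"disk-1": True}))                  # orphaned
```

The object names `disk-1` and `disk-2` are placeholders chosen for the example.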

However, it should be understood that these states are transient, not permanent. As soon as the underlying issue has been rectified, and a full mirror copy and more than 50% of an object’s components or votes become available, the virtual machine automatically exits the inaccessible or orphaned state and becomes accessible once again.

Failure handling – Virtual SAN fail-safe mechanism

There have been occasions where the storage I/O controller does not tell the ESXi host that anything is wrong; instead, the controller may simply stall I/O and become ‘wedged’. There have also been cases where the controller does not handle the pulling of disk drives well. For these reasons, Virtual SAN has its own fail-safe mechanism for failure handling.

Virtual SAN handles this condition by placing its own timeout on in-flight I/O, which is somewhere between 20 and 30 seconds. If the timeout expires (because the controller is stalled), Virtual SAN marks the devices associated with the timed-out I/O as DEGRADED.
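
The timeout behaviour can be pictured with a short simulation. This is a sketch of the idea only, not the Virtual SAN implementation: `InFlightIO` and `degraded_devices` are hypothetical names, and the 30-second value is an assumption taken from the upper end of the range given above.

```python
from dataclasses import dataclass
from typing import List, Set

IO_TIMEOUT_SECONDS = 30.0  # assumed value; the text only gives a 20-30 second range


@dataclass
class InFlightIO:
    device: str       # device the I/O was destined for
    issued_at: float  # time the I/O was issued, in seconds


def degraded_devices(in_flight: List[InFlightIO], now: float,
                     timeout: float = IO_TIMEOUT_SECONDS) -> Set[str]:
    """Return the devices whose outstanding I/O has exceeded the timeout."""
    return {io.device for io in in_flight if now - io.issued_at > timeout}


ios = [InFlightIO("disk-a", 0.0), InFlightIO("disk-b", 25.0), InFlightIO("ssd-1", 1.0)]
print(sorted(degraded_devices(ios, now=40.0)))  # ['disk-a', 'ssd-1']
```

If the controller stays stalled, every device behind it eventually has an I/O that outlives the timeout, so every device behind that controller ends up in the returned set.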

If an I/O controller becomes “wedged” and the I/O was destined for a capacity layer device, Virtual SAN marks this disk as DEGRADED. If the I/O was destined for a flash layer device, Virtual SAN marks that device as DEGRADED. Over time, all of the devices sitting behind this controller will also be marked as DEGRADED.

VMware Storage BU Documentation / 60
