VSAN-Troubleshooting-Reference-Manual
Diagnostics and Troubleshooting Reference Manual – Virtual SAN
VM Accessibility: inaccessible vs. orphaned
A VM’s accessibility can also depend on which objects have been affected by a
failure. These states are only observed when a VM’s objects have been set up to
tolerate X failures in the cluster and the cluster suffers more than X
failures.
If a VM’s state is reported as Inaccessible, at least one object of the VM
is completely down (temporarily or permanently). Either there is no full mirror of
the object (the failures have impacted both mirrors), or less than 50% of the
components or votes are available (the failures have impacted a mirror and
witnesses).
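The rule above can be sketched as a small check. This is an illustrative model only, not Virtual SAN's implementation: the `Component` class, its fields, and the `object_accessible` function are hypothetical names, and a mirror is assumed to be intact only when every one of its components is available. The quorum test follows the "more than 50%" wording used elsewhere in this section.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Component:
    """One component of an object (hypothetical model, not a Virtual SAN API)."""
    mirror: Optional[str]  # name of the mirror it belongs to; None for a witness
    votes: int
    available: bool


def object_accessible(components: List[Component]) -> bool:
    """Return True if a full mirror exists AND more than 50% of votes are available."""
    mirrors = {c.mirror for c in components if c.mirror is not None}
    # Assumption: a mirror counts as "full" only if all of its components are up.
    has_full_mirror = any(
        all(c.available for c in components if c.mirror == m) for m in mirrors
    )
    total_votes = sum(c.votes for c in components)
    live_votes = sum(c.votes for c in components if c.available)
    has_quorum = 2 * live_votes > total_votes  # strictly more than 50%
    return has_full_mirror and has_quorum


# Two mirrors (A, B) plus one witness, one vote each.
one_mirror_lost = [
    Component("A", 1, True), Component("B", 1, False), Component(None, 1, True)
]
mirror_and_witness_lost = [
    Component("A", 1, True), Component("B", 1, False), Component(None, 1, False)
]
both_mirrors_lost = [
    Component("A", 1, False), Component("B", 1, False), Component(None, 1, True)
]

print(object_accessible(one_mirror_lost))          # True: full mirror, 2 of 3 votes
print(object_accessible(mirror_and_witness_lost))  # False: only 1 of 3 votes
print(object_accessible(both_mirrors_lost))        # False: no full mirror
```

The last two cases correspond to the two failure patterns named in the text: a mirror plus witnesses lost, and both mirrors lost.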
If a VM’s state is reported as Orphaned, it may mean that neither vCenter Server nor
the ESXi host can monitor or track the VM, i.e. there is no read access to the VM’s “.vmx”
file. From a Virtual SAN perspective, this could imply that the VM Home Namespace
is currently down, meaning that there is no full mirror of the object or less than 50%
of the components or votes that make up the VM Home Namespace are available.
There are other reasons why a VM may be orphaned, but a problem with the VM
Home Namespace could be one of them.
However, it should be understood that these states are transient, not permanent.
As soon as the underlying issue has been rectified and a full mirror copy
and more than 50% of an object’s components or votes become available, the virtual
machine automatically exits the inaccessible or orphaned state and becomes
accessible once again.
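As a rough illustration of why these states are transient, the sketch below derives a VM's state from the current accessibility of its objects each time it is called, so the state clears as soon as the objects recover. It is a deliberate simplification and not how vCenter Server or ESXi actually report state: the `vm_state` function and the object names are hypothetical, and it assumes that a down VM Home Namespace always shows up as orphaned, whereas the text only says it may.

```python
from typing import Dict

VM_HOME = "VM Home Namespace"


def vm_state(objects: Dict[str, bool]) -> str:
    """Map per-object accessibility (True = accessible) to a VM state.

    Simplifying assumption: a down VM Home Namespace is reported as
    'orphaned'; any other down object is reported as 'inaccessible'.
    """
    if not objects.get(VM_HOME, True):
        return "orphaned"
    if not all(objects.values()):
        return "inaccessible"
    return "accessible"


# Hypothetical VM with a home namespace and one virtual disk object.
objects = {VM_HOME: True, "VMDK": False}
print(vm_state(objects))   # inaccessible

objects["VMDK"] = True     # underlying failure has been rectified
print(vm_state(objects))   # accessible

objects[VM_HOME] = False   # .vmx can no longer be read
print(vm_state(objects))   # orphaned
```

No state is stored between calls, which mirrors the point that neither condition is permanent.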
Failure handling – Virtual SAN fail-safe mechanism
There have been occasions where the storage I/O controller does not tell the ESXi
host that anything is wrong; rather, the controller may simply stall I/O and become
‘wedged’. There have also been cases where pulling disk drives isn’t handled well by
the controller. For these reasons, Virtual SAN has its own fail-safe mechanism for
failure handling.
Virtual SAN handles this condition by placing its own timeout on in-flight I/O, which
is between 20 and 30 seconds. If the timeout expires (because the controller
is stalled), Virtual SAN marks the devices associated with the timed-out I/O as
DEGRADED.
If an I/O controller becomes “wedged” and the stalled I/O was destined for a capacity layer
device, Virtual SAN marks that disk as DEGRADED. If the I/O was destined for a flash layer
device, Virtual SAN marks that device as DEGRADED. Over time, all of the devices
sitting behind the controller will also be marked as DEGRADED.
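The timeout behaviour described above can be modelled roughly as follows. This is a sketch under stated assumptions, not Virtual SAN code: `InFlightIO`, `mark_degraded`, the device names, and the `HEALTHY` label are hypothetical, and the 20-second value is simply picked from the 20 to 30 second range given in the text.

```python
from dataclasses import dataclass
from typing import Dict, List

IO_TIMEOUT_SECONDS = 20.0  # assumption: the text only gives a 20-30 second range


@dataclass
class InFlightIO:
    """An I/O that has been issued but not yet completed (hypothetical model)."""
    device: str       # capacity or flash layer device the I/O was destined for
    issued_at: float  # time the I/O was issued, in seconds


def mark_degraded(
    in_flight: List[InFlightIO],
    now: float,
    health: Dict[str, str],
    timeout: float = IO_TIMEOUT_SECONDS,
) -> Dict[str, str]:
    """Return a copy of `health` with timed-out devices marked DEGRADED."""
    updated = dict(health)
    for io in in_flight:
        if now - io.issued_at > timeout:
            # Same outcome whether the device is in the capacity or flash layer.
            updated[io.device] = "DEGRADED"
    return updated


# Three devices behind one wedged controller; I/O is stalled on two of them.
health = {"flash0": "HEALTHY", "cap0": "HEALTHY", "cap1": "HEALTHY"}
stalled = [InFlightIO("cap0", issued_at=0.0), InFlightIO("flash0", issued_at=25.0)]

print(mark_degraded(stalled, now=30.0, health=health))
# {'flash0': 'HEALTHY', 'cap0': 'DEGRADED', 'cap1': 'HEALTHY'}
print(mark_degraded(stalled, now=50.0, health=health))
# {'flash0': 'DEGRADED', 'cap0': 'DEGRADED', 'cap1': 'HEALTHY'}
```

In this model `cap1` stays healthy only because no I/O has been sent to it yet; once I/O is issued to it and stalls, it is marked too, which is how every device behind the controller ends up DEGRADED over time.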
VMWARE STORAGE BU DOCUMENTATION / 60