VSAN-Troubleshooting-Reference-Manual


Diagnostics and Troubleshooting Reference Manual – Virtual SAN

2014-08-24T17:00:30.912Z cpu33:7027198)WARNING: LSOM: LSOMEventNotify:4570: VSAN device 52378176-a9da-7bce-0526-cdf1d863b3b5 is under permanent error.
2014-08-24T17:00:30.912Z cpu33:7027198)WARNING: LSOM: RCVmfsIoCompletion:99: Throttled: VMFS IO failed. Wake up 0x4136af9a69c0 with status Maximum kernel-level retries exceeded
2014-08-24T17:00:30.912Z cpu33:7027198)WARNING: LSOM: RCDrainAfterBERead:5070: Changing the status of child state from Success to Maximum kernel-level retries exceeded

Eventually, the firmware reset on the controller completed, but by this time it was too late: Virtual SAN had already marked the disks as failed.

2014-08-24T17:00:49.279Z cpu21:33542)megasas: FW now in Ready state
2014-08-24T17:00:49.299Z cpu21:33542)megasas:IOC Init cmd success
2014-08-24T17:00:49.320Z cpu36:33542)megaraid_sas: Reset successful.

When a controller ‘wedges’ like this, Virtual SAN will retry I/O for a finite amount of time. In this case, it took a full 24 seconds (24,000 ms) for the adapter to come back online after resetting. This was too long for Virtual SAN, and the maximum retries threshold was exceeded. This in turn led to Virtual SAN marking the disks as DEGRADED.

Virtual SAN is responding as designed here. The problem is that the firmware crashed. This particular issue was resolved by using the recommended versions of the MegaRAID driver and firmware as per the VMware Compatibility Guide.

Storage controller replacement

In general, a controller should be replaced with one of the same make and model. Administrators should not swap a pass-through controller for a RAID 0 controller, or vice versa.

Expectations when a drive is reporting errors

In this scenario, a disk drive is reporting errors due to bad blocks. If a read I/O accessing component data on behalf of a virtual machine fails in such a manner, Virtual SAN will check other replicas of the component to satisfy the read. If another mirror can satisfy the read, Virtual SAN will attempt to write the good data onto the disk that reported the bad read. If this procedure succeeds, the disk does not enter a DEGRADED state.

Note that this is not the behavior if we get a read error when accessing Virtual SAN metadata. If a read of metadata, or any write, fails, Virtual SAN will mark all components of the disk as DEGRADED. It treats the disk as failed and the data is no longer usable. Upon entering the DEGRADED state, Virtual SAN will restore I/O flow immediately (by taking the bad component out of the active set of the affected object) and try to re-protect the object by creating a new replica of the component somewhere else in the cluster.

VMWARE STORAGE BU DOCUMENTATION / 162
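The error-handling rules described under "Expectations when a drive is reporting errors" can be summarized as a small decision function. This is only an illustrative sketch of the behavior as stated in the text, not Virtual SAN code: the names IoKind, DiskOutcome and classify_io_error are hypothetical, and the NOT_STATED outcome marks cases that the text does not spell out (for example, a data read that no other replica can satisfy).

```python
from enum import Enum


class IoKind(Enum):
    """Kind of I/O that failed (hypothetical names for illustration)."""
    DATA_READ = "data_read"          # read of component data for a VM
    METADATA_READ = "metadata_read"  # read of Virtual SAN metadata
    WRITE = "write"                  # any write


class DiskOutcome(Enum):
    """Resulting disk state according to the rules in the text."""
    NOT_DEGRADED = "not_degraded"
    DEGRADED = "degraded"
    NOT_STATED = "not_stated"        # the text does not cover this case


def classify_io_error(kind, replica_satisfied_read=False, rewrite_succeeded=False):
    """Map a failed I/O to the disk outcome described in the manual text.

    replica_satisfied_read: another mirror could satisfy the failed read.
    rewrite_succeeded: writing the good data back to the bad disk worked.
    """
    # A failed metadata read or any failed write marks all components
    # of the disk as DEGRADED.
    if kind in (IoKind.METADATA_READ, IoKind.WRITE):
        return DiskOutcome.DEGRADED

    # A failed data read is repaired in place when another mirror has
    # good data and the rewrite onto the reporting disk succeeds.
    if replica_satisfied_read and rewrite_succeeded:
        return DiskOutcome.NOT_DEGRADED

    # Remaining data-read cases are not described explicitly in the text.
    return DiskOutcome.NOT_STATED


if __name__ == "__main__":
    print(classify_io_error(IoKind.DATA_READ, True, True).value)
    print(classify_io_error(IoKind.METADATA_READ).value)
    print(classify_io_error(IoKind.WRITE).value)
```

The function deliberately returns NOT_STATED rather than guessing for the branches the manual leaves open, so the sketch does not claim more than the text does.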
Blinking LEDs on drives

Virtual SAN 6.0 introduces new functionality that makes it easier to identify the location of disk drives within a datacenter. This functionality is available via the vSphere Web Client. Note that it is supported only with specific controller/enclosure combinations; refer to the VCG for details.

Navigate to the Disk Management section of Virtual SAN and select one of the disks in the disk group. Administrators can then turn the locator LED of that disk on and off using the LED icons in the Disk Management view.