10.07.2015 Views

Expert Oracle Exadata - Parent Directory

CHAPTER 9 RECOVERING EXADATA

CALIBRATE results are within an acceptable range.
CALIBRATE stress test is now running...
Calibration has finished.

Cell Flash Cache Failure

Exadata storage cells come equipped with four F20 PCIe flash cache cards. Each card has four flash cache disks (FDOMs) for a total of 16 flash disks. These flash cache cards occupy slots 1, 2, 3, and 5 inside the storage cell. If a flash cache module fails, performance of the storage cell will be degraded, and the module should be replaced at your earliest opportunity. If you are using some of your flash cache for flash disk-based grid disks, your disk group redundancy will be affected as well. These flash cache cards are not hot-pluggable, so replacing them will require you to power off the affected cell.

If a flash disk fails, Exadata will send you an email notifying you of the failure. The email will include the slot address of the card, and if a specific FDOM has failed it will include the address of the FDOM on the card (1, 2, 3, or 4). The failed flash cache card can be seen using the CellCLI command LIST PHYSICALDISK as follows:

CellCLI> list physicaldisk where disktype=flashdisk and status!=normal detail
         name:                   [4:0:3:0]
         diskType:               FlashDisk
         luns:                   2_3
         makeModel:              "MARVELL SD88SA02"
         physicalFirmware:       D20Y
         physicalInsertTime:     2010-07-28T20:09:43-05:00
         physicalInterface:      sas
         physicalSerial:         5080020000c7d60FMOD3
         physicalSize:           22.8880615234375G
         slotNumber:             "PCI Slot: 2; FDOM: 3"
         status:                 critical

The slotNumber attribute here shows you where the card and FDOM are installed. In our case, the card is installed in PCIe slot 2. Once you have this information, you can shut down and power off the storage cell and replace the defective part. Keep in mind that when the cell is offline, ASM will no longer have access to the grid disks. So before you shut down the cell, make sure that shutting it down will not impact the availability of the disk groups it supports. This is the same procedure we described in the “Cell Disk Failure” section of this chapter. Once the part is replaced and the cell reboots, the storage cell will automatically configure the cell disk on the replacement card and, if it was used for flash cache, you will see your flash cache return to its former size.

Cell Failure

There are two main types of cell failure: temporary and permanent. Temporary cell failures can be as harmless as a cell reboot or a power failure. Extended cell failures can also be temporary in nature. For example, if a patch installation fails or a component must be replaced, it could take the cell offline for hours or even days. Permanent cell failures are more severe in nature and require the entire cell chassis to be replaced. In either case, if your system is configured properly, there will be no interruption to ASM or your databases. In this section we’ll take a look at what happens when a cell is temporarily offline, and what to do if you ever have to replace one.
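
As a side note on the slotNumber attribute discussed under Cell Flash Cache Failure: if you collect LIST PHYSICALDISK output from several cells, it can be handy to pull the PCI slot and FDOM number out programmatically. The following Python sketch is only an illustration. It assumes the output has the "attribute: value" layout shown in the listing in that section, and the names parse_detail, parse_slot_number, and FlashLocation are hypothetical helpers, not part of CellCLI or any Oracle tool.

```python
import re
from typing import Dict, NamedTuple, Optional


class FlashLocation(NamedTuple):
    """Hypothetical container for where a flash disk sits in the cell."""
    pci_slot: int
    fdom: int


# Matches values such as: PCI Slot: 2; FDOM: 3
# (format assumed from the listing in this chapter)
_SLOT_RE = re.compile(r"PCI Slot:\s*(\d+);\s*FDOM:\s*(\d+)")


def parse_detail(text: str) -> Dict[str, str]:
    """Turn one 'attribute: value' record into a dict.

    Only the first colon on each line is treated as the separator, so
    values that contain colons (timestamps, [4:0:3:0]) stay intact.
    """
    attrs: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon, such as the prompt-free blanks
            attrs[key.strip()] = value.strip().strip('"')
    return attrs


def parse_slot_number(value: str) -> Optional[FlashLocation]:
    """Extract PCI slot and FDOM from a slotNumber value, or None if absent."""
    match = _SLOT_RE.search(value)
    if match is None:
        return None
    return FlashLocation(int(match.group(1)), int(match.group(2)))


# Sample record copied from the listing in this chapter.
SAMPLE = """\
name: [4:0:3:0]
diskType: FlashDisk
luns: 2_3
makeModel: "MARVELL SD88SA02"
physicalSerial: 5080020000c7d60FMOD3
slotNumber: "PCI Slot: 2; FDOM: 3"
status: critical
"""

if __name__ == "__main__":
    record = parse_detail(SAMPLE)
    location = parse_slot_number(record["slotNumber"])
    print(record["name"], record["status"], location)
```

The parser deliberately returns None rather than raising when the slotNumber value does not match, since the format on other hardware generations may differ from the one assumed here.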
