10.07.2015 Views

Expert Oracle Exadata - Parent Directory


CHAPTER 9 RECOVERING EXADATA

-rw-r----- 1 root root  3555 Jan 21 21:07 checkconfigs.log
-rw-r----- 1 root root  1612 Dec 31 12:37 checkdeveachboot.log
-rw-r----- 1 root root   257 Dec 17 11:41 createcell.log
-rw-r----- 1 root root  1233 Dec 31 12:37 ipmisettings.log
-rw-r----- 1 root root  3788 Dec 31 12:37 misceachboot.log
-rw-r----- 1 root root 13300 Dec 17 11:26 misczeroboot.log
-rw-r----- 1 root root   228 Dec 31 12:37 oswatcher.log
-rw-r----- 1 root root 34453 Dec 17 11:26 postinstall.log
-rw-r----- 1 root root  4132 Dec 31 12:38 saveconfig.log
-rw-r----- 1 root root    65 Jan 21 12:49 sosreport.log
-rw-r----- 1 root root    75 Dec 17 11:26 syscheck.log
-rw-r----- 1 root root 70873 Dec 17 11:44 upgradecbusb.log

As discussed in the “Active and Inactive System Volumes” section of this chapter, Exadata storage cells maintain two sets of system volumes, which contain the Linux operating system and storage cell software. By maintaining separate Active and Inactive system images, Exadata ensures that failed out-of-partition upgrades do not cause an outage to the databases. If the validation tests detect a problem after an out-of-partition upgrade, Exadata will automatically fail back to the last good configuration by switching the Active and Inactive system volumes. For in-partition patches, Exadata will reapply all the settings and files changed by the patch from online backups. Following the first boot-up after installing a patch or upgrade the validation results can be found in the log file, /var/log/cellos/vldrun.first_boot.log. Validation tests will be logged to the /var/log/cellos/validations.log file for all subsequent reboots.
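
As a rough illustration of where to look after a patch, the following Python sketch picks the validation log named above and prints its last few lines. Only the two log paths come from the text. The function names `validation_log_for` and `tail`, and the choice of 20 lines, are assumptions made for this example, and the script will likely need root privileges to read the files on a storage cell.

```python
from pathlib import Path

# Log locations as given in the text above.
FIRST_BOOT_LOG = Path("/var/log/cellos/vldrun.first_boot.log")
SUBSEQUENT_LOG = Path("/var/log/cellos/validations.log")


def validation_log_for(first_boot_after_patch: bool) -> Path:
    """Return the log that should hold validation results (hypothetical helper)."""
    return FIRST_BOOT_LOG if first_boot_after_patch else SUBSEQUENT_LOG


def tail(path: Path, n: int = 20) -> list[str]:
    """Return the last n lines of a log, or an empty list if it cannot be read."""
    try:
        return path.read_text(errors="replace").splitlines()[-n:]
    except OSError:
        # Missing file or insufficient permissions: report nothing rather than fail.
        return []


if __name__ == "__main__":
    for first_boot in (True, False):
        log = validation_log_for(first_boot)
        print(f"== {log} ==")
        for line in tail(log):
            print(line)
```

On a machine that is not a storage cell, both logs are simply absent and the script prints only the two headings.
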
The patch rollback procedure can be performed manually, but there is no mention of it in the documentation, so it is probably not something Oracle expects administrators to run without the help of Oracle Support.

Cell Disk Failure

ASM handles the temporary or permanent loss of a cell disk through its redundant failure group technology. So the loss of a cell disk should not cause any interruption to the databases as long as the disk group is defined with Normal redundancy. If High redundancy is used, the disk group can suffer the simultaneous loss of two cell disks within the same failure group. Recall that on Exadata, each storage cell constitutes a separate failure group. This means that with Normal redundancy, you can lose an entire storage cell (12 cell disks) without impact to your databases. With High redundancy you can lose two storage cells simultaneously and your databases will continue to service your clients without interruption. That’s pretty impressive. Redundancy isn’t cheap, though. For example, consider a disk group with 30 terabytes of usable space (configured for External redundancy). With Normal redundancy, that 30 terabytes becomes 15 terabytes of usable space. With High redundancy, it becomes 10 terabytes of usable storage. Also keep in mind that the database will always read the primary copy of your data unless it is unavailable. Normal and High redundancy provide no performance benefits. They are used strictly for fault tolerance. The key is to choose a redundancy level that strikes a balance between resiliency and budget.

Simulated Disk Failure

In this section we’re going to test what happens when a cell disk fails. The system used for these tests was a quarter rack, Exadata V2. We’ve created a disk group called SCRATCH_DG, defined as follows:
