Problem Solving and Troubleshooting in AIX 5L - IBM Redbooks


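| Symptom | Possible cause | Refer to |
|---|---|---|
| Core dump error log entries | Software failure | Section 2.4, “Finding a core dump” on page 34 |
| Machine fails to boot | Hardware or software | Section 1.2.1, “Boot path flowchart” on page 10 |
| Corrupt boot list | Corrupt boot list, erased boot list, or boot list naming a non-existent device | Section 3.3.1, “Failure to locate a boot image” on page 49 |
| Three-digit LED code starting with 0c | System dump in progress | Section 4.4.2, “Dump status codes” on page 85 |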

1.3 Avoiding problems

The Reliability, Availability, and Serviceability (RAS) features of AIX are
designed to fulfill many functions. Some, such as the diagnostics subsystem,
help you determine the cause of a problem once it has occurred. Many others
are designed to provide information on potential problems before they occur.
By default, IBM eServer pSeries and RS/6000 systems are configured to run
periodic automatic diagnostic checks. Any errors or warnings reported by
these checks appear in the system error log.

Good system administration means not only fixing problems when they occur, but
also managing a system in a way that minimizes the chance of a problem
affecting the users of the system.

Periodic system maintenance can help reduce the number of problems
experienced on a machine. Simple tasks, such as cleaning the tape drive as
required, can prevent tape errors. Examining the system error log on a regular
basis can help you spot a potential problem while the related error log entries
are still warnings rather than errors.
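One way to make the regular review of the error log quicker is to tally the entries by type. The following is a minimal sketch, not a definitive tool: it assumes the errpt summary report uses the usual column order (IDENTIFIER, TIMESTAMP, T, C, RESOURCE_NAME, DESCRIPTION) with a one-letter type code in the third column, and count_errpt_types is a hypothetical helper name, not an AIX command.

```shell
#!/bin/sh
# Tally error log entries by type from an errpt summary report read on stdin.
# Assumption: the third column (T) holds a one-letter error type code.
# count_errpt_types is a hypothetical helper, not part of AIX.
count_errpt_types() {
    awk '$1 != "IDENTIFIER" && NF >= 4 { count[$3]++ }
         END { for (t in count) print t, count[t] }' | sort
}

# Typical use on an AIX system:
#   errpt | count_errpt_types
```

A type count that rises from one check to the next is a hint to look at the corresponding entries more closely.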

1.3.1 System health check

This section lists some simple commands that can be run on a regular
basis to monitor a system. They will help you to become aware of how the
system is operating.

► Use the errpt command to look at a summary error log report. Be on the
lookout for recent additions to the log. Use the errpt -a command to examine
any suspicious error log entries in detail.
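To concentrate on recent additions, the report can be limited to entries logged since the start of the current day. The sketch below assumes the -s flag of errpt takes a start time in MMDDhhmmYY form; today_start is a hypothetical helper name introduced here for illustration. On a system without errpt, the script only prints the command it would have run.

```shell
#!/bin/sh
# Build a start-of-day timestamp (midnight today) in MMDDhhmmYY form.
# today_start is a hypothetical helper, not part of AIX.
today_start() {
    date +%m%d0000%y
}

start=$(today_start)

if command -v errpt >/dev/null 2>&1; then
    errpt -s "$start"       # summary of entries logged since midnight
    errpt -a -s "$start"    # detailed report for the same period
else
    echo "errpt not found; on AIX run: errpt -s $start"
fi
```

Running the summary first keeps the output short; the detailed report is only worth reading when the summary shows something unexpected.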

Chapter 1. Problem determination introduction 17
