02.12.2012 Views

OpenVMS Cluster Systems - OpenVMS Systems - HP


Cluster Troubleshooting

C.11 Analyzing Error-Log Entries for Port Devices

Entry Description

(3) The next two lines contain the entry type, the processor type (KA780), and the computer’s SCS node name.

(4) This line shows the name of the subsystem and the device that caused the entry and the reason for the entry. The CI subsystem’s device PAA0 on MARS was powered down.

The next 15 lines contain the names of hardware registers in the port, their contents, and interpretations of those contents. See the appropriate CI hardware manual for a description of all the CI port registers.

(5) The UCB$B_ERTCNT field contains the number of reinitializations that the port driver can still attempt. The difference between this value and UCB$B_ERTMAX is the number of reinitializations already attempted.

(6) The UCB$B_ERTMAX field contains the maximum number of times the port can be reinitialized by the port driver.

(7) The UCB$W_ERRCNT field contains the total number of errors that have occurred on this port since it was booted. This total includes both errors that caused reinitialization of the port and errors that did not.
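The relationship between the two retry fields is simple subtraction. The following Python sketch only illustrates that arithmetic; it is not OpenVMS code, and the function name and the sample values are hypothetical.

```python
def reinit_attempts_used(ertmax: int, ertcnt: int) -> int:
    """Return the number of reinitializations already attempted.

    ertmax: value of UCB$B_ERTMAX (maximum reinitializations allowed)
    ertcnt: value of UCB$B_ERTCNT (reinitializations still available)
    """
    if not 0 <= ertcnt <= ertmax:
        raise ValueError("expected 0 <= ERTCNT <= ERTMAX")
    return ertmax - ertcnt


# Hypothetical values read from an error-log entry:
# a limit of 50 with 47 attempts remaining means 3 have been used.
print(reinit_attempts_used(50, 47))  # 3
```

A result that approaches UCB$B_ERTMAX suggests the port is close to exhausting its reinitialization attempts.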

C.11.4 Error Recovery

The CI port can recover from many errors, but not all. When an error occurs from which the CI cannot recover, the following process occurs:

Step Action

1 The port notifies the port driver.

2 The port driver logs the error and attempts to reinitialize the port.

3 If the port fails after 50 such initialization attempts, the driver takes it off line, unless the system disk is connected to the failing port or unless this computer is supposed to be a cluster member.

4 If the CI port is required for system disk access or cluster participation and all 50 reinitialization attempts have been used, then the computer bugchecks with a CIPORT-type bugcheck.

Once a CI port is off line, you can put the port back on line only by rebooting the computer.
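The recovery steps above can be sketched as a small state model. This Python sketch is a simplified illustration, not the port driver's actual logic: the class, function, and status names are hypothetical, and it assumes that each reinitialization attempt brings the port back until the limit of 50 is exhausted.

```python
from dataclasses import dataclass

ERTMAX = 50  # reinitialization limit stated in step 3


@dataclass
class PortState:
    """Hypothetical model of the retry bookkeeping for one CI port."""
    retries_left: int = ERTMAX   # plays the role of UCB$B_ERTCNT
    errors_logged: int = 0       # errors logged by the driver in this model
    status: str = "online"


def handle_unrecoverable_error(port: PortState, port_required: bool) -> str:
    """Apply steps 1 through 4 for one error the port cannot recover from.

    port_required is True when the system disk is reached through this
    port or the computer is supposed to be a cluster member.
    """
    port.errors_logged += 1            # step 2: the driver logs the error
    if port.retries_left > 0:
        port.retries_left -= 1         # step 2: one reinitialization attempt
        port.status = "online"         # assumption: the attempt succeeds
    elif port_required:
        port.status = "bugcheck"       # step 4: CIPORT-type bugcheck
    else:
        port.status = "offline"        # step 3: driver takes the port off line
    return port.status


port = PortState()
for _ in range(ERTMAX):
    handle_unrecoverable_error(port, port_required=False)
print(port.status, port.retries_left)                          # online 0
print(handle_unrecoverable_error(port, port_required=False))   # offline
```

In this model nothing returns an "offline" port to "online", which mirrors the statement that only a reboot puts the port back on line.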

C.11.5 LAN Device-Attention Entries

Example C–7 shows device-attention entries for the LAN. The left column gives the name of a device register or a memory location. The center column gives the value contained in that register or location, and the right column gives an interpretation of that value.

Example C–7 LAN Device-Attention Entry

************************* ENTRY 80. **************************** (1)
ERROR SEQUENCE 26. LOGGED ON: SID 08000000
DATE/TIME 15-JAN-1994 11:30:53.07 SYS_TYPE 01010000 (2)
DEVICE ATTENTION KA630 (3)
SCS NODE: PHOBOS
NI-SCS SUB-SYSTEM, PHOBOS$PEA0: (4)
FATAL ERROR DETECTED BY DATALINK (5)

(continued on next page)

