OpenVMS Cluster Systems - OpenVMS Systems - HP


OpenVMS Cluster Concepts
2.4 State Transitions

Table 2–3 Transitions Caused by Loss of a Cluster Member

Cause: Failure detection
Description: The duration of this phase depends on the cause of the failure and on how the failure is detected. During normal cluster operation, messages sent from one computer to another are acknowledged when received.

    IF: A message is not acknowledged within a period determined by OpenVMS Cluster communications software
    THEN: The repair attempt phase begins.

    IF: A cluster member is shut down or fails
    THEN: The operating system causes datagrams to be sent from the computer shutting down to the other members. These datagrams state the computer’s intention to sever communications and to stop sharing resources. The failure detection and repair attempt phases are bypassed, and the reconfiguration phase begins immediately.

Cause: Repair attempt
Description: If the virtual circuit to an OpenVMS Cluster member is broken, attempts are made to repair the path. Repair attempts continue for an interval specified by the PAPOLLINTERVAL system parameter. (System managers can adjust the value of this parameter to suit local conditions.) Thereafter, the path is considered irrevocably broken, and steps must be taken to reconfigure the OpenVMS Cluster system so that all computers can once again communicate with each other and so that computers that cannot communicate are removed from the OpenVMS Cluster.

Cause: Reconfiguration
Description: If a cluster member is shut down or fails, the cluster must be reconfigured. One of the remaining computers acts as coordinator and exchanges messages with all other cluster members to determine an optimal cluster configuration with the most members and the most votes. This phase, during which all user (application) activity is blocked, usually lasts less than 3 seconds, although the actual time depends on the configuration.
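The first three phases in Table 2–3 amount to a small decision rule: an unacknowledged message leads to a repair attempt before reconfiguration, while an announced shutdown goes straight to reconfiguration. The Python sketch below restates that rule. The enum and function names are hypothetical labels chosen for illustration, not OpenVMS interfaces, and the sketch assumes the repair attempt does not succeed.

```python
from __future__ import annotations

from enum import Enum


class Cause(Enum):
    """How the remaining members learn that a member is gone (illustrative labels)."""
    MESSAGE_NOT_ACKNOWLEDGED = "message not acknowledged in time"
    SHUTDOWN_ANNOUNCED = "member sent datagrams announcing its shutdown"


class Phase(Enum):
    FAILURE_DETECTION = "failure detection"
    REPAIR_ATTEMPT = "repair attempt"
    RECONFIGURATION = "reconfiguration"


def phases_before_recovery(cause: Cause) -> list[Phase]:
    """Return the Table 2-3 phases that run before recovery for the given cause."""
    if cause is Cause.SHUTDOWN_ANNOUNCED:
        # Failure detection and repair attempt are bypassed.
        return [Phase.RECONFIGURATION]
    # Assumption: the path is not repaired within the interval set by
    # PAPOLLINTERVAL, so the cluster goes on to reconfigure.
    return [
        Phase.FAILURE_DETECTION,
        Phase.REPAIR_ATTEMPT,
        Phase.RECONFIGURATION,
    ]


if __name__ == "__main__":
    for cause in Cause:
        print(cause.name, [p.value for p in phases_before_recovery(cause)])
```

Modeling the phases as an ordered list keeps the bypass case visible: the announced shutdown yields a list with a single entry.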
Table 2–3 (Cont.) Transitions Caused by Loss of a Cluster Member

Cause: OpenVMS Cluster system recovery
Description: Recovery includes the following stages, some of which can take place in parallel:

    Stage: I/O completion
    Action: When a computer is removed from the cluster, OpenVMS Cluster software ensures that all I/O operations that are started prior to the transition complete before I/O operations that are generated after the transition. This stage usually has little or no effect on applications.

    Stage: Lock database rebuild
    Action: Because the lock database is distributed among all members, some portion of the database might need rebuilding. A rebuild is performed as follows:

        WHEN: A computer leaves the OpenVMS Cluster
        THEN: A rebuild is always performed.

        WHEN: A computer is added to the OpenVMS Cluster
        THEN: A rebuild is performed when the LOCKDIRWT system parameter is greater than 1.

        Caution: Setting the LOCKDIRWT system parameter to different values on the same model or type of computer can cause the distributed lock manager to use the computer with the higher value. This could cause undue resource usage on that computer.

    Stage: Disk mount verification
    Action: This stage occurs only when the failure of a voting member causes quorum to be lost. To protect data integrity, all I/O activity is blocked until quorum is regained. Mount verification is the mechanism used to block I/O during this phase.

    Stage: Quorum disk votes validation
    Action: If, when a computer is removed, the remaining members can determine that it has shut down or failed, the votes contributed by the quorum disk are included without delay in quorum calculations that are performed by the remaining members.
    However, if the quorum watcher cannot determine that the computer has shut down or failed (for example, if a console halt, power failure, or communications failure has occurred), the votes are not included for a period (in seconds) equal to four times the value of the QDSKINTERVAL system parameter. This period is sufficient to determine that the failed computer is no longer using the quorum disk.

    Stage: Disk rebuild
    Action: If the transition is the result of a computer rebooting after a failure, the disks are marked as improperly dismounted.
    Reference: See Sections 6.5.5 and 6.5.6 for information about rebuilding disks.

Cause: Application recovery
Description: When you assess the effect of a state transition on application users, consider that the application recovery phase includes activities such as replaying a journal file, cleaning up recovery units, and users logging in again.

2.5 OpenVMS Cluster Membership

OpenVMS Cluster systems based on LAN use a cluster group number and a cluster password to allow multiple independent OpenVMS Cluster systems to coexist on the same extended LAN and to prevent accidental access to a cluster by unauthorized computers.
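Returning to the recovery stages in Table 2–3, two of them reduce to simple rules: when the lock database is rebuilt, and how long quorum disk votes are withheld. The Python sketch below restates those two rules. The function names and the example parameter value are assumptions made for illustration; they are not OpenVMS interfaces or recommended settings.

```python
def lock_rebuild_performed(member_left: bool, lockdirwt: int) -> bool:
    """Lock database rebuild rule from Table 2-3.

    member_left=True models a computer leaving the cluster;
    member_left=False models a computer being added.
    lockdirwt is the LOCKDIRWT system parameter value passed in by the caller.
    """
    if member_left:
        return True           # a rebuild is always performed
    return lockdirwt > 1      # on a join, only when LOCKDIRWT is greater than 1


def quorum_disk_vote_delay(failure_confirmed: bool, qdskinterval: int) -> int:
    """Seconds before quorum disk votes are included in quorum calculations."""
    if failure_confirmed:
        return 0              # votes are included without delay
    return 4 * qdskinterval   # four times QDSKINTERVAL, in seconds


if __name__ == "__main__":
    # Assumed QDSKINTERVAL of 3 seconds, chosen only to make the arithmetic concrete.
    print(quorum_disk_vote_delay(failure_confirmed=False, qdskinterval=3))  # 12
    print(lock_rebuild_performed(member_left=False, lockdirwt=1))           # False
```

With the assumed value, an unconfirmed failure (for example, a console halt) withholds the quorum disk votes for 4 × 3 = 12 seconds, whereas a confirmed shutdown withholds them for none.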