24.05.2014 Views

AIX Version 4.3 Differences Guide


6.10 System Resource Controller Subsystem Enhancements (4.3.2)

Two major enhancements have been introduced to the System Resource Controller (SRC) subsystem in AIX 4.3.2. They are aimed at increasing the reliability and scalability of both the subsystems that are controlled by the SRC and the SRC itself. The following sections explain these enhancements.

6.10.1 Recoverable SRC Daemon

The SRC is a subsystem controller that facilitates the management and control of complex subsystems. The SRC provides a single set of commands to start, stop, trace, refresh, and query the status of a subsystem. If a subsystem fails for any reason, the SRC can automatically restart it.
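As a rough illustration of that single command set, the sketch below maps each action named above to the AIX command that performs it for one subsystem. The helper name `src_command` is hypothetical and not part of AIX, and the function only prints the command line (a dry run) rather than executing it.

```shell
# Sketch: map an SRC action to the AIX command line that performs it.
# src_command is a hypothetical helper name; it prints the command
# instead of running it, so it can be tried on any system.
src_command() {
    action=$1       # one of: start, stop, trace, refresh, status
    subsystem=$2    # subsystem name as known to the SRC
    case "$action" in
        start)   echo "startsrc -s $subsystem" ;;
        stop)    echo "stopsrc -s $subsystem" ;;
        trace)   echo "traceson -s $subsystem" ;;
        refresh) echo "refresh -s $subsystem" ;;
        status)  echo "lssrc -s $subsystem" ;;
        *)       echo "unknown action: $action" >&2; return 1 ;;
    esac
}
```

For example, `src_command status sendmail` prints `lssrc -s sendmail`; the subsystem name here is only an illustrative choice.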

If the SRC itself fails for any reason, the init process restarts it because of its respawn entry in /etc/inittab, as shown in the following example:

srcmstr:2:respawn:/usr/sbin/srcmstr # System Resource Controller
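The entry above follows the colon-separated inittab layout of identifier, run level, action, and command. A minimal sketch for checking which action is configured for a given identifier might look like the following; the function name `inittab_action` is hypothetical, and the file path is passed in as an argument so the sketch can be tried against a copy of the file.

```shell
# Sketch: print the action field (third colon-separated field) of the
# inittab entry whose identifier matches $1. $2 is the path of an
# inittab-format file, normally /etc/inittab.
# inittab_action is a hypothetical helper name, not an AIX command.
inittab_action() {
    awk -F: -v id="$1" '$1 == id { print $3; exit }' "$2"
}
```

Running `inittab_action srcmstr /etc/inittab` on a system with the entry shown above should print `respawn`. On AIX, the `lsitab srcmstr` command displays the same entry directly.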

The respawned SRC is, however, unable to control or monitor the subsystems started by the previous instance of the SRC, because the init process inherited them when the original SRC terminated. As a result, the lssrc command shows such a subsystem as inoperative, even though it is still running. In addition, the startsrc command can be used to start a second instance of the subsystem, even though the subsystem definition explicitly forbids multiple instances.
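One way to spot this condition is to list the subsystems that the SRC reports as inoperative and then check each of them against the process table by hand. The sketch below assumes the usual `lssrc -a` layout of one header line followed by one row per subsystem, with the status in the last column; the function name `inoperative_subsystems` is hypothetical.

```shell
# Sketch: read lssrc -a style output on stdin and print the names of
# subsystems whose status column reads "inoperative".
# Assumed layout: a header line, then rows of
#   Subsystem  Group  [PID]  Status
# inoperative_subsystems is a hypothetical helper name.
inoperative_subsystems() {
    awk 'NR > 1 && $NF == "inoperative" { print $1 }'
}
```

A typical use would be `lssrc -a | inoperative_subsystems`, followed by a `ps -ef` check for the corresponding daemons. Any name that appears in both lists is a candidate for the stale state described above.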

The SRC in AIX 4.3.2 has been enhanced to allow a respawned srcmstr daemon to monitor and control the subsystems started by the previous instance of the daemon. This has been achieved through the following enhancements.

The SRC now keeps an external list of the subsystems under its control in the file /var/adm/SRC/active_list. This file is for use by the SRC system only, so its format is unpublished. A respawned srcmstr daemon reads the contents of this file to update its internal list of the currently running subsystems. This allows the lssrc command to correctly determine the status of the running subsystems, even though they were not necessarily started by the current instance of the srcmstr daemon.

A respawned srcmstr daemon uses a new kernel extension to register interest in the termination of certain processes. This allows a respawned srcmstr daemon to be informed of the termination of subsystems started by the previous instance of the daemon. A child process is created to communicate with the kernel extension. The child process in turn communicates with the srcmstr daemon. The presence of the child process, called srcd, indicates that the srcmstr daemon has been restarted.

# ps -ef | grep src
root 4650 1 0 Aug 21 - 0:00 /usr/sbin/srcmstr
root 24680 4650 0 Aug 21 - 0:00 srcd
root 25894 7030 2 10:03:38 pts/1 0:00 grep src
#
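The check shown in the ps output above can be scripted. The following is a minimal sketch, assuming the `ps -ef` column order of UID, PID, PPID, C, STIME, TTY, TIME, CMD and that both daemons appear without arguments, as in the listing; the function name `srcmstr_was_respawned` is hypothetical.

```shell
# Sketch: read ps -ef output on stdin and succeed (exit status 0) if a
# process named srcd has a srcmstr daemon as its parent.
# Assumed columns: UID PID PPID C STIME TTY TIME CMD, so field 2 is the
# PID, field 3 is the parent PID, and the last field is the command.
# srcmstr_was_respawned is a hypothetical helper name.
srcmstr_was_respawned() {
    awk '
        # Remember the PID of every srcmstr process.
        $NF == "/usr/sbin/srcmstr" { master[$2] = 1 }
        # Remember the parent PID of every srcd process.
        $NF == "srcd"              { parent[$3] = 1 }
        END {
            for (pid in parent)
                if (pid in master) found = 1
            exit(found ? 0 : 1)
        }
    '
}
```

It could be used as `ps -ef | srcmstr_was_respawned && echo "srcmstr has been restarted"`. Matching on the parent PID rather than on the name alone avoids a false positive from an unrelated process that happens to be called srcd.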

If a subsystem fails for any reason while under the control of a respawned srcmstr daemon, it will be restarted if the subsystem policy requires it. In this case, the exit code of the subsystem is not available to SRC due to the method used by the
