
Problem Determination Guide - Systems Group



information. It is critical to monitor RAS information as an indicator of system health. For more information about monitoring RAS, see 3.2, “Hardware monitor” on page 114 and “RAS events” on page 126.

► Diagnostic data: Contains the results from diagnostic tests on the hardware.

1.4.2 Service Node system processes

The following paragraphs describe Blue Gene/L system processes that run on the Service Node.

mmcs_db_server

The Midplane Management Control System (MMCS) server process is responsible for the management of blocks. Blocks are partitions (sets of compute and I/O nodes) of the Blue Gene/L in which jobs run. The mmcs_db_server process configures blocks at boot time and identifies what physical hardware should be used and in what configuration. It also polls the database for block actions and starts the boot processes.

ciodb

After the blocks are booted, ciodb manages the job launch to the block. It then handles passing back stdin, stdout, and stderr for each job. The ciodb daemon talks to the ciod process running on the I/O node.

idoproxydb

The idoproxydb daemon handles hardware-related communication through the Service Network. It communicates with the IDo chips located on the Service, Link, and Node Cards.

bglmaster

The bglmaster process is the parent process for the other three system processes. It starts all three of the main system processes (idoproxydb, mmcs_db_server, and ciodb) and restarts any of them that ends for any reason. It can also provide information about the latest status of the spawned processes.
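The start, restart, and status-reporting roles described above follow a common supervisor pattern, sketched below. This is an illustration only, not how bglmaster is implemented: the Supervisor class, its method names, and the commands passed to it are hypothetical.

```python
import subprocess
import sys


class Supervisor:
    """Toy parent process: starts children and restarts any that exit."""

    def __init__(self, commands):
        # commands maps a name to an argv list. These are hypothetical
        # stand-ins, not the real Blue Gene/L binaries.
        self.commands = dict(commands)
        self.children = {}
        self.restarts = {name: 0 for name in self.commands}

    def start_all(self):
        for name, argv in self.commands.items():
            self.children[name] = subprocess.Popen(argv)

    def check_once(self):
        """Restart every child that has exited; return the names restarted."""
        restarted = []
        for name, proc in self.children.items():
            if proc.poll() is not None:  # child has ended, for any reason
                self.children[name] = subprocess.Popen(self.commands[name])
                self.restarts[name] += 1
                restarted.append(name)
        return restarted

    def status(self):
        """Report whether each spawned child is currently running."""
        return {
            name: "running" if proc.poll() is None else "exited"
            for name, proc in self.children.items()
        }

    def stop_all(self):
        for proc in self.children.values():
            if proc.poll() is None:
                proc.terminate()
            proc.wait()
```

In use, the parent would call start_all once and then call check_once periodically. The status method corresponds to the ability to report the latest state of the spawned processes.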

Additional software

For the Service Node to function, additional software is required. For more information, see Unfolding the IBM eServer Blue Gene Solution, SG24-6686.

