25.06.2015 Views

Administering Platform LSF - SAS

Controlling mbatchd

Restarting mbatchd

When you reconfigure the cluster with the command badmin reconfig, mbatchd is not restarted. Only configuration files are reloaded.

If you add a host to a host group, or a host to a queue, the new host is not recognized by jobs that were submitted before you reconfigured. If you want the new host to be recognized, you must restart mbatchd.

Run badmin mbdrestart. LSF checks configuration files for errors and prints the results to stderr. If no errors are found, the following occurs:

◆ Configuration files are reloaded.
◆ mbatchd is restarted.
◆ Events in lsb.events are reread and replayed to recover the running state of the last mbatchd.

Whenever mbatchd is restarted, it is unavailable to service requests. In large clusters where there are many events in lsb.events, restarting mbatchd can take some time. To avoid replaying events in lsb.events, use the command badmin reconfig.
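The choice between the two commands can be sketched as a small shell helper. This is an illustration only: the function name pick_badmin_cmd and the change labels hosts and other are hypothetical and not part of LSF. Only badmin reconfig and badmin mbdrestart come from the text above, and the sketch encodes just the one case that text names.

```shell
#!/bin/sh
# Hypothetical helper (not part of LSF): print the badmin command suited
# to a kind of configuration change, following the rules described above.
#   hosts -> a host was added to a host group or a queue and jobs that
#            were already submitted must recognize it: restart mbatchd
#   other -> reloading the configuration files is assumed to be enough,
#            which also avoids replaying lsb.events
pick_badmin_cmd() {
    case "$1" in
        hosts) echo "badmin mbdrestart" ;;
        other) echo "badmin reconfig" ;;
        *)     echo "usage: pick_badmin_cmd hosts|other" >&2; return 2 ;;
    esac
}
```

For example, pick_badmin_cmd hosts prints the restart command, which an administrator can review before running it on a large cluster.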

Logging a comment when restarting mbatchd<br />

Use the -C option of badmin mbdrestart to log an administrator comment in lsb.events. For example:

% badmin mbdrestart -C "Configuration change"

The comment text Configuration change is recorded in lsb.events.

Use badmin hist or badmin mbdhist to display administrator comments for mbatchd restarts.
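A site that wants every restart to carry a comment could wrap the command, as in the following minimal sketch. The function name restart_with_comment and the RUN variable are assumptions made for illustration; the only LSF usage relied on is badmin mbdrestart -C as shown above. Setting RUN=echo prints the command instead of executing it.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of LSF): refuse to restart mbatchd
# unless a non-empty administrator comment is supplied.
# RUN is an optional command prefix; RUN=echo gives a dry run.
restart_with_comment() {
    reason="$1"
    if [ -z "$reason" ]; then
        echo "usage: restart_with_comment \"reason for restart\"" >&2
        return 2
    fi
    # -C logs the comment text in lsb.events
    ${RUN:-} badmin mbdrestart -C "$reason"
}
```

After a real restart, badmin hist or badmin mbdhist can be used to confirm that the comment was recorded.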

Shutting down mbatchd<br />

1 Run badmin hshutdown to shut down sbatchd on the master host. For example:

% badmin hshutdown hostD
Shut down slave batch daemon on .... done

2 Run badmin mbdrestart:

% badmin mbdrestart
Checking configuration files ...
No errors found.

This causes mbatchd and mbschd to exit. mbatchd cannot be restarted, because sbatchd is shut down. All LSF services are temporarily unavailable, but existing jobs are not affected. When mbatchd is later started by sbatchd, its previous status is restored from the event log file and job scheduling continues.
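The two steps above can be sketched as one shell function. This is a hedged sketch, not an LSF-supplied script: the function name shutdown_mbatchd and the RUN variable are hypothetical, and the master host name must be passed in by the caller. Only the two badmin commands are taken from the procedure.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of LSF) around the two-step procedure.
# RUN is an optional command prefix; RUN=echo prints the commands
# instead of running them.
shutdown_mbatchd() {
    master="$1"
    if [ -z "$master" ]; then
        echo "usage: shutdown_mbatchd MASTER_HOST" >&2
        return 2
    fi
    # Step 1: shut down sbatchd on the master host so that it cannot
    # start mbatchd again. Stop here if this step fails.
    ${RUN:-} badmin hshutdown "$master" || return 1
    # Step 2: mbatchd and mbschd exit; with sbatchd down they stay down.
    ${RUN:-} badmin mbdrestart
}
```

Running it first with RUN=echo, for example RUN=echo followed by shutdown_mbatchd hostD, shows the exact commands in order without touching the cluster. The order matters: if mbdrestart ran while sbatchd was still up on the master host, mbatchd would simply be restarted.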
