25.06.2015 Views

Administering Platform LSF - SAS


Job States

Viewing wait status and wait reason

Exited jobs

Post-execution states

Viewing post-execution states

You can switch (bswitch) or migrate (bmig) a chunk job member in WAIT state to another queue.

Use the bhist -l command to display jobs in WAIT status. Jobs are shown as Waiting ...

The bjobs -l command does not display a WAIT reason in the list of pending jobs.

See Chapter 24, “Chunk Job Dispatch” for more information about chunk jobs.

A job might terminate abnormally for various reasons. Job termination can happen from any state. An abnormally terminated job goes into EXIT state. The situations where a job terminates abnormally include:

◆ The job is cancelled by its owner or the LSF administrator while pending, or after being dispatched to a host.

◆ The job cannot be dispatched before it reaches its termination deadline, and is therefore aborted by LSF.

◆ The job fails to start successfully. For example, the user specifies the wrong executable when submitting the job.

◆ The job exits with a non-zero exit status.
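The conditions listed above can be summarized as a small decision function. The following Python sketch is illustrative only and is not part of LSF: the `JobState` enum, the `final_state` function, and its parameter names are hypothetical, and the assumption that a job that ends normally with exit status zero is reported as DONE comes from the general LSF job state model rather than from this passage.

```python
from enum import Enum


class JobState(Enum):
    """Final job states referred to in this section (names mirror the LSF state names)."""
    DONE = "DONE"
    EXIT = "EXIT"


def final_state(exit_status: int,
                cancelled: bool = False,
                missed_deadline: bool = False,
                failed_to_start: bool = False) -> JobState:
    """Hypothetical helper: map the abnormal-termination conditions to a final state.

    Any one of the listed conditions is enough to put the job in EXIT.
    """
    if cancelled or missed_deadline or failed_to_start:
        return JobState.EXIT
    if exit_status != 0:
        return JobState.EXIT
    # Assumption: a job that ran and returned zero is reported as DONE.
    return JobState.DONE


if __name__ == "__main__":
    print(final_state(0).value)
    print(final_state(1).value)
    print(final_state(0, cancelled=True).value)
```

The sketch treats the four conditions as independent flags because the text says termination can happen from any state; it does not model how LSF itself detects each condition.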

You can configure hosts so that LSF detects an abnormally high rate of job exit from a host. See “Handling Host-level Job Exceptions” on page 96 for more information.

Some jobs may not be considered complete until some post-job processing is performed. For example, a job may need to exit from a post-execution job script, clean up job files, or transfer job output after the job completes.

The DONE or EXIT job states do not indicate whether post-processing is complete, so jobs that depend on the post-processing may start prematurely. Use the post_done and post_err keywords on the bsub -w command to specify job dependency conditions for job post-processing. The corresponding job states POST_DONE and POST_ERR indicate the state of the post-processing.
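As a minimal sketch of how such a dependency might be assembled, the Python helper below builds a bsub argument list that uses the post_done or post_err keyword in the -w dependency expression. The helper names (`post_dependency`, `bsub_command`), the job ID 1234, and the script name `./next_step.sh` are assumptions made for illustration; the sketch only constructs the command and does not submit anything.

```python
import shlex
from typing import List


def post_dependency(job_id: int, on_error: bool = False) -> str:
    """Build a dependency expression on the post-processing of job_id.

    post_done(...) waits for post-processing to finish without errors,
    post_err(...) for post-processing that finished with errors.
    """
    keyword = "post_err" if on_error else "post_done"
    return f"{keyword}({job_id})"


def bsub_command(dependency: str, command: List[str]) -> List[str]:
    """Return the argument list for a bsub submission with a -w dependency."""
    return ["bsub", "-w", dependency, *command]


if __name__ == "__main__":
    # Hypothetical job ID and script name, for illustration only.
    cmd = bsub_command(post_dependency(1234), ["./next_step.sh"])
    # shlex.join quotes the dependency expression so the shell
    # does not interpret the parentheses.
    print(shlex.join(cmd))
```

Passing the expression as a single list element avoids shell quoting problems if the list is later handed to something like `subprocess.run`.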

After the job completes, you cannot perform any job control on the post-processing. Post-processing exit codes are not reported to LSF. The post-processing of a repetitive job cannot be longer than the repetition period.

Use the bhist command to display the POST_DONE and POST_ERR states.

The resource usage of post-processing is not included in the job resource usage.

See Chapter 28, “Pre-Execution and Post-Execution Commands” for more information.
