25.06.2015 Views

Administering Platform LSF - SAS


Submitting and Controlling Chunk Jobs

Controlling chunk jobs

Job controls affect the state of the members of a chunk job. You can perform the following actions on jobs in a chunk job:

Action (Command)        Job State  Effect on Job (State)
Suspend (bstop)         PEND       Removed from chunk (PSUSP)
Suspend (bstop)         RUN        All jobs in the chunk are suspended (NRUN -1, NSUSP +1)
Suspend (bstop)         USUSP      No change
Suspend (bstop)         WAIT       Removed from chunk (PSUSP)
Kill (bkill)            PEND       Removed from chunk (NJOBS -1, PEND -1)
Kill (bkill)            RUN        Job finishes, next job in the chunk starts if one exists (NJOBS -1, PEND -1)
Kill (bkill)            USUSP      Job finishes, next job in the chunk starts if one exists (NJOBS -1, PEND -1, SUSP -1, RUN +1)
Kill (bkill)            WAIT       Job finishes (NJOBS -1, PEND -1)
Resume (bresume)        USUSP      Entire chunk is resumed (RUN +1, USUSP -1)
Migrate (bmig)          WAIT       Removed from chunk
Switch queue (bswitch)  RUN        Job is removed from the chunk and switched; all other WAIT jobs are requeued to PEND
Switch queue (bswitch)  WAIT       Only the WAIT job is removed from the chunk and switched, and requeued to PEND
Checkpoint (bchkpnt)    RUN        Job is checkpointed normally
Modify (bmod)           PEND       Removed from the chunk to be scheduled later

Migrating jobs with bmig changes the dispatch sequence of the chunk job members. They are not redispatched in the order in which they were originally submitted.
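The control table above can be read as a lookup keyed by command and job state. The following Python sketch transcribes it that way; the dictionary layout and the names CHUNK_CONTROLS and chunk_control_effect are illustrative assumptions, not part of LSF. A result of None only means that the table lists no entry for that combination, not that LSF rejects the command.

```python
from typing import Dict, Optional, Tuple

Effect = Tuple[str, Dict[str, int]]

# (command, job state) -> (effect, counter deltas), copied from the table.
# Hypothetical structure for illustration; LSF does not expose this mapping.
CHUNK_CONTROLS: Dict[Tuple[str, str], Effect] = {
    ("bstop", "PEND"): ("Removed from chunk (PSUSP)", {}),
    ("bstop", "RUN"): ("All jobs in the chunk are suspended",
                       {"NRUN": -1, "NSUSP": +1}),
    ("bstop", "USUSP"): ("No change", {}),
    ("bstop", "WAIT"): ("Removed from chunk (PSUSP)", {}),
    ("bkill", "PEND"): ("Removed from chunk", {"NJOBS": -1, "PEND": -1}),
    ("bkill", "RUN"): ("Job finishes, next job in the chunk starts if one exists",
                       {"NJOBS": -1, "PEND": -1}),
    ("bkill", "USUSP"): ("Job finishes, next job in the chunk starts if one exists",
                         {"NJOBS": -1, "PEND": -1, "SUSP": -1, "RUN": +1}),
    ("bkill", "WAIT"): ("Job finishes", {"NJOBS": -1, "PEND": -1}),
    ("bresume", "USUSP"): ("Entire chunk is resumed", {"RUN": +1, "USUSP": -1}),
    ("bmig", "WAIT"): ("Removed from chunk", {}),
    ("bswitch", "RUN"): ("Job is removed from the chunk and switched; "
                         "all other WAIT jobs are requeued to PEND", {}),
    ("bswitch", "WAIT"): ("Only the WAIT job is removed from the chunk and "
                          "switched, and requeued to PEND", {}),
    ("bchkpnt", "RUN"): ("Job is checkpointed normally", {}),
    ("bmod", "PEND"): ("Removed from the chunk to be scheduled later", {}),
}


def chunk_control_effect(command: str, state: str) -> Optional[Effect]:
    """Return (effect, counter deltas), or None if the table has no such row."""
    return CHUNK_CONTROLS.get((command, state))
```

For example, chunk_control_effect("bstop", "USUSP") returns the "No change" row with an empty set of counter deltas.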

Rerunnable chunk jobs

If the execution host becomes unavailable, rerunnable chunk job members are removed from the queue and dispatched to a different execution host.

See Chapter 22, “Job Requeue and Job Rerun” for more information about rerunnable jobs.

Checkpointing chunk jobs

Only running chunk jobs can be checkpointed. If bchkpnt -k is used, the job is also killed after the checkpoint file has been created. If a chunk job in WAIT state is checkpointed, mbatchd rejects the checkpoint request.

See Chapter 23, “Job Checkpoint, Restart, and Migration” for more information about checkpointing jobs.
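The checkpointing rules for chunk job members can be summarized as a small decision function. This is a minimal Python sketch of the rules as stated here, not of how mbatchd is implemented; the function name, the kill_after parameter (standing in for the -k option), and the returned strings are assumptions made for illustration.

```python
def checkpoint_outcome(state: str, kill_after: bool = False) -> str:
    """Describe what a checkpoint request does to a chunk job member.

    state      -- job state of the member, such as "RUN" or "WAIT"
    kill_after -- True models bchkpnt -k (kill once the checkpoint exists)
    """
    if state == "RUN":
        # Only running chunk jobs can be checkpointed.
        if kill_after:
            return "checkpointed, then killed"
        return "checkpointed"
    if state == "WAIT":
        # The text states explicitly that mbatchd rejects this request.
        return "rejected by mbatchd"
    # Other states are not running, so no checkpoint is taken.
    return "not checkpointed"
```

Calling checkpoint_outcome("RUN", kill_after=True) models bchkpnt -k on the running member, while checkpoint_outcome("WAIT") models the rejected request.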
