25.06.2015 Views

Administering Platform LSF - SAS


Chapter 23

Job Checkpoint, Restart, and Migration

Checkpointing a Job

Prerequisites

Before LSF can checkpoint a job, the job must be made checkpointable. LSF provides automatic and manual controls to make jobs checkpointable and to checkpoint jobs. When working with checkpointable jobs, a checkpoint directory must always be specified. Optionally, a checkpoint period can be specified to enable periodic checkpointing.
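As a minimal sketch of how the checkpoint directory and optional period might be supplied at submission time, the Python helper below assembles a `bsub` command line using its `-k` option. The helper name `build_checkpointable_bsub`, the directory `/share/chkpnt`, and the job command `my_job` are hypothetical. The sketch assumes the period is given in minutes and that the directory path contains no spaces. It only builds the argument list and does not contact LSF.

```python
import shlex
from typing import List, Optional


def build_checkpointable_bsub(
    job_command: List[str],
    checkpoint_dir: str,
    period_minutes: Optional[int] = None,
) -> List[str]:
    """Return an argv list that submits a checkpointable job via `bsub -k`.

    Hypothetical helper for illustration; it does not run any command.
    """
    # A checkpoint directory is mandatory for checkpointable jobs.
    if not checkpoint_dir:
        raise ValueError("a checkpoint directory must always be specified")
    # The period is optional; when present it enables periodic checkpointing.
    if period_minutes is not None and period_minutes <= 0:
        raise ValueError("checkpoint period must be a positive number of minutes")

    # Assumption: -k takes one string holding the directory and, optionally,
    # the period, separated by a space.
    k_value = checkpoint_dir
    if period_minutes is not None:
        k_value = f"{checkpoint_dir} {period_minutes}"
    return ["bsub", "-k", k_value, *job_command]


if __name__ == "__main__":
    argv = build_checkpointable_bsub(["my_job"], "/share/chkpnt", 240)
    # Print a shell-quoted form of the command for inspection.
    print(shlex.join(argv))
```

Keeping the directory and period in one argument matches how the option is quoted on a shell command line. Check the `bsub` reference for your LSF version before relying on the exact syntax.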

When a job is checkpointed, LSF performs the following actions:

1 Stops the job if it is running

2 Creates the checkpoint file in the checkpoint directory

3 Restarts the job
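The ordering of these actions can be illustrated with a toy model. The sketch below is not LSF code: the `ToyJob` class, its fields, the `.chk` file name, and the pickle-based checkpoint file are hypothetical stand-ins that only mimic the stop, write, restart sequence listed above. It also assumes that a job is restarted only if it was running when the checkpoint began.

```python
import pickle
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import List


@dataclass
class ToyJob:
    """Hypothetical in-memory stand-in for a batch job."""
    job_id: int
    state: dict = field(default_factory=dict)
    running: bool = True
    log: List[str] = field(default_factory=list)


def checkpoint(job: ToyJob, checkpoint_dir: Path) -> Path:
    """Mimic the three checkpoint actions on a ToyJob and return the file path."""
    # 1. Stop the job if it is running.
    was_running = job.running
    if was_running:
        job.running = False
        job.log.append("stopped")

    # 2. Create the checkpoint file in the checkpoint directory.
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    path = checkpoint_dir / f"job_{job.job_id}.chk"
    with path.open("wb") as fh:
        pickle.dump(job.state, fh)
    job.log.append("checkpoint written")

    # 3. Restart the job (assumption: only if step 1 actually stopped it).
    if was_running:
        job.running = True
        job.log.append("restarted")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        job = ToyJob(job_id=101, state={"iteration": 42})
        saved = checkpoint(job, Path(tmp))
        print(saved.name, job.log)
```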

LSF can create a checkpoint for any eligible job. Review the discussion about “Approaches to Checkpointing” on page 309 to determine if your application and environment are suitable for checkpointing.

In this section

◆ “The Checkpoint Directory” on page 314

◆ “Making Jobs Checkpointable” on page 315

◆ “Manually Checkpointing Jobs” on page 316

◆ “Enabling Periodic Checkpointing” on page 317

◆ “Automatically Checkpointing Jobs” on page 318

Administering Platform LSF 313
