25.06.2015 Views

Administering Platform LSF - SAS


Allowing Jobs to Use Reserved Job Slots

In this scenario, assume the cluster consists of a 4-CPU multiprocessor host.

1 A sequential job (job1) with a run limit of 2 hours is submitted and starts at 8:00 am (figure a).

2 Shortly afterwards, a parallel job (job2) requiring all 4 CPUs is submitted. It cannot start right away because job1 is using one CPU, so it reserves the remaining 3 processors (figure b).

3 At 8:30 am, another parallel job (job3) is submitted, requiring only 2 processors and with a run limit of 1 hour. Since job2 cannot start until 10:00 am (when job1 finishes), its reserved processors can be backfilled by job3 (figure c). Job3 can therefore complete before job2's start time, making use of the idle processors.

4 Job3 will finish at 9:30 am and job1 at 10:00 am, allowing job2 to start shortly after 10:00 am.

In this example, if job3's run limit were 2 hours, it could run until 10:30 am, past job2's expected start at 10:00 am. It would therefore not be able to backfill job2's reserved slots, and would have to run after job2 finishes.
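The timing in this scenario can be checked with a short sketch. It is not LSF's scheduler, only the rule the example relies on: a job may backfill reserved slots if its run limit ends no later than the reserved job's expected start time and enough reserved slots are idle. The function and parameter names are hypothetical, and times are expressed in minutes since midnight.

```python
def can_backfill(now, run_limit, reserved_start, slots_needed, idle_slots):
    """Return True if a job submitted at `now` could backfill reserved slots.

    All times are minutes since midnight. This mirrors only the rule in the
    scenario above; it is an illustration, not LSF's scheduling logic.
    """
    finishes_in_time = now + run_limit <= reserved_start
    fits = slots_needed <= idle_slots
    return finishes_in_time and fits


# job1 started at 8:00 am with a 2-hour run limit, so job2 can start at 10:00 am.
job2_start = 8 * 60 + 120

# job3 is submitted at 8:30 am and needs 2 of the 3 reserved processors.
print(can_backfill(8 * 60 + 30, 60, job2_start, 2, 3))   # 1-hour run limit: True
print(can_backfill(8 * 60 + 30, 120, job2_start, 2, 3))  # 2-hour run limit: False
```

The second call corresponds to the 2-hour case described above: the job could run until 10:30 am, so it cannot use the slots reserved for job2.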

Limitations

◆ A job will not have an estimated start time immediately after mbatchd is reconfigured.

◆ Jobs in a backfill queue cannot be preempted (a job in a backfill queue might be running in a reserved job slot, and starting a new job in that slot might delay the start of the big parallel job):

❖ A backfill queue cannot be preemptable.

❖ A preemptive queue whose priority is higher than the backfill queue cannot preempt the jobs in the backfill queue.

Backfilling and job slot limits

A backfill job borrows a job slot that is already taken by another job. The backfill job will not run at the same time as the job that reserved the job slot first. Backfilling can take place even if the job slot limits for a host or processor have been reached. Backfilling cannot take place if the job slot limits for users or queues have been reached.
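The rule above can be written down as a small decision function. This is a sketch for illustration only: the limit labels ("host", "processor", "user", "queue") and the function name are hypothetical and do not correspond to LSF parameters.

```python
# Limit kinds that, per the text above, still block backfilling once reached.
BLOCKING_LIMITS = {"user", "queue"}


def backfill_permitted(limits_reached):
    """Return True if backfilling may still take place.

    `limits_reached` holds the kinds of job slot limit already at their
    maximum, drawn from {"host", "processor", "user", "queue"}.
    """
    return not (BLOCKING_LIMITS & set(limits_reached))


print(backfill_permitted({"host", "processor"}))  # True
print(backfill_permitted({"host", "user"}))       # False
```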

Configuring backfill scheduling

Backfill scheduling is enabled at the queue level. Only jobs in a backfill queue can backfill reserved job slots. If the backfill queue also allows processor reservation, then backfilling can occur among jobs within the same queue.

Configuring a backfill queue

To configure a backfill queue, define BACKFILL in lsb.queues.

Specify Y to enable backfilling. To disable backfilling, specify N or blank space.

Example

BACKFILL=Y

Enforcing run limits

Backfill scheduling works most efficiently when all the jobs in a cluster have a run limit specified at the job level (bsub -W). You can use the external submission executable, esub, to make sure that all users specify a job-level run limit.

Otherwise, you can specify ceiling and default run limits at the queue level (RUNLIMIT in lsb.queues).
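As a rough sketch of how an esub might enforce a job-level run limit, the script below rejects submissions that do not include one. It assumes details that are not stated in this section and should be verified against your LSF version: that the submission options arrive as name=value lines in the file named by the LSB_SUB_PARM_FILE environment variable, that the run limit appears there as LSB_SUB_RLIMIT_RUN, and that exiting with the value of LSB_SUB_ABORT_VALUE aborts the submission. The helper names are hypothetical.

```python
import os
import sys

# Assumed option name for the bsub -W run limit; check your LSF version.
RUN_LIMIT_KEY = "LSB_SUB_RLIMIT_RUN"


def parse_parm_text(text):
    """Parse name=value lines into a dict, stripping optional double quotes."""
    params = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            params[name.strip()] = value.strip().strip('"')
    return params


def has_run_limit(params):
    """Return True if a non-empty run limit was given at submission."""
    return bool(params.get(RUN_LIMIT_KEY))


def main():
    with open(os.environ["LSB_SUB_PARM_FILE"]) as f:
        params = parse_parm_text(f.read())
    if has_run_limit(params):
        return 0
    print("Please specify a run limit with bsub -W.", file=sys.stderr)
    # The fallback value 1 is arbitrary, used only if the variable is unset.
    return int(os.environ.get("LSB_SUB_ABORT_VALUE", "1"))


# Run the check only when LSF has provided a parameter file.
if __name__ == "__main__" and "LSB_SUB_PARM_FILE" in os.environ:
    sys.exit(main())
```

Keeping the parsing and the check in separate functions makes it possible to test the logic without an LSF installation.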
