25.06.2015 Views

Administering Platform LSF - SAS


Chapter 15<br />

Goal-Oriented SLA-Driven Scheduling<br />

How service classes perform goal-oriented scheduling<br />

Goal-oriented scheduling uses other, lower-level <strong>LSF</strong> policies such as<br />

queues and host partitions to satisfy the service-level goal that the service class<br />

expresses. The decisions of a service class are considered before any<br />

queue or host partition decisions. Limits are still enforced with respect to lower-level<br />

scheduling objects such as queues, hosts, and users.<br />

Optimum number of running jobs<br />

As jobs are submitted, <strong>LSF</strong> determines the optimum number of job slots (or<br />

concurrently running jobs) needed for the service class to meet its service-level<br />

goals. <strong>LSF</strong> schedules a number of jobs at least equal to the optimum number<br />

of slots calculated for the service class.<br />

<strong>LSF</strong> attempts to meet SLA goals in the most efficient way, using the optimum<br />

number of job slots so that other service classes or other types of work in the<br />

cluster can still progress. For example, in a service class that defines a deadline<br />

goal, <strong>LSF</strong> spreads the work over the entire time window for the goal<br />

instead of allocating as many slots as possible at the beginning to finish before<br />

the deadline, so that other work is not blocked.<br />
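To make the spreading behavior concrete, here is a back-of-the-envelope sketch. It is not the calculation LSF performs internally, which this section does not specify. It assumes that every job has the same run time, and it estimates the smallest number of slots that still finishes a backlog inside the remaining time window. The function name estimate_slots and its three parameters are hypothetical.

```shell
# Hypothetical estimate, NOT the formula LSF uses internally.
# Assumes every job runs for the same whole number of minutes.
# usage: estimate_slots <pending_jobs> <run_minutes> <minutes_left_in_window>
estimate_slots() {
    njobs=$1
    run=$2
    window=$3
    if [ "$run" -lt 1 ]; then
        echo "run time must be at least 1 minute" >&2
        return 1
    fi
    # Number of jobs one slot can run back to back before the deadline.
    per_slot=$(( window / run ))
    if [ "$per_slot" -lt 1 ]; then
        echo "a single job does not fit in the remaining window" >&2
        return 1
    fi
    # Ceiling division: fewest slots so that every job fits in the window.
    echo $(( (njobs + per_slot - 1) / per_slot ))
}
```

Under these assumptions, estimate_slots 100 15 300 prints 5: one hundred 15-minute jobs with 300 minutes left need only five slots, which leaves the rest of the cluster free for other work.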

Submitting jobs to a service class<br />

Use bsub -sla service_class_name to submit a job to a service class for<br />

SLA-driven scheduling.<br />

You submit jobs to a service class as you would to a queue, except that a<br />

service class is a higher-level scheduling policy that makes use of other, lower-level<br />

<strong>LSF</strong> policies like queues and host partitions to satisfy the service-level<br />

goal that the service class expresses.<br />

For example:<br />

% bsub -W 15 -sla Kyuquot sleep 100<br />

submits the UNIX command sleep together with its argument 100 as a job to<br />

the service class named Kyuquot.<br />

The service class name where the job is to run is configured in<br />

lsb.serviceclasses. If the SLA does not exist or the user is not a member<br />

of the service class, the job is rejected.<br />

Outside of the configured time windows, the SLA is not active, and <strong>LSF</strong><br />

schedules jobs without enforcing any service-level goals. Jobs will flow<br />

through queues following queue priorities even if they are submitted with<br />

-sla.<br />

Submit with run limit<br />

You should submit your jobs with a run time limit (-W option), or the queue should<br />

specify a run time limit (RUNLIMIT in the queue definition in lsb.queues). If you do<br />

not specify a run time limit, <strong>LSF</strong> automatically adjusts the optimum number of<br />

running jobs according to the observed run time of finished jobs.<br />

-sla and -g options<br />

You cannot use the -g option with -sla. A job can be attached to either a job<br />

group or a service class, but not both.<br />
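The two submission rules above can be checked before calling bsub. The following is a minimal sketch of such a pre-submission check. check_sla_opts is a hypothetical helper, not an LSF command, and it scans its arguments naively, so it also inspects the job command's own arguments.

```shell
# Hypothetical pre-submission check; check_sla_opts is not part of LSF.
# Pass it the same arguments you intend to give to bsub.
# Naive scan: the job command's own arguments are inspected as well.
check_sla_opts() {
    has_sla=0
    has_g=0
    has_w=0
    for arg in "$@"; do
        case $arg in
            -sla) has_sla=1 ;;
            -g)   has_g=1 ;;
            -W)   has_w=1 ;;
        esac
    done
    # A job can belong to a job group or a service class, not both.
    if [ "$has_sla" -eq 1 ] && [ "$has_g" -eq 1 ]; then
        echo "error: -g cannot be combined with -sla" >&2
        return 1
    fi
    # Without -W, the run limit comes from the queue or from observed run times.
    if [ "$has_sla" -eq 1 ] && [ "$has_w" -eq 0 ]; then
        echo "warning: no -W run limit given with -sla" >&2
    fi
    return 0
}
```

One way to use it is to guard the submission: check_sla_opts -W 15 -sla Kyuquot sleep 100 && bsub -W 15 -sla Kyuquot sleep 100. The missing -W case is only a warning because the queue may already define RUNLIMIT.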

