25.06.2015 Views

Administering Platform LSF - SAS


Suspending Conditions

LSF provides several ways to configure suspending conditions. At the host level, suspending conditions are configured as load thresholds. At the queue level, they are configured as load thresholds, with the STOP_COND parameter in the lsb.queues file, or both.

The load indices most commonly used for suspending conditions are the CPU run queue lengths (r15s, r1m, and r15m), paging rate (pg), and idle time (it). The available swap space (swp) and available temporary space (tmp) indices are also considered when suspending jobs.

To give priority to interactive users, set the suspending threshold on the it (idle time) load index to a non-zero value. Jobs are stopped when any user is active, and resumed when the host has been idle for the time given in the it scheduling condition.

To tune the suspending threshold for paging rate, you need to know the behaviour of your application. On an otherwise idle machine, check the paging rate using lsload, and then start your application. Watch the paging rate as the application runs. Subtracting the idle paging rate from the active paging rate gives the paging rate of your application. The suspending threshold should allow at least 1.5 times that amount. A job can be scheduled at any paging rate up to the scheduling threshold, so the suspending threshold should be at least the scheduling threshold plus 1.5 times the application paging rate. This prevents the system from scheduling a job and then immediately suspending it because of its own paging.
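As a worked illustration of this rule, the fragment below uses hypothetical lsload readings (not measurements from any real host) and an assumed scheduling threshold of 5. The comment lines carry the arithmetic, and the last line writes the result as a queue-level threshold in lsb.queues, with the scheduling threshold first and the suspending threshold second:

```
# Hypothetical readings, in pages per second:
#   idle paging rate                     =  2
#   paging rate with application running = 12
#   application paging rate = 12 - 2     = 10
# Assumed scheduling threshold           =  5
# Minimum suspending threshold = 5 + 1.5 * 10 = 20
pg = 5/20
```

Any suspending threshold of 20 or higher satisfies the rule for these assumed numbers; repeat the measurement with your own application before choosing values.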

The effective CPU run queue length condition should be configured like the paging rate. For CPU-intensive sequential jobs, the effective run queue length indices increase by approximately one for each job. For jobs that use more than one process, you should make some test runs to determine your job’s effect on the run queue length indices. Again, the suspending threshold should be equal to at least the scheduling threshold plus 1.5 times the load for one job.
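The same arithmetic can be sketched for the 1-minute run queue length. The scheduling threshold of 1.0 below is an assumption chosen for illustration, and the per-job load of 1.0 is the approximate figure given above for a CPU-intensive sequential job:

```
# Assumed scheduling threshold for r1m         = 1.0
# Load added by one CPU-intensive sequential job = about 1.0
# Minimum suspending threshold = 1.0 + 1.5 * 1.0 = 2.5
r1m = 1.0/2.5
```

For jobs that run more than one process, replace the per-job load with the figure observed in your own test runs.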

Configuring load thresholds at queue level<br />

Syntax

The queue definition (lsb.queues) can contain thresholds for 0 or more of the load indices. Any load index that does not have a configured threshold has no effect on job scheduling.

Each load index is configured on a separate line with the format:

load_index = loadSched/loadStop

Specify the name of the load index, for example r1m for the 1-minute CPU run queue length or pg for the paging rate. loadSched is the scheduling threshold for this load index. loadStop is the suspending threshold. The loadSched condition must be satisfied by a host before a job is dispatched to it and also before a job suspended on a host can be resumed. If the loadStop condition is satisfied, a job is suspended.
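A minimal sketch of a queue definition with two load thresholds might look like the following. The queue name and all threshold values are hypothetical and chosen only to show the layout, with each load index on its own line:

```
Begin Queue
# Hypothetical queue name, for illustration only
QUEUE_NAME = example_queue
# Dispatch or resume only while r1m is within 1.0; suspend above 2.5
r1m = 1.0/2.5
# Dispatch or resume only while pg is within 5; suspend above 20
pg = 5/20
End Queue
```

With these assumed values, load indices that are not listed, such as it or swp, would have no effect on scheduling for this queue.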

The loadSched and loadStop thresholds permit the specification of conditions using simple AND/OR logic. For example, the specification:
