25.06.2015 Views

Administering Platform LSF - SAS


Reserving Memory for Pending Parallel Jobs

By default, the rusage string reserves resources only for running jobs. Because resources are not reserved for pending jobs, some memory-intensive jobs could remain pending indefinitely: smaller jobs take the resources as soon as they are freed, before the larger jobs can start running. The more memory a job requires, the worse the problem is.

Memory reservation for pending jobs solves this problem by reserving memory as it becomes available, until the total required memory specified on the rusage string is accumulated and the job can start. Use memory reservation for pending jobs if memory-intensive jobs often compete for memory with smaller jobs in your cluster.
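The effect described above can be sketched with a toy model. This is not how the LSF scheduler is implemented; it only assumes a single host that releases a fixed amount of memory per scheduling tick, a steady supply of small jobs, and one large pending job. The function name, its parameters, and all numbers are hypothetical.

```python
def ticks_until_big_job_starts(big_mb, freed_per_tick_mb, small_job_mb,
                               reserve_for_pending, max_ticks=100):
    """Toy model: return the tick at which the big job starts, or None."""
    free_mb = 0       # memory currently unclaimed on the host
    reserved_mb = 0   # memory set aside for the pending big job
    for tick in range(1, max_ticks + 1):
        free_mb += freed_per_tick_mb  # a running job finishes and releases memory
        if reserve_for_pending:
            # Set freed memory aside for the big job as it becomes available.
            reserved_mb += free_mb
            free_mb = 0
            if reserved_mb >= big_mb:
                return tick
        else:
            if free_mb >= big_mb:
                return tick
            # Small jobs are always waiting and take whatever fits right away.
            free_mb -= (free_mb // small_job_mb) * small_job_mb
    return None


# Big job needs 2000 MB; 500 MB is freed per tick; small jobs need 100 MB each.
print(ticks_until_big_job_starts(2000, 500, 100, reserve_for_pending=False))
print(ticks_until_big_job_starts(2000, 500, 100, reserve_for_pending=True))
```

In this model the large job never starts without reservation, because the small jobs drain the freed memory every tick. With reservation it starts once four ticks' worth of freed memory has been accumulated.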

Unlike slot reservation, which only applies to parallel jobs, memory reservation applies to both sequential and parallel jobs.

Configuring memory reservation for pending parallel jobs<br />

lsb.queues

Use the RESOURCE_RESERVE parameter in lsb.queues to reserve host memory for pending jobs, as described in “Memory Reservation for Pending Jobs” on page 272.

Set the RESOURCE_RESERVE parameter in a queue defined in lsb.queues.

The RESOURCE_RESERVE parameter overrides the SLOT_RESERVE parameter. If both RESOURCE_RESERVE and SLOT_RESERVE are defined in the same queue, job slot reservation and memory reservation are both enabled, SLOT_RESERVE is ignored, and an error is displayed when the cluster is reconfigured. Backfill on memory may still take place.

The following queue enables both memory reservation and backfill in the same queue:

Begin Queue
QUEUE_NAME = reservation_backfill
DESCRIPTION = For resource reservation and backfill
PRIORITY = 40
RESOURCE_RESERVE = MAX_RESERVE_TIME[20]
BACKFILL = Y
End Queue

Enabling per-slot memory reservation<br />

By default, memory is reserved for parallel jobs on a per-host basis. For example, by default, the command:

% bsub -n 4 -R "rusage[mem=500]" -q reservation myjob

requires the job to reserve 500 MB on each host where the job runs.

To enable per-slot memory reservation, define RESOURCE_RESERVE_PER_SLOT=y in lsb.params. In this example, if per-slot reservation is enabled, the job must reserve 500 MB of memory for each job slot (4 * 500 MB = 2000 MB, about 2 GB) on the host in order to run.
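The arithmetic behind the two modes can be written out as a short sketch. The helper below is hypothetical and not part of LSF; it only restates the rule from the text, assuming the rusage mem value is in MB and that all of the job's slots land on one host.

```python
def reserved_mem_per_host_mb(rusage_mem_mb, slots_on_host, per_slot=False):
    """Memory (MB) one host must reserve for the job under each mode."""
    if per_slot:
        # RESOURCE_RESERVE_PER_SLOT=y: the rusage value applies to every slot.
        return rusage_mem_mb * slots_on_host
    # Default: the rusage value applies once per host, however many slots are used.
    return rusage_mem_mb


# bsub -n 4 -R "rusage[mem=500]" with all 4 slots on a single host
print(reserved_mem_per_host_mb(500, 4))                 # 500
print(reserved_mem_per_host_mb(500, 4, per_slot=True))  # 2000
```

If the four slots were spread over two hosts with two slots each, the same helper would give 500 MB per host by default and 1000 MB per host with per-slot reservation.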
