25.06.2015 Views

Administering Platform LSF - SAS

External Load Indices and ELIM<br />

The <strong>LSF</strong> Load Information Manager (LIM) collects built-in load indices that<br />

reflect the load situations of CPU, memory, disk space, I/O, and interactive<br />

activities on individual hosts.<br />

While built-in load indices might be sufficient for most jobs, you might have<br />

special workload or resource dependencies that require custom external load<br />

indices defined and configured by the <strong>LSF</strong> administrator. Load and shared<br />

resource information from external load indices is used in the same way as built-in<br />

load indices for job scheduling and host selection.<br />

You can write an External Load Information Manager (ELIM) program that<br />

collects the values of configured external load indices and updates LIM when<br />

new values are received.<br />

An ELIM can be as simple as a small script, or as complicated as a sophisticated<br />

C program. A well-defined protocol allows the ELIM to talk to LIM.<br />

The ELIM executable must be located in <strong>LSF</strong>_SERVERDIR.<br />
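
As an illustration of the "small script" case, the following is a minimal Python sketch of an ELIM. It assumes the commonly documented report format in which the ELIM periodically writes one line to standard output containing the number of indices followed by name and value pairs. The index names (`scratch_mb`, `licenses`), the placeholder values, and the 30-second interval are hypothetical and are not taken from this guide.

```python
import sys
import time


def format_report(indices):
    """Build one report line: '<count> <name1> <value1> <name2> <value2> ...'."""
    parts = [str(len(indices))]
    for name, value in indices.items():
        parts.append(name)
        parts.append(str(value))
    return " ".join(parts)


def collect_indices():
    # Hypothetical collector with placeholder values; a real ELIM would
    # measure each configured external load index here.
    return {"scratch_mb": 1024, "licenses": 3}


def run(interval_seconds=30, iterations=None, out=sys.stdout):
    """Write one report per interval; iterations=None means loop until killed."""
    count = 0
    while iterations is None or count < iterations:
        out.write(format_report(collect_indices()) + "\n")
        out.flush()  # the reader is on the other end of a pipe, so flush each line
        count += 1
        if iterations is None or count < iterations:
            time.sleep(interval_seconds)
```

A deployed script would end by calling `run()`, and it would report only index names that the administrator has actually configured as external load indices.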

◆ “How <strong>LSF</strong> supports multiple ELIMs” on page 158<br />

◆ “Configuring your application-specific SELIM” on page 159<br />

◆ “How <strong>LSF</strong> uses ELIM for external resource collection” on page 159<br />

◆ “Writing an ELIM” on page 160<br />

◆ “Debugging an ELIM” on page 162<br />

How <strong>LSF</strong> supports multiple ELIMs<br />

To increase LIM reliability, <strong>LSF</strong> Version 6.0 supports the configuration of<br />

multiple ELIM executables.<br />

Master ELIM (melim)<br />

A master ELIM (melim) is installed in <strong>LSF</strong>_SERVERDIR.<br />

melim manages multiple site-defined sub-ELIMs (SELIMs) and reports external<br />

load information to LIM. melim does the following:<br />

◆ Starts and stops SELIMs<br />

◆ Checks syntax of load information reporting on behalf of LIM<br />

◆ Collects load information reported from SELIMs<br />

◆ Merges latest valid load reports from each SELIM and sends merged load<br />

information back to LIM<br />

ELIM failure<br />

Multiple SELIMs managed by a master ELIM increase reliability by<br />

protecting LIM:<br />

◆ ELIM output is buffered<br />

◆ Incorrect resource format or values are checked by ELIM<br />

◆ SELIMs are independent of each other; one SELIM hanging while waiting<br />

for load information does not affect the other SELIMs<br />

Error logging<br />

melim logs its own activities and data into the log file<br />

<strong>LSF</strong>_LOGDIR/melim.log.host_name.<br />
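
The checking and merging behavior described in this section can be pictured with a short Python sketch. This is not the melim implementation. It only illustrates the idea of keeping the latest valid report from each SELIM and merging them, so that a malformed line from one SELIM never replaces good data or disturbs the others. It assumes a report line of the form `count name value ...` with numeric values only. The function names and the SELIM names in the usage note are hypothetical.

```python
def parse_report(line):
    """Return {name: value} for a well-formed report line, or None if malformed."""
    fields = line.split()
    if not fields:
        return None
    try:
        count = int(fields[0])
    except ValueError:
        return None
    # A valid line has exactly one name/value pair per declared index.
    if len(fields) != 1 + 2 * count:
        return None
    report = {}
    for i in range(count):
        name, raw = fields[1 + 2 * i], fields[2 + 2 * i]
        try:
            report[name] = float(raw)  # assumption: numeric indices only
        except ValueError:
            return None
    return report


def merge_reports(latest_valid, selim_name, line):
    """Record the line if it is valid, then merge the latest report of every SELIM."""
    report = parse_report(line)
    if report is not None:
        latest_valid[selim_name] = report  # invalid lines leave old data in place
    merged = {}
    for selim_report in latest_valid.values():
        merged.update(selim_report)
    return merged
```

For example, starting from an empty `latest_valid` dictionary, calling `merge_reports(latest_valid, "elim.a", "1 scratch_mb 512")` and then `merge_reports(latest_valid, "elim.b", "garbage")` still yields only the `scratch_mb` value, because the second line fails the syntax check and is ignored.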
