25.06.2015 Views

Administering Platform LSF - SAS


Chapter 37

Tuning the Cluster

Comparing LIM load thresholds

To tune LIM load thresholds, compare the output of lsload to the thresholds reported by lshosts -l.

The lsload and lsmon commands display an asterisk (*) next to each load index that exceeds its threshold.

For example, consider the following output from lshosts -l and lsload:

% lshosts -l
HOST_NAME: hostD
...
LOAD_THRESHOLDS:
  r15s   r1m  r15m    ut    pg    io    ls    it   tmp   swp   mem
     -   3.5     -     -    15     -     -     -     -    2M    1M

HOST_NAME: hostA
...
LOAD_THRESHOLDS:
  r15s   r1m  r15m    ut    pg    io    ls    it   tmp   swp   mem
     -   3.5     -     -    15     -     -     -     -    2M    1M

% lsload
HOST_NAME  status   r15s   r1m  r15m    ut     pg   ls   it   tmp   swp   mem
hostD      ok        0.0   0.0   0.0    0%    0.0    6    0   30M   32M   10M
hostA      busy      1.9   2.1   1.9   47%  *69.6   21    0   38M   96M   60M

In this example:

◆ hostD is ok.

◆ hostA is busy: the pg (paging rate) index is 69.6, above the threshold of 15.
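The comparison in this example can be sketched in a few lines of Python. This is only an illustration, not part of LSF: the function names are hypothetical, the column orders are copied from the sample output shown here, and the sketch assumes that it, tmp, swp, and mem are "decreasing" indices (a host is overloaded when the value drops below the threshold) while the other indices are overloaded when the value rises above it.

```python
# Column order under LOAD_THRESHOLDS in the sample `lshosts -l` output.
THRESHOLD_COLUMNS = ["r15s", "r1m", "r15m", "ut", "pg", "io", "ls", "it", "tmp", "swp", "mem"]
# Column order in the sample `lsload` output, after HOST_NAME and status.
LOAD_COLUMNS = ["r15s", "r1m", "r15m", "ut", "pg", "ls", "it", "tmp", "swp", "mem"]

# Assumption: these indices are overloaded when the value is BELOW the threshold.
DECREASING = {"it", "tmp", "swp", "mem"}

# Size suffixes, normalized to megabytes.
UNIT_SCALE = {"K": 1.0 / 1024, "M": 1.0, "G": 1024.0}


def parse_value(text):
    """Convert '3.5', '47%', '*69.6', '2M' or '-' to a float; '-' becomes None."""
    text = text.lstrip("*")          # drop the "exceeds threshold" marker
    if text == "-":
        return None                  # no threshold configured
    scale = 1.0
    if text[-1] in UNIT_SCALE:
        scale = UNIT_SCALE[text[-1]]
        text = text[:-1]
    elif text.endswith("%"):
        text = text[:-1]
    return float(text) * scale


def exceeded_indices(threshold_row, load_row):
    """Return the load index names whose values cross their thresholds."""
    thresholds = dict(zip(THRESHOLD_COLUMNS, map(parse_value, threshold_row.split())))
    fields = load_row.split()
    # fields[0] is the host name and fields[1] the status; the rest are load values.
    loads = dict(zip(LOAD_COLUMNS, map(parse_value, fields[2:])))
    exceeded = []
    for name, load in loads.items():
        limit = thresholds.get(name)
        if limit is None or load is None:
            continue
        crossed = load < limit if name in DECREASING else load > limit
        if crossed:
            exceeded.append(name)
    return exceeded


thresholds = "- 3.5 - - 15 - - - - 2M 1M"
print(exceeded_indices(thresholds, "hostD ok 0.0 0.0 0.0 0% 0.0 6 0 30M 32M 10M"))
print(exceeded_indices(thresholds, "hostA busy 1.9 2.1 1.9 47% *69.6 21 0 38M 96M 60M"))
```

For the sample rows, only pg on hostA crosses its threshold, which matches the asterisk that lsload prints.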

If LIM often reports a host as busy

If LIM often reports a host as busy when the CPU utilization and run queue lengths are relatively low and the system is responding quickly, the most likely cause is the paging rate threshold. Try raising the pg threshold.

Different operating systems assign subtly different meanings to the paging rate statistic, so the threshold needs to be set at different levels for different host types. In particular, HP-UX systems need to be configured with significantly higher pg values; try starting at a value of 50.

There is a point of diminishing returns. As the paging rate rises, eventually the system spends too much time waiting for pages and the CPU utilization decreases. Paging rate is the factor that most directly affects perceived interactive response. If a system is paging heavily, it feels very slow.

If interactive jobs slow down response

If you find that interactive jobs slow down system response too much while LIM still reports your host as ok, reduce the CPU run queue length thresholds (r15s, r1m, r15m). Likewise, increase the CPU run queue length thresholds if hosts become busy at low loads.
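The tuning rules in this section can be summarized as a small decision table. The Python sketch below is a hypothetical helper, not an LSF tool: the function name and its arguments are made up for illustration, and "responsive" is a human judgment about how the host feels to interactive users, which LSF does not report. It also assumes that "busy at low loads" means the host is busy and responsive while a run queue index is over its threshold.

```python
RUN_QUEUE_INDICES = {"r15s", "r1m", "r15m"}


def suggest_threshold_change(lim_status, responsive, exceeded):
    """Return a tuning hint for one observation of a host.

    lim_status: 'ok' or 'busy', as shown in the lsload status column.
    responsive: True if the host responds quickly to interactive users.
    exceeded:   names of the load indices currently marked with an asterisk.
    """
    exceeded = set(exceeded)
    if lim_status == "busy" and responsive:
        # The host is flagged although it still has headroom.
        if "pg" in exceeded:
            return "raise the pg threshold"
        if RUN_QUEUE_INDICES & exceeded:
            return "raise the run queue thresholds (r15s, r1m, r15m)"
    if lim_status == "ok" and not responsive:
        # The host accepts more load than interactive users tolerate.
        return "lower the run queue thresholds (r15s, r1m, r15m)"
    return "no change suggested"


print(suggest_threshold_change("busy", True, ["pg"]))
print(suggest_threshold_change("ok", False, []))
```

A single observation is not enough to justify a change; the text speaks of hosts that are often reported as busy, so a hint like this is best applied to a pattern seen over time.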
