23.07.2014 Views

Lustre 1.6 Operations Manual



How can I check if a filesystem is active (the MGS, MDT and OSTs are all online)?

On the MDS or on a client, look at /proc/fs/lustre/lov/*/target_obds, which marks each target as "ACTIVE" or "INACTIVE".
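A minimal sketch of that check, assuming each line of target_obds has the form "index: target_UUID STATE" (the exact layout may vary between Lustre versions). The function name list_inactive is hypothetical, not part of Lustre.

```shell
# list_inactive: read target_obds-style lines on stdin and print every
# non-empty line whose last field is not ACTIVE.
# Assumption: each line looks like "0: lustre-OST0000_UUID ACTIVE".
list_inactive() {
    awk 'NF > 0 && $NF != "ACTIVE"'
}

# Check every LOV on this node; skip quietly if no such file is readable here.
for f in /proc/fs/lustre/lov/*/target_obds; do
    [ -r "$f" ] || continue
    list_inactive < "$f"
done
```

No output means every target listed in those files reported ACTIVE.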

How do I reclaim the 5 percent of disk space reserved for root?

If your filesystem normally looks like this:

$ df -h /mnt/lustre
Filesystem  Size  Used  Avail  Use%  Mounted on
databarn    100G   81G    14G   81%  /mnt/lustre

You might be wondering: where did the other 5 percent go? This space is reserved for the root user.

Currently, all Lustre installations run the ext3 filesystem internally on service nodes. By default, ext3 reserves 5 percent of the disk for the root user.
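The gap can be read straight off the df figures quoted earlier: size minus used minus available. A small sketch of that arithmetic, using a hypothetical helper name (reserved_gb) and whole-gigabyte values:

```shell
# reserved_gb SIZE USED AVAIL: print the space (in the same unit) that df
# reports as neither used nor available. Helper name is illustrative only.
reserved_gb() {
    echo $(( $1 - $2 - $3 ))
}

reserved_gb 100 81 14   # prints 5, i.e. 5 percent of the 100G filesystem
```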

To reclaim this space for use by all users, run this command on your OSSs:

tune2fs [-m reserved_blocks_percent] [device]

This command takes effect immediately. You do not need to shut down Lustre beforehand or restart Lustre afterwards.
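As an illustration, the sketch below only prints the tune2fs invocation for a given device so it can be reviewed before being run as root. The helper name print_reclaim_cmd and the device /dev/sdX are placeholders, not values from this manual; substitute the actual OST block device. Defaulting to 0 percent is an assumption about what "reclaim" should mean here.

```shell
# print_reclaim_cmd DEVICE [PERCENT]: print (do not run) the tune2fs command
# that sets the reserved-blocks percentage. PERCENT defaults to 0.
# Both the helper name and /dev/sdX below are hypothetical placeholders.
print_reclaim_cmd() {
    printf 'tune2fs -m %s %s\n' "${2:-0}" "$1"
}

print_reclaim_cmd /dev/sdX      # prints: tune2fs -m 0 /dev/sdX
print_reclaim_cmd /dev/sdX 1    # prints: tune2fs -m 1 /dev/sdX
```

Afterwards, tune2fs -l on the same device lists the superblock fields, including the reserved block count, which can be used to confirm the change.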

Why are applications hanging?

The most common cause of hung applications is a timeout. For a timeout involving an MDS or failover OST, applications attempting to access the disconnected resource wait until the connection is re-established.

In most cases, applications can be interrupted after a timeout with the KILL, INT, TERM, QUIT, or ALRM signals. In some cases, when a command communicates with multiple services in a single system call, you may have to wait for multiple timeouts.
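A rough sketch of interrupting a hung process from the shell. The escalation order (INT, then TERM, then KILL) and the one-second pause are choices made for this example, not something the manual prescribes, and interrupt_pid is a hypothetical helper name. A process blocked on a Lustre timeout may not react until that timeout expires.

```shell
# interrupt_pid PID: send INT, then TERM, then KILL, pausing one second after
# each signal. Returns 0 once the process is gone, 1 if it is still present.
interrupt_pid() {
    pid=$1
    for sig in INT TERM KILL; do
        kill -s "$sig" "$pid" 2>/dev/null || return 0   # no such process
        sleep 1
        kill -0 "$pid" 2>/dev/null || return 0          # it has exited
    done
    return 1
}
```

Usage would be along the lines of interrupt_pid 12345, where 12345 stands for the process ID of the hung application.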

Appendix D Lustre Knowledge Base D-3
