

when compared against general-purpose time-sharing, server, and workstation environments. Although large I/O requests are common, small requests are fairly common as well. They believe this is a natural result of parallelization and an inherent characteristic of most parallel programs. Therefore, file systems for these computers must provide low latency for small requests and high bandwidth for large requests. They concluded that a distributed file system (such as NFS or AFS) would not provide adequate performance since it is designed for completely different workload characteristics.

2.5.3 File System Benchmarking

A classic synthetic benchmark is the Andrew benchmark [27], which was originally designed to measure the performance scalability of the Andrew File System. It claims to be representative of the workload of an average user, but it is more accurately classified as the workload of a typical software developer. The benchmark consists of creating directories that mirror a source tree, copying files from the source tree, obtaining the attributes (stat) of all files, reading all files, and building the source tree. Developed in 1987 to reflect the workload of five software developers, with approximately 70 files totalling 200 kilobytes, the benchmark environment does not reflect current reality. Consequently, a few researchers have used modified versions of the benchmark with parameters scaled to the current state of technology [60, 14, 28, 68]. Despite these modifications, the Andrew benchmark has other limitations: it does not truly stress the I/O subsystem, since less than 25% of the time is spent performing I/O [15].
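To make the structure of the benchmark concrete, the following sketch walks through the five phases in C. It is an illustration only, not the original benchmark code: the SRC_TREE and DST_TREE paths are hypothetical, phases one and two (directory creation and file copying) are collapsed into a single recursive copy invoked through the shell, and the final phase simply runs make in the copied tree.

/*
 * Illustrative sketch of the five Andrew benchmark phases (not the
 * original benchmark code).  SRC_TREE and DST_TREE are assumed paths;
 * the real benchmark operates on a small source tree of roughly 70 files.
 */
#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

#define SRC_TREE "./andrew-src"   /* assumed location of the source tree   */
#define DST_TREE "./andrew-copy"  /* working copy created by the benchmark */

static int stat_entry(const char *path, const struct stat *sb,
                      int type, struct FTW *ftw)
{
    (void)sb; (void)type; (void)ftw;
    struct stat st;
    stat(path, &st);                 /* phase 3: stat every entry */
    return 0;
}

static int read_entry(const char *path, const struct stat *sb,
                      int type, struct FTW *ftw)
{
    (void)sb; (void)ftw;
    if (type != FTW_F)
        return 0;
    char buf[8192];
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return 0;
    while (read(fd, buf, sizeof buf) > 0)   /* phase 4: read every file */
        ;
    close(fd);
    return 0;
}

int main(void)
{
    char cmd[512];

    /* Phases 1 and 2: recreate the directory hierarchy and copy the files.
     * A recursive copy stands in for the separate MakeDir and Copy phases. */
    snprintf(cmd, sizeof cmd, "cp -r %s %s", SRC_TREE, DST_TREE);
    if (system(cmd) != 0)
        return EXIT_FAILURE;

    /* Phase 3: obtain the attributes of every file and directory. */
    nftw(DST_TREE, stat_entry, 16, FTW_PHYS);

    /* Phase 4: read the contents of every file. */
    nftw(DST_TREE, read_entry, 16, FTW_PHYS);

    /* Phase 5: build the copied source tree. */
    snprintf(cmd, sizeof cmd, "cd %s && make", DST_TREE);
    return system(cmd) == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}

In an actual run, each phase would be timed separately, since the phases stress different parts of the file system: metadata creation, data transfer, attribute lookups, and compilation.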

The Standard Performance Evaluation Corporation (SPEC) System File Server (SFS) 97 R1 V3.0 benchmark [74] is popular in the computer industry. It measures the performance of the file system running as an NFS server. This benchmark is not suitable for our use since it introduces complications and interference from NFS protocol processing and UDP/IP network protocol traffic.

Bonnie, written by Tim Bray in 1990, is a classic microbenchmark that performs sequential reads and writes. It allows for comparison between read and write performance, block access versus character access, and random versus sequential access [15]. Bonnie was designed to reveal bottlenecks in the file system [77]. However, its range of tests is fairly narrow, since it stresses only the read and write operations of the file system and not operations such as file/directory creation/deletion, path name lookup, and obtaining/modifying file attributes, making it unsuitable for our use.
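The minimal sketch below, written in the spirit of Bonnie's block-transfer tests, shows why such a microbenchmark exercises only the read and write paths. The scratch-file name, 64-megabyte working set, and 8-kilobyte block size are assumptions for illustration; Bonnie itself also measures per-character access and random seeks, and a test file much larger than physical memory is typically used so that reads are not served entirely from the buffer cache.

/*
 * Minimal sketch of a Bonnie-style sequential throughput test.  The file
 * name and sizes are assumptions; this is not Bonnie's actual code.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define SCRATCH   "bonnie.scratch"           /* hypothetical test file */
#define FILE_SIZE (64L * 1024 * 1024)        /* 64 MB working set      */
#define BLOCK     8192                       /* block-sized transfers  */

static double seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    static char buf[BLOCK];
    memset(buf, 'b', sizeof buf);

    /* Sequential block write. */
    int fd = open(SCRATCH, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return EXIT_FAILURE; }
    double t0 = seconds();
    for (long off = 0; off < FILE_SIZE; off += BLOCK)
        if (write(fd, buf, BLOCK) != BLOCK) { perror("write"); return EXIT_FAILURE; }
    fsync(fd);                               /* force the data to disk */
    close(fd);
    double wsec = seconds() - t0;

    /* Sequential block read. */
    fd = open(SCRATCH, O_RDONLY);
    if (fd < 0) { perror("open"); return EXIT_FAILURE; }
    t0 = seconds();
    while (read(fd, buf, BLOCK) > 0)
        ;
    close(fd);
    double rsec = seconds() - t0;

    printf("write: %.1f MB/s  read: %.1f MB/s\n",
           FILE_SIZE / wsec / 1e6, FILE_SIZE / rsec / 1e6);
    unlink(SCRATCH);
    return EXIT_SUCCESS;
}

Note that the sketch never creates directories, looks up deep path names, or modifies attributes, which is precisely the limitation described above.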

IOStone [54] simulates the locality found in the BSD file system workload study by Ousterhout et al. [53]. According to Tang [77], the workload does not scale well and is not I/O bound. Parallel file accesses do not occur since only one process is used. Chen and Patterson [15] found that IOStone spends less than 25% of the time doing I/O, making it unsuitable for our use.

SDET [21] simulates a time-sharing system used in a software development environment. It simulates a software developer at a terminal typing and executing shell commands. Each user, simulated by a shell
