Performance Analysis and Optimization of the Hurricane File System ...

CHAPTER 2. BACKGROUND AND RELATED WORK

script, has an exclusive home directory and executes shell commands independent of other users. Throughput is measured in terms of scripts per hour. Although it can be used to measure file system scalability, it is not specifically a file system benchmark but rather a general system scalability benchmark. Chen and Patterson [15] found that SDET spends less than 25% of its time doing I/O, making it unsuitable for our use.

PostMark [30] emphasizes access to many small, short-lived files that all need equally fast access. In particular, it simulates a mail or network news server workload. It uses only four simple operations: (1) create file, (2) delete file, (3) read entire file, and (4) write to end of file. A specified number of random operations are executed and statistics are gathered. PostMark uses a single working directory and does not create or exercise a directory hierarchy.
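The random-operation loop just described can be sketched in a few lines. The following is a minimal, hypothetical Python sketch of a PostMark-style workload; the file counts, file sizes, and the uniform operation mix are illustrative assumptions, not the actual PostMark parameters:

```python
import os
import random
import shutil
import tempfile

def postmark_like(num_files=50, num_ops=500, seed=0):
    """Sketch of a PostMark-style workload: random create/delete/
    read/append operations on small files in a single working
    directory. Illustrative only, not the real PostMark tool."""
    rng = random.Random(seed)
    workdir = tempfile.mkdtemp(prefix="pm_")
    files = []
    stats = {"create": 0, "delete": 0, "read": 0, "append": 0}
    try:
        # Build the initial pool of small files.
        for i in range(num_files):
            path = os.path.join(workdir, "f%d" % i)
            with open(path, "wb") as f:
                f.write(os.urandom(rng.randint(512, 4096)))
            files.append(path)
        next_id = num_files
        # Execute the specified number of random operations.
        for _ in range(num_ops):
            op = rng.choice(["create", "delete", "read", "append"])
            if op == "create" or not files:
                path = os.path.join(workdir, "f%d" % next_id)
                next_id += 1
                with open(path, "wb") as f:
                    f.write(os.urandom(rng.randint(512, 4096)))
                files.append(path)
                stats["create"] += 1
            elif op == "delete":
                os.remove(files.pop(rng.randrange(len(files))))
                stats["delete"] += 1
            elif op == "read":
                with open(rng.choice(files), "rb") as f:
                    f.read()  # read the entire file
                stats["read"] += 1
            else:  # append to end of file
                with open(rng.choice(files), "ab") as f:
                    f.write(os.urandom(1024))
                stats["append"] += 1
    finally:
        shutil.rmtree(workdir)
    return stats
```

Note that, as in PostMark itself, all files live in a single flat directory, so the benchmark exercises file operations heavily but leaves the directory hierarchy untested.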

The applicability of benchmark performance to real-world workload performance is a much-debated issue. Some researchers claim that previous file system benchmarks fail to reflect real-world workloads and suffer from various methodological inadequacies [15, 16, 72, 69]. Chen and Patterson [15, 16] developed a technique of measuring a few basic file system operations and projecting performance based on the characteristics of the proposed real-world workload. They also address the problem of scaling a benchmark appropriately to suit the target platform. Smith [72] developed a benchmarking methodology that predicts file system performance for a specific workload and aids in bottleneck identification. Smith and Seltzer [71] advocate the need to age a file system before taking measurements in order to provide a more realistic file system state; they use a simulated workload to age the file system.
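The measure-and-project idea of Chen and Patterson can be illustrated as a back-of-the-envelope calculation: measure the per-operation latency of a few basic file system operations on the target system, then estimate workload run time from the operation mix. The operation names, latencies, and counts below are hypothetical numbers for illustration, not measurements from [15, 16]:

```python
# Hypothetical per-operation latencies (seconds per operation),
# as a microbenchmark might measure them on the target platform.
measured_latency = {
    "read": 0.0005,
    "write": 0.0012,
    "create": 0.0030,
    "delete": 0.0020,
}

# Hypothetical operation mix characterizing a proposed
# real-world workload (counts of each basic operation).
workload_mix = {
    "read": 60000,
    "write": 25000,
    "create": 3000,
    "delete": 2500,
}

# Projected workload time: sum over operations of
# (operation count) x (measured per-operation latency).
projected_seconds = sum(
    workload_mix[op] * measured_latency[op] for op in workload_mix
)
```

Such a linear projection ignores caching and interference effects between operations, which is one reason the adequacy of this style of prediction is debated.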

In summary, many researchers are not satisfied with the currently available file system benchmarks. Some are outdated and no longer applicable; others do not stress the I/O system adequately because they are not file-system bound.

In this thesis, we use custom benchmarks designed to stress specific components of the file system. A custom macrobenchmark is used for the sole purpose of verifying that the custom microbenchmark performance results are applicable at some level of generality. We will show that improving the performance scalability of fundamental file system operations leads to scalability of the file system in general.
