28.07.2013 Views

Performance Analysis and Optimization of the Hurricane File System ...


CHAPTER 4. EXPERIMENTAL SETUP

4.5 System Environment Configuration

To create a simple, controlled test environment, the experiments were mainly executed in the file system address space (server-level), and no other major servers were running at the time. For instance, the NFS server, RAM file system, toy file system, device file system, pipe server, and pty server were all disabled. For each microbenchmark, the computer system was booted, HFS was initialized, and the appropriate benchmark was executed. Running at the server-level allowed the file system to be stressed without invoking the VFS and FCM components of the operating system.

For the few user-level experiments, the services of the pipe server and pty server were required and enabled. When running the experiments at user-level, the K42 file cache manager cached all file data blocks, and all requests for file data blocks were satisfied from the file cache from the second run of the experiment onward (see footnote 2). Also, the VFS component cached portions of file meta-data such as file attributes. Therefore, the user-level experiments stressed the FCM and VFS components of the operating system.

The version of K42 used did not have FCM page eviction (i.e., page-out) enabled because of problems in the implementation at the time the experiments were run. The workloads used in the experiments are not affected by this missing feature.

4.6 Measurements Taken and Graph Interpretation

We measured scalability for 1, 2, 4, 8, and 12 processor configurations. For the microbenchmarks, each experiment was repeated 10 times under each processor configuration. Each repetition consisted of an initial warm-up run, followed by an identical measured run. For each thread, we measured the number of cycles taken to complete its task. For a measured run of the experiment on a uniprocessor, we obtain one number from the single thread of execution. On a dual processor, we obtain two numbers, one from each of the two threads of execution. On a four-processor computer, we obtain four numbers, one from each of the four threads of execution.

After 10 measured runs, we have 10, 20, 40, 80, and 120 values for processor configurations of 1, 2, 4, 8, and 12, respectively. We plot the average of these corresponding numbers to obtain a curve. We also note the minimum and maximum cycle counts for each configuration.
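The aggregation described above can be illustrated with a minimal sketch. It assumes one thread per processor and one cycle count per thread per measured run, as stated in the text. The names `expected_sample_count` and `summarize` are hypothetical and are not taken from the thesis or from K42.

```python
from statistics import mean

# Processor configurations and number of measured runs, as stated in the text.
PROCESSOR_CONFIGS = (1, 2, 4, 8, 12)
MEASURED_RUNS = 10


def expected_sample_count(processors, runs=MEASURED_RUNS):
    """One cycle count per thread per measured run, one thread per processor."""
    return processors * runs


def summarize(cycle_counts):
    """Reduce the per-thread cycle counts of one configuration to the plotted
    average plus the minimum and maximum that are noted alongside it."""
    return {
        "mean": mean(cycle_counts),
        "min": min(cycle_counts),
        "max": max(cycle_counts),
    }


# Number of values collected for each processor configuration.
sample_counts = [expected_sample_count(p) for p in PROCESSOR_CONFIGS]
```

Here `sample_counts` reproduces the 10, 20, 40, 80, and 120 values listed above, and `summarize` would be applied once per configuration to obtain one point on the curve.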

The y-axis indicates the average number of cycles required by each thread while the x-axis indicates the number of processors. The ideal scalability graph would be a flat, perfectly horizontal curve, indicating that thread execution time is independent of system size.
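One way to read such a graph numerically is to normalize each configuration's average cycle count by the uniprocessor average, so that the ideal flat curve corresponds to a ratio of 1.0 everywhere. The sketch below is an illustration only: the function names, the 5% tolerance, and the sample numbers are assumptions, not measurements or definitions from the thesis.

```python
def slowdown_vs_uniprocessor(avg_cycles_by_procs):
    """Map processor count -> average cycles divided by the 1-processor average.

    A ratio of 1.0 means per-thread execution time did not change with
    system size; larger ratios mean each thread took longer.
    """
    base = avg_cycles_by_procs[1]
    return {p: cycles / base for p, cycles in sorted(avg_cycles_by_procs.items())}


def is_flat(avg_cycles_by_procs, tolerance=0.05):
    """True if every configuration is within `tolerance` of the uniprocessor
    average. The tolerance is an arbitrary illustrative threshold."""
    ratios = slowdown_vs_uniprocessor(avg_cycles_by_procs).values()
    return all(abs(r - 1.0) <= tolerance for r in ratios)


# Hypothetical averages (cycles per thread), invented for illustration only.
ideal = {1: 1000, 2: 1000, 4: 1000, 8: 1000, 12: 1000}
degraded = {1: 1000, 2: 1500, 4: 3000, 8: 6000, 12: 9000}
```

With these made-up inputs, `is_flat(ideal)` holds while `degraded` shows per-thread time growing with the number of processors, which is the pattern a rising curve on the graph would indicate.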

Due to time constraints and speed limitations of the SimOS simulator, the macrobenchmark experiments were repeated only twice.

2 Main memory is sized adequately to allow all file data to be cached.
