
CHAPTER 8. CONCLUSIONS

8.1 General Optimization Techniques Applied

In our optimization process, we routinely applied general principles of maximizing locality and concurrency [22], including:

1. Reduce sharing whenever possible.
2. Minimize the use of global data structures and locks.
3. Use distributed data structures and locks whenever possible.
4. Minimize hash collisions.
5. Reduce lock hold times.
6. Eliminate false sharing of secondary cache lines (a sketch illustrating items 1, 3, and 6 follows this list).
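As an illustration of principles 1, 3, and 6, the following minimal C sketch (hypothetical code, not taken from the HFS source) keeps per-processor event counters, each padded out to a full secondary cache line; every processor updates only its own slot, so the counters are never write-shared and adjacent counters cannot falsely share a line.

/* Minimal sketch (hypothetical, not HFS code): per-processor counters,
 * each padded to a full secondary cache line.  Every processor updates
 * only its own slot (no true sharing), and the padding keeps two slots
 * from ever occupying the same line (no false sharing). */

#include <stdio.h>

#define CACHE_LINE_SIZE 128          /* assumed secondary cache line size */
#define NUM_CPUS        16           /* assumed number of processors */

struct padded_counter {
    unsigned long count;
    char pad[CACHE_LINE_SIZE - sizeof(unsigned long)];
};

static struct padded_counter counters[NUM_CPUS];

/* Called on the local processor only; no lock or shared line is touched. */
static void count_event(int cpu)
{
    counters[cpu].count++;
}

/* The global total is computed only when actually needed (e.g. for
 * statistics), so the common update path stays purely local. */
static unsigned long total_count(void)
{
    unsigned long total = 0;
    int i;

    for (i = 0; i < NUM_CPUS; i++)
        total += counters[i].count;
    return total;
}

int main(void)
{
    count_event(0);
    count_event(3);
    printf("total = %lu\n", total_count());
    return 0;
}

The same layout idea applies to any per-processor structure: a small amount of memory per processor is traded for updates that never contend for a cache line.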

The listed techniques inherently overlap, since they progress from the quite general to the more specific. Peacock et al. [56, p. 86] had insight into multiprocessor scalability, stating that, “This problem illustrates a general principle which needs to be followed for reasonable scalability of a multiprocessor kernel, namely that queue sizes should remain bounded by a constant as the system size is increased.” This principle was applied to the hash lists of the ORS and block cache system. Another important characteristic to maintain is that, where true sharing cannot be fundamentally avoided, performance should degrade gracefully (linearly with the degree of sharing).
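A minimal C sketch of the bounded-queue principle, assuming (hypothetically; this is not the actual ORS or block-cache code) that the number of hash buckets is chosen in proportion to the number of processors, so the expected chain length stays roughly constant as the system grows:

/* Minimal sketch (hypothetical): size the hash table in proportion to the
 * processor count so that the average chain length is bounded by a constant
 * regardless of system size. */

#include <stdlib.h>

#define ENTRIES_PER_CPU 64            /* assumed live entries per processor */

struct entry {
    unsigned long  key;
    struct entry  *next;
};

struct bucket {
    struct entry *head;               /* chain of entries for this bucket */
    /* a per-bucket lock would normally live here (principle 3) */
};

struct hash_table {
    struct bucket *buckets;
    unsigned int   nbuckets;
};

/* One bucket per expected entry, scaled by the processor count, keeps the
 * expected chain length near one as processors (and entries) are added. */
static int hash_table_init(struct hash_table *ht, unsigned int ncpus)
{
    ht->nbuckets = ncpus * ENTRIES_PER_CPU;
    ht->buckets  = calloc(ht->nbuckets, sizeof(*ht->buckets));
    return ht->buckets ? 0 : -1;
}

static struct bucket *hash_bucket_for(struct hash_table *ht, unsigned long key)
{
    return &ht->buckets[key % ht->nbuckets];
}

Because lookup and insertion cost is proportional to chain length, holding that length constant keeps the per-operation cost, and the per-bucket lock hold time (principle 5), from growing with the machine.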

8.2 Lessons Learned

A number of lessons were learned during the production of this thesis.

1. Achieving scalability is a non-trivial task. While the optimizations themselves are simple to implement, the complexity lies in identifying the scalability bottleneck points. A significant investment in time and effort is required to understand the dynamic behavior of a large system containing complex interactions between the operating system, file system, hardware, workload, and run-time environment.

2. The user-level operating environment is quite different from the server-level environment, so different workloads may be needed for each environment to properly stress the system.

The Web server workload provided adequate file system stress in the server-level configuration, whereas it was not sufficient for the user-level configuration. We resorted to measuring the warm-up run of the Web server workload to provide adequate stress on the file system.

3. Amdahl’s law [1] of parallel computing and “divide & conquer” is helpful in determining the critical scalability bottleneck points. In very simple terms, Amdahl’s law states that, for a program consisting of segments that can and cannot be optimized, the overall performance improvement is bounded by the segments that cannot be optimized; the standard formulation is given below.
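For reference, the usual statement of the law (not reproduced from the thesis): if a fraction f of the execution time can be sped up by a factor s, and the remaining fraction 1 - f cannot, then

\[ \text{speedup} \;=\; \frac{1}{(1 - f) + \frac{f}{s}} \;\le\; \frac{1}{1 - f} \]

For example, if 80% of a run is perfectly parallelizable, even an unlimited number of processors yields at most a factor of 1/(1 - 0.8) = 5, which is why the unoptimizable (serial or serialized) segments dominate and are the right places to attack first.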
