28.07.2013 Views

Performance Analysis and Optimization of the Hurricane File System ...

Performance Analysis and Optimization of the Hurricane File System ...

Performance Analysis and Optimization of the Hurricane File System ...

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

CHAPTER 5. MICROBENCHMARKS – READ OPERATION 49<br />

is implemented as an array, containing a fixed number <strong>of</strong> elements. If all elements <strong>of</strong> an array are occupied<br />

when a new element is needed from <strong>the</strong> array, a victim element must be picked for recycling <strong>and</strong> reuse.<br />

A free list is associated with each hash list to track free elements in <strong>the</strong> hash list. Ra<strong>the</strong>r than having a<br />

lock for each free list <strong>and</strong> hash list, each pair <strong>of</strong> hash <strong>and</strong> free list is protected by one lock. Array indices<br />

ra<strong>the</strong>r than memory pointers are used because elements do not migrate between hash lists or free lists. This<br />

lack <strong>of</strong> migration combined with <strong>the</strong> fixed <strong>and</strong> equal number <strong>of</strong> elements on each hash list partially solves<br />

<strong>the</strong> balancing issue, since <strong>the</strong> maximum potential length <strong>of</strong> a free list is equal to <strong>the</strong> maximum size <strong>of</strong> an<br />

array. Free lists cannot become too unbalanced, only slightly unbalanced. An important parameter value<br />

to determine is <strong>the</strong> size <strong>of</strong> <strong>the</strong> hash list arrays, since <strong>the</strong> number <strong>of</strong> elements is fixed. The static nature <strong>of</strong><br />

<strong>the</strong> SSAC organization makes this solution fairly rigid <strong>and</strong> limits <strong>the</strong> ability to h<strong>and</strong>le load imbalances. For<br />

example, heavily used hash lists cannot exploit free elements located in hash lists that are mostly idle.<br />

Ra<strong>the</strong>r than implementing a particular solution <strong>and</strong> radically modifying <strong>the</strong> block cache system once<br />

more, we decided to focus our efforts on exploring <strong>the</strong> remaining scalability problem shown in Figure 5.28.<br />

Implementing a free list solution is a subject <strong>of</strong> future work.<br />

5.5.8 Global Lock, Larger Rehashed Table, No Free List<br />

This experiment examines <strong>the</strong> relevance <strong>of</strong> <strong>the</strong> global lock once <strong>the</strong> free list bottleneck is eliminated. The<br />

configuration was <strong>the</strong> same as in Section 5.5.7 except that <strong>the</strong> single global lock was used ra<strong>the</strong>r than <strong>the</strong><br />

fine grain locks in <strong>the</strong> block cache system. The results are shown in Figure 5.30. <strong>Performance</strong> was similar<br />

to <strong>the</strong> results in Section 5.5.3 (where a modified hash function <strong>and</strong> larger hash table were used) except that<br />

performance was better on 4 processors. Since <strong>the</strong> free list bottleneck was eliminated in <strong>the</strong> block cache<br />

system, <strong>the</strong> next performance encountered was <strong>the</strong> single global lock. The global lock was effective for up<br />

to 4 processors but became <strong>the</strong> critical bottle bottleneck beyond this point despite <strong>the</strong> distributed usage <strong>of</strong><br />

64 hash lists. Fine grain locks were indeed required to eliminate <strong>the</strong> bottleneck.<br />

There is a potential concern with <strong>the</strong> number <strong>of</strong> locks that must be acquired in <strong>the</strong> fine grain lock<br />

implementation. The overhead <strong>of</strong> acquiring locks, even if <strong>the</strong>y are completely uncontended, may accumulate<br />

to a significant amount <strong>of</strong> time. The results from this experiment demonstrate that <strong>the</strong> benefits <strong>of</strong> fine grain<br />

locks in <strong>the</strong> file system far out-weigh <strong>the</strong> costs <strong>of</strong> acquiring approximately 3 additional locks.<br />

5.5.9 All <strong>Optimization</strong>s Combined<br />

All optimizations were applied in this experiment to determine whe<strong>the</strong>r ideal performance <strong>and</strong> scalability<br />

could be achieved. These optimizations included <strong>the</strong> following.<br />

1. Padded ORS hash list headers.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!