Performance Analysis and Optimization of the Hurricane File System ...

CHAPTER 6. MICROBENCHMARKS – OTHER FUNDAMENTAL OPERATIONS

[Figure 6.10: Lookup – all optimizations. Average number of cycles per processor versus number of processors (2 to 12); series: Unoptimized, 200 pool, Optimized.]

[Figure 6.11: Lookup – all optimizations, magnified. Average number of cycles per processor versus number of processors (2 to 12); series: Optimized.]

3. Padded all other critical data structures in the ORS and block cache systems, such as locks and waiting I/O data structures (shown in Figure 3.4 and Figure 3.5).
4. Modified the hash functions (ORS and block cache).
5. Increased the size of the hash tables (ORS and block cache).
6. Increased the initial pool of block cache entries from 20 to 200.
7. Eliminated the block cache free list.
8. Used fine-grain locks in the block cache.

For simplicity, the K42 padding facilities were used rather than manually padding and aligning the data structures.

The results, shown in Figure 6.10, indicate good scalability. A magnification of the graph, shown in Figure 6.11, indicates that scalability was good but not ideal. The 8 processor results exhibited the same anomaly as described in Section 5.5.7, resulting in the lower error bar in the graph.

The optimizations that applied to the read test workload were also effective on the file creation and file lookup workloads. Padding, modifying the hash functions, increasing the size of the hash tables, eliminating the global free list, and using fine-grain locks improved performance because all of these workloads interact with the cache systems through the same basic operations: hashing, hash list traversal, cache entry access and modification, dirty list enqueuing, and cache entry freeing.

6.3.3 Summary

Problems with the block cache entry allocation algorithm were evident with this simple microbenchmark workload. Increasing the size of the initial pool of block cache entries helped to improve scalability. For instance, on 12 processors, threads executed in approximately half the time of the unoptimized version. The optimizations implemented from the read file experiments were applied to this workload and dramatically improved scalability. Performance trends show that thread execution times appear to stabilize from 8 processors onwards.
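Two of the optimizations listed above, padding of critical data structures and fine-grain locking of the block cache hash table, can be illustrated with a minimal C++ sketch. This is not the HFS or K42 code: the names `Bucket`, `BlockCacheSketch`, `kCacheLine`, and `kBuckets`, the 64-byte cache line size, and the multiplicative hash are all assumptions made for illustration, and standard `alignas` stands in for the K42 padding facilities.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Assumed cache-line size; the real value is machine dependent.
constexpr std::size_t kCacheLine = 64;
// Assumed table size; the text only says the hash tables were enlarged.
constexpr std::size_t kBuckets = 256;

// One hash chain with its own lock. alignas rounds the size up to a
// multiple of the cache line, so two buckets never share a line.
struct alignas(kCacheLine) Bucket {
    std::mutex lock;
    std::vector<std::pair<std::uint64_t, int>> entries;  // (block id, payload)
};

static_assert(alignof(Bucket) == kCacheLine, "bucket starts on a cache line");
static_assert(sizeof(Bucket) % kCacheLine == 0, "bucket fills whole cache lines");

// Hypothetical block cache index: one lock per bucket instead of one global lock.
class BlockCacheSketch {
public:
    // Insert or overwrite the payload stored for blockId.
    void put(std::uint64_t blockId, int value) {
        Bucket& b = bucketFor(blockId);
        std::lock_guard<std::mutex> guard(b.lock);  // locks only this chain
        for (auto& e : b.entries) {
            if (e.first == blockId) { e.second = value; return; }
        }
        b.entries.emplace_back(blockId, value);
    }

    // Look up blockId; returns false if it is not cached.
    bool get(std::uint64_t blockId, int& out) {
        Bucket& b = bucketFor(blockId);
        std::lock_guard<std::mutex> guard(b.lock);
        for (const auto& e : b.entries) {
            if (e.first == blockId) { out = e.second; return true; }
        }
        return false;
    }

private:
    // Placeholder multiplicative hash, not the hash function used in the thesis.
    static std::uint64_t mix(std::uint64_t x) {
        return (x * 0x9E3779B97F4A7C15ULL) >> 32;
    }

    Bucket& bucketFor(std::uint64_t blockId) {
        return buckets_[mix(blockId) % kBuckets];
    }

    std::array<Bucket, kBuckets> buckets_;
};
```

With this layout, threads that touch different buckets contend neither on a shared lock nor on a shared cache line, which is the effect the padding and fine-grain locking optimizations aim for.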
6.4 File Stat

We examined the scalability of the getAttribute() operation in the next set of experiments. As described in Section 3.6.1 on page 24, getAttribute() returns the attributes of a target file or directory identified by a file token. This experiment was essentially a subset of the unoptimized extent-based read test in Section 5.2: in the file stat experiment, the file status operation is executed, but file block reading and file status updating are not performed. Each thread executed on its own processor, disk, and file. For each thread, the getAttribute() file system operation was performed 3136 times on its own extent-based file, matching the number of times the operation was performed in the superset experiment (Section 5.2) and allowing for easy comparison of results.

The results of using unoptimized HFS, shown in Figure 6.12¹, are similar to the extent-based read test (Section 5.3), since the experiment was simply a subset of the read test.

[Figure 6.12: Stat. Average number of cycles per processor versus number of processors (2 to 12); series: Original read test, Original read test: perfect memory, Stat.]

6.4.1 + Processing

This experimental configuration was a slightly more realistic file stat test. The configuration is similar to Section 6.4, except that it processes the file status data into the standard Unix format before returning from the call. More specifically, it converts the HFS format of file status to the more widely used Unix/Linux/GNU file status format. As shown in Figure 6.13² and Figure 6.14³, the results were fairly similar to the previous file stat experiment in Section 6.4, as might be expected. The extra processing took slightly longer; however, the maximum error bar range on 12 processors was much greater.

[Figure 6.13: Stat – +processing. Average number of cycles per processor versus number of processors (2 to 12); series: Stat + processing, Stat.]

[Figure 6.14: Stat – +processing. Average number of cycles per processor versus number of processors (2 to 12); series: Stat + processing, Stat.]

¹ Error bars were removed from the unoptimized 12 processor configuration.
² Error bars were removed from the unprocessed 12 processor configuration.
³ Error bars were removed from the 12 processor configurations.
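The extra processing step in Section 6.4.1, converting a file status record from the HFS format to the Unix format, might look roughly like the following C++ sketch. The HFS status layout is not given in this chapter, so `HfsFileStatus`, `HfsFileType`, all of their fields, and `toUnixStat` are hypothetical stand-ins; only `struct stat` and the `S_IF*` mode bits come from POSIX `<sys/stat.h>`.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/stat.h>   // POSIX struct stat, S_IFDIR, S_IFREG, S_ISDIR
#include <sys/types.h>  // mode_t, off_t, nlink_t

// Hypothetical file type tag; the real HFS encoding is an assumption here.
enum class HfsFileType { Regular, Directory };

// Hypothetical stand-in for the HFS file status record.
struct HfsFileStatus {
    HfsFileType   type;
    std::uint32_t permissions;      // assumed: low 9 bits are rwxrwxrwx
    std::uint64_t lengthBytes;
    std::uint64_t modifiedSeconds;  // assumed: seconds since the Unix epoch
    std::uint32_t linkCount;
};

// Copy the hypothetical HFS fields into a zero-initialized POSIX struct stat.
inline struct stat toUnixStat(const HfsFileStatus& in) {
    struct stat out = {};
    const mode_t typeBits =
        (in.type == HfsFileType::Directory) ? S_IFDIR : S_IFREG;
    out.st_mode  = typeBits | static_cast<mode_t>(in.permissions & 0777);
    out.st_size  = static_cast<off_t>(in.lengthBytes);
    out.st_mtime = static_cast<time_t>(in.modifiedSeconds);
    out.st_nlink = static_cast<nlink_t>(in.linkCount);
    return out;
}
```

A conversion of this kind touches only thread-local data, which is consistent with the observation that the added processing made each call slightly longer without changing the overall shape of the results.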
