
the corresponding EBS volumes. This is especially true for the Amazon small instances, which are bandwidth limited. At least for the duration of our tests, we noticed that the larger instance types showed less of an advantage for the local disks, possibly due to the increased network bandwidth available in the large instances. However, users need to do a performance-cost analysis of local disk vs. EBS for their particular application [47].
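As an illustration of such a performance-cost analysis, the sketch below compares local disk and EBS in terms of added cost per gigabyte moved. The bandwidths and prices are hypothetical placeholders, not Magellan or Amazon figures; a user would substitute their own measured rates and current EBS pricing.

    # Illustrative performance-cost comparison of instance-local disk vs. EBS.
    # All prices and bandwidths below are hypothetical placeholders -- substitute
    # current EC2/EBS pricing and bandwidths measured for your own application.

    def cost_per_gb_moved(bandwidth_mb_s, hourly_cost_usd):
        """Rough $ per GB transferred, assuming the storage is billed for the
        full time the instance is running the I/O workload."""
        gb_per_hour = bandwidth_mb_s * 3600 / 1024.0
        return hourly_cost_usd / gb_per_hour

    options = {
        # name: (measured sequential bandwidth in MB/s, added hourly cost in USD)
        "local disk (small instance)": (40.0, 0.00),  # local disk has no separate charge
        "EBS volume (small instance)": (30.0, 0.05),  # placeholder EBS + I/O request cost
    }

    for name, (bw, cost) in options.items():
        print(f"{name:30s} {bw:6.1f} MB/s  ~${cost_per_gb_moved(bw, cost):.4f}/GB extra")

Because instance-local (ephemeral) storage carries no separate charge, the comparison often reduces to whether the durability and flexibility of EBS are worth its incremental cost for the application's I/O pattern.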

Instance Type. I/O performance is expected to improve with the larger, more capable instance types. However, in our limited testing, we encountered situations where we were able to get better I/O performance from the small instance's local disk than from the large instance's. Our tests also show that the small instances tend to show a fair amount of variability, and hence more extensive testing might be needed to capture these behaviors over time. EBS performance appeared to improve with the more capable instance types, possibly due to the better network available to the larger and/or the CC instances.
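To capture this kind of run-to-run variability, repeated timing runs with summary statistics are usually sufficient. The sketch below is a minimal example; the target path, transfer sizes, and run count are placeholders rather than the parameters used in the Magellan tests.

    # Minimal sketch: repeat a sequential-write timing several times and report the
    # spread, to expose run-to-run variability on a given instance type.
    # The target path and sizes are placeholders for a real test on instance storage.
    import os, time, statistics

    TARGET = "/mnt/local-disk/iotest.bin"   # hypothetical mount point of the local disk
    BLOCK = 1024 * 1024                     # 1 MiB writes
    TOTAL = 256 * 1024 * 1024               # 256 MiB per run (small, for illustration)

    def timed_sequential_write():
        buf = os.urandom(BLOCK)
        start = time.time()
        with open(TARGET, "wb") as f:
            for _ in range(TOTAL // BLOCK):
                f.write(buf)
            f.flush()
            os.fsync(f.fileno())            # make sure data actually reaches the device
        return (TOTAL / (1024 * 1024)) / (time.time() - start)   # MB/s

    rates = [timed_sequential_write() for _ in range(10)]
    print(f"mean {statistics.mean(rates):.1f} MB/s  "
          f"stdev {statistics.stdev(rates):.1f} MB/s  "
          f"min {min(rates):.1f}  max {max(rates):.1f}")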

Availability Regions. We observed that the west zone performance was better than that of the east zone. The west zone VMs on Amazon have slightly higher price points, possibly resulting in better performance. However, our large-scale tests also show that the west zone has a higher standard deviation than the east zone.

9.4 Flash Benchmarking

Solid-state storage (SSS) is poised to be a disruptive technology, with a likely impact on both the cloud computing and scientific computing spaces. For these reasons, flash storage evaluation was included in the Magellan project. Magellan at NERSC first evaluated several products before deploying a larger storage system based on that evaluation. The technologies evaluated include three PCIe-connected solutions and two SATA-connected solutions. Table 9.5 summarizes the products that were evaluated.

Table 9.5: Summary of flash-based storage products that were evaluated.

Manufacturer              Product        Capacity
PCIe Attached Devices - All use SLC Flash
  Texas Memory Systems    RamSAN 20      450 GB
  FusionIO                ioDrive Duo    320 GB
  Virident                tachIOn        400 GB
SATA Attached Devices - Both use MLC Flash
  Intel                   X25-M          160 GB
  OCZ                     Colossus       250 GB

NAND flash devices have dramatically different performance characteristics compared with traditional disk systems. NAND typically delivers much higher random read rates. However, NAND chips must be erased prior to writing new data, and this erase cycle can be extremely slow. For example, NAND may require several milliseconds to erase a block, yet can perform a read of a block in several microseconds [79]. To help mask this impact, many high-end devices use background grooming cycles to maintain a pool of erased blocks available for new writes.
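A back-of-the-envelope calculation makes this erase penalty concrete. The latencies below are assumed round numbers in the ranges cited above (millisecond-scale erase, microsecond-scale read), not measurements of any particular device, and the model ignores the fact that an erase is amortized over the many pages in a block.

    # Back-of-the-envelope look at the erase-before-write penalty described above.
    # Latencies are illustrative round numbers in the ranges cited (ms-scale erase,
    # microsecond-scale read and program), not measurements of a specific device.

    ERASE_MS = 2.0      # assumed time to erase a NAND block
    WRITE_US = 200.0    # assumed time to program a page once a block is erased
    READ_US  = 25.0     # assumed time to read a page

    def effective_write_us(pre_erased_pool):
        """Average per-write latency, with or without background grooming that
        keeps a pool of already-erased blocks available."""
        if pre_erased_pool:
            return WRITE_US                     # erase cost hidden by the groomer
        return WRITE_US + ERASE_MS * 1000.0     # erase sits on the write path

    print(f"read latency:            {READ_US:8.1f} us")
    print(f"write, pre-erased pool:  {effective_write_us(True):8.1f} us")
    print(f"write, erase on demand:  {effective_write_us(False):8.1f} us")

Even in this simplified form, the gap between the two write paths shows why background grooming that keeps erased blocks available is important for sustained write performance.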

Early benchmarking efforts focused on measuring the bandwidth the devices can deliver. The plots in Figure 9.21 show the performance of the devices across a range of block sizes for both sequential read and write operations. As expected, the PCIe-attached devices outperformed the SATA-attached devices. More interestingly, the solid-state devices are still sensitive to the block size of the I/O operation. This is likely due both to overheads in the kernel I/O stack and to additional transaction overheads in the devices. In general, the card devices provide more balanced performance for writes versus reads compared to the MLC-based

