Magellan Final Report - Office of Science - U.S. Department of Energy
As message size increases, overhead related to latency and connection instantiation becomes less significant. The results shown in Figure 9.12 demonstrate that while the raw hardware cluster has neared its practical maximum bandwidth at 16 KB, the virtual cluster is still 50 kbps below its average at 2 MB. Of particular note, variability in the results on the virtual cluster was extremely high: with a 2 MB message size, the reported bandwidth ranged from a minimum of 32.24 kbps to a maximum of 181.74 kbps.
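That run-to-run variability can be summarized as a best-to-worst bandwidth ratio. A minimal sketch, using the 2 MB figures quoted above (the function name is illustrative, not part of any benchmark suite):

```python
# Sketch: summarize run-to-run variability as a max/min bandwidth ratio.
# The 2 MB figures (181.74 and 32.24 kbps) are the virtual-cluster
# extremes reported above; the helper name is illustrative.

def variability_ratio(max_bw_kbps: float, min_bw_kbps: float) -> float:
    """Ratio of best to worst observed bandwidth across trials."""
    return max_bw_kbps / min_bw_kbps

ratio = variability_ratio(181.74, 32.24)
print(f"max/min bandwidth ratio: {ratio:.1f}x")  # roughly a 5.6x spread
```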
The difference between clusters in zero-length message bandwidth was also of concern. While an average of 329,779 messages passed between ranks every second on the raw hardware cluster, the virtual cluster could only manage an average of 21,793 per second. This is a strong indictment of the virtual network model as it pertains to MPI workload performance.
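The gap in zero-length message rate can likewise be expressed as a single slowdown factor; a minimal sketch using the two averages quoted above:

```python
# Sketch: express the zero-length message-rate gap as a slowdown factor.
# Both rates (messages per second) are the averages reported above.

raw_hw_rate = 329_779   # raw hardware cluster
virtual_rate = 21_793   # virtual cluster

slowdown = raw_hw_rate / virtual_rate
print(f"virtual cluster message rate is about {slowdown:.1f}x lower")
```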
Figure 9.13: Phloem: Selected MPI method tests, average performance ratio (16 ranks, 8 ranks per node).
Figure 9.13 shows the performance penalties associated with individual one-to-many and many-to-many MPI calls, as measured by the Phloem mpiBench Bcast utility, run on a 16-rank, 2-node cluster for comparability with the SQMR tests above. The values plotted are per-method averages of the ratio of virtual-cluster to raw-hardware time to solution; averages are appropriate because the ratios remain remarkably consistent across varying message sizes. This demonstrates that, in aggregate, the high point-to-point latency illustrated by the SQMR tests results in a consistently reproducible multiplier for common parallel programming methods.
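The per-method average ratio described above can be sketched as follows. The method names and timings here are hypothetical placeholders, not measured values from the report:

```python
# Sketch: per-method average of virtual-over-raw time-to-solution ratios,
# as plotted in Figure 9.13. All timings below are hypothetical.
from statistics import mean

# time to solution (seconds) per message size: (virtual, raw hardware)
timings = {
    "Bcast":     [(0.30, 0.10), (0.62, 0.20), (1.23, 0.40)],
    "Allreduce": [(0.45, 0.11), (0.90, 0.22), (1.80, 0.45)],
}

# Average the ratio across message sizes for each MPI method.
avg_ratios = {
    method: mean(virt / raw for virt, raw in pairs)
    for method, pairs in timings.items()
}

for method, ratio in avg_ratios.items():
    print(f"{method}: average slowdown {ratio:.2f}x")
```

Because the per-size ratios vary little, a single averaged multiplier per method is a faithful summary, which is the rationale the report gives for this presentation.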
Figure 9.14: Phloem: Selected MPI method tests, average performance ratio across 1000 trials.
Figure 9.14 shows the scaling characteristics of the virtual cluster as compared to raw hardware, again