
case of GTC, most of the communication is bandwidth limited and to nearest neighbors; hence the slowdown shows a similar pattern to that observed for the ping-pong bandwidth measurements. For PARATEC, most of the communication consists of messages that are in the latency limit, although not all, so a simple mapping onto the HPCC communication patterns is not possible. For CAM, the communication pattern is closest to the random ring bandwidth measurement. Also for CAM, the master processor reads the input file and broadcasts its contents using MPI, then writes the intermediate output files; the time spent in I/O shows up as MPI time. In Section 9.3, we present detailed I/O benchmarking that quantifies the I/O overhead in virtual environments.
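To make the ping-pong bandwidth measurement referred to above concrete, the following is a minimal MPI sketch of such a test between two ranks; the message size, repetition count, and output format are illustrative choices and are not taken from the HPCC suite.

    /* Minimal ping-pong bandwidth sketch (illustrative, not the HPCC code).
     * Run with at least two ranks: rank 0 sends a buffer to rank 1 and
     * receives it back; bandwidth is bytes moved over the one-way time. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int nbytes = 1 << 20;          /* 1 MiB: bandwidth-limited regime */
        const int reps = 100;
        char *buf = calloc(nbytes, 1);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < reps; i++) {
            if (rank == 0) {
                MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0) {
            double one_way = (t1 - t0) / (2.0 * reps);   /* seconds per one-way transfer */
            printf("message size %d bytes, bandwidth %.1f MB/s\n",
                   nbytes, nbytes / one_way / 1e6);
        }
        free(buf);
        MPI_Finalize();
        return 0;
    }

Running the same loop with a message of only a few bytes measures latency instead, which is the regime that dominates most of PARATEC's messages.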

9.1.5 Scaling Study and Interconnect Performance<br />

This work was accepted for publication in the Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems Workshop (PMBS11), held in conjunction with SC11, Seattle, November 2011 [70].

Our benchmarks and other studies [62, 23, 66, 74, 46] highlight the performance impact of virtual environments. These studies show that tightly coupled applications perform poorly in virtualized environments such as Amazon EC2. However, there is limited understanding of the types of cloud environments (e.g., interconnects, machine configuration, etc.) that might benefit science applications. HPC resource providers are interested in understanding the performance trade-offs of using cloud platform solutions, such as virtualization, in supercomputing centers.

Today's cloud environments are typically based on commodity hardware and use TCP over Ethernet to connect the nodes. However, compute resources found at HPC centers typically have much higher-performing interconnects, such as InfiniBand, that provide higher bandwidth and lower latency. Thus the network performance of current cloud environments is quite different from that at HPC centers. In this study we use the Magellan cloud testbed to compare and understand the performance impact of the fabrics (InfiniBand, 10G Ethernet) and protocols (InfiniBand, TCP, virtualization) between an HPC resource and a typical cloud resource. Specifically, we consider the following configurations: native InfiniBand, TCP over InfiniBand, TCP over 10G Ethernet, TCP over 10G Ethernet with virtualization, and TCP over 1G Ethernet. Additionally, we compare the performance of Amazon Cluster Compute instance types, Amazon's specialized HPC offering over 10G Ethernet.
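As background for these comparisons, a standard first-order model of point-to-point transfer time (a general rule of thumb, not a measurement from this study) is T(m) ≈ α + m/β, where m is the message size, α the per-message latency, and β the asymptotic bandwidth of the fabric. Small messages are dominated by α and large messages by β, which is why latency-limited codes such as PARATEC and bandwidth-limited codes such as GTC can respond differently to the same change of interconnect.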

In this study, we address the question of what virtualized cloud environments must provide in order to be beneficial for HPC applications. We use standard benchmarks such as the HPC Challenge [41] to understand the overheads. We also select a subset of benchmarks from the NERSC-6 benchmark suite, based on our earlier work, to capture application behavior in virtualized environments.
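As an illustration of the random ring pattern mentioned earlier (the HPCC measurement that CAM's communication most resembles), the following is a simplified MPI sketch; the permutation seed, message size, and single-exchange timing are illustrative simplifications and do not reproduce the actual HPC Challenge implementation.

    /* Simplified random-ring sketch (illustrative, not the HPCC code).
     * Ranks are placed in a randomly permuted ring; each rank sends to its
     * ring successor while receiving from its predecessor. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Rank 0 builds a random permutation of the ranks and broadcasts it
         * so that every process agrees on the ring ordering. */
        int *perm = malloc(size * sizeof(int));
        if (rank == 0) {
            for (int i = 0; i < size; i++) perm[i] = i;
            srand(12345);                               /* fixed seed, illustrative */
            for (int i = size - 1; i > 0; i--) {
                int j = rand() % (i + 1);
                int tmp = perm[i]; perm[i] = perm[j]; perm[j] = tmp;
            }
        }
        MPI_Bcast(perm, size, MPI_INT, 0, MPI_COMM_WORLD);

        /* Locate this rank in the ring and identify its two neighbors. */
        int pos = 0;
        for (int i = 0; i < size; i++) if (perm[i] == rank) pos = i;
        int next = perm[(pos + 1) % size];
        int prev = perm[(pos - 1 + size) % size];

        const int nbytes = 1 << 20;                     /* 1 MiB per rank */
        char *sendbuf = calloc(nbytes, 1), *recvbuf = calloc(nbytes, 1);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        MPI_Sendrecv(sendbuf, nbytes, MPI_CHAR, next, 0,
                     recvbuf, nbytes, MPI_CHAR, prev, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        double t1 = MPI_Wtime();

        /* Report the slowest rank, which bounds the sustained ring bandwidth. */
        double bw = nbytes / (t1 - t0) / 1e6, min_bw;
        MPI_Reduce(&bw, &min_bw, 1, MPI_DOUBLE, MPI_MIN, 0, MPI_COMM_WORLD);
        if (rank == 0) printf("slowest per-rank bandwidth: %.1f MB/s\n", min_bw);

        free(perm); free(sendbuf); free(recvbuf);
        MPI_Finalize();
        return 0;
    }

Because each rank's partner in the permuted ring is generally on a different node, this pattern exercises links across the whole machine rather than just nearest-neighbor connections, which is why it is a better proxy for CAM's communication than the ping-pong test.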

Method

Users typically rely on high performance supercomputing centers for their medium- to large-sized runs. In addition to the compute servers, these centers provide high performance networking and storage infrastructure, system administration, and user support, all of which benefit users. However, in these environments users have less flexibility and no explicit control over the software stack.

The recent emergence of cloud computing as a potential platform for scientific computing has resulted in the need to revisit scientific computing environments and consider the performance that is possible in these environments. Previous work has shown that virtualized cloud environments impact the performance of tightly coupled applications. However, studies conducted on Amazon EC2 provide limited understanding of the causes of the performance decrease, due to the black-box nature of these cloud services. Thus, we use the Magellan testbed to understand the performance impact of different networking protocols and fabrics on applications.

InfiniBand interconnects are common in supercomputing centers due to their performance and scalability. The InfiniBand transport protocol provides a richer set of capabilities than TCP by leveraging the
