
Magellan Final Report

9.1 Understanding the Impact of Virtualization and Interconnect Performance

The benchmarking was conducted in three phases. In Phase 1, we compared the performance of a commercial cloud platform (Amazon EC2) with NERSC systems (Carver and Franklin) and a mid-range computing system (Lawrencium) at LBNL. The Carver and Magellan systems are identical in hardware configuration, and the traditional cluster configuration for Magellan uses the same software configuration as Carver. The typical problem configurations for the NERSC-6 benchmarks (described below) are defined for much larger systems, so we constructed reduced problem size configurations to target the requirements of mid-range workloads. In Phase 2, we repeated the experiments with the reduced problem size configurations on Magellan hardware to further understand and characterize the impact of virtualization. Finally, in Phase 3, we studied the effect of scaling and interconnects on the Magellan testbed and Amazon.

9.1.1 Applications Used in Study

DOE supercomputer centers typically serve a diverse user community. In NERSC’s case, the community contains over 4,000 users working on more than 400 distinct projects and using some 600 codes that serve the diverse science needs of the DOE Office of Science research community. Abstracting the salient performance-critical features of such a workload is a challenging task. However, significant workload characterization efforts have resulted in a set of full application benchmarks that span a range of science domains, algorithmic methods, and concurrencies, as well as machine-based characteristics that influence performance such as message size, memory access pattern, and working set sizes. These applications form the basis for the Sustained System Performance (SSP) metric, which represents the effectiveness of a system in terms of delivered performance on real applications rather than peak FLOP rates [55].
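
For context, a common formulation of SSP (in the spirit of the approach described in [55]) scales the geometric mean of the measured per-core processing rates of the M benchmark applications by the number of cores N in the system. The expression below is a sketch of this general form rather than the exact definition and reference operation counts used for any particular procurement:
\[
\mathrm{SSP} = N \left( \prod_{i=1}^{M} P_i \right)^{1/M},
\]
where \(P_i\) is the measured per-core processing rate (e.g., Gflop/s per core) of benchmark application \(i\).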

As well as being representative of the DOE Office of Science workload, some of these applications have also been used by other federal agencies, as they represent significant parts of their workloads. MILC and PARATEC were used by the NSF, CAM by NCAR, and GAMESS by the NSF and DoD HPCMO. The representation of the methods embodied in our benchmark suite goes well beyond the particular codes employed. For example, PARATEC is representative of methods that constitute one of the largest consumers of supercomputing cycles in computer centers around the world [64]. Therefore, although our benchmark suite was developed with the NERSC workload in mind, we are confident that it is broadly representative of the workloads of many supercomputing centers today. More details about these applications and their computational characteristics in the NERSC-6 benchmark suite can be found in Ref. [3].

The typical problem configurations for these benchmarks are defined for much larger “capability” systems, so we had to construct reduced-size problem configurations to target the requirements of the mid-range workloads that are the primary subject of this study. For example, many of the input configurations were constructed for a system acquisition (begun during 2008) that resulted in a 1 petaflop peak resource containing over 150,000 cores, considerably larger than is easily feasible to run in today’s commercial cloud infrastructures. Thus, problem sets were modified to use smaller grids and concomitant concurrencies, along with shorter iteration spaces and/or shorter simulation durations. We also modified the problem configurations to eliminate any significant I/O in order to focus on the computational and network characteristics. For the scaling studies described in Section 9.1.5, we used the larger problem sets for MILC and PARATEC, as described below in the application descriptions.

Next we describe each of the applications that make up our benchmarking suite, the parameters they were run with, and their computation and communication characteristics.

CAM. The Community Atmosphere Model (CAM) is the atmospheric component of the Community Climate System Model (CCSM), developed at NCAR and elsewhere for the weather and climate research communities [7, 6]. In this work we use CAM v3.1 with a finite volume (FV) dynamical core and a “D” grid (about 0.5 degree resolution). In this case we used 120 MPI tasks and ran for three days of simulated time (144 timesteps).
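
As a simple arithmetic check on these settings, derived only from the numbers quoted above, 144 timesteps over three simulated days corresponds to an interval of
\[
\frac{3 \times 24 \times 60\ \text{minutes}}{144\ \text{timesteps}} = 30\ \text{minutes per timestep},
\]
i.e., a half-hour step for the timesteps counted in this configuration.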

