Chapter 5

Magellan Testbed
As part of the Magellan project, a dedicated, distributed testbed was deployed at Argonne and NERSC. The two sites architected, procured, and deployed their testbed components separately, although the resources were chosen to complement each other. Deploying a testbed (versus acquiring services on existing commercial cloud systems) provided the flexibility necessary to address the Magellan research questions. Specifically, our hardware and software were configured to cater to scientific application needs, which differ from the typical workloads that run on commercial cloud systems. For example, the ability to adjust aspects of the system software and hardware allowed the Magellan team to explore how these design points impact application performance and usability. In addition, the diverse user requirements for cloud computing, ranging from access to custom environments to the MapReduce programming model, led to a requirement for a dynamic and reconfigurable software stack. Users had access to customized virtual machines through OpenStack (at Argonne only) and Eucalyptus (at both sites), along with a Hadoop installation that allowed users to evaluate the MapReduce programming model and the Hadoop Distributed File System. Both OpenStack and Eucalyptus provide an application programming interface (API) that is compatible with the Amazon EC2 API, enabling users to port between commercial providers and the private cloud. NERSC also provided access to a traditional batch cluster environment to establish baseline performance and to collect data on workload characteristics for typical mid-range science applications that were considered suitable for cloud computing. The hardware and software deployed within the testbed are described in the following section.
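As an illustration of the API portability noted above, the same client code can target either the Magellan private cloud or a commercial provider simply by pointing at a different endpoint. The sketch below uses the boto 2 Python library, a common EC2 client of the period; the endpoint, credentials, and image are placeholders rather than actual Magellan settings, and the port and path shown are Eucalyptus defaults that may vary by deployment.

    import boto
    from boto.ec2.regioninfo import RegionInfo

    # Hypothetical endpoint for an EC2-compatible cloud controller
    # (Eucalyptus or OpenStack); not an actual Magellan address.
    region = RegionInfo(name="private-cloud", endpoint="cloud.example.gov")

    conn = boto.connect_ec2(
        aws_access_key_id="ACCESS_KEY_PLACEHOLDER",
        aws_secret_access_key="SECRET_KEY_PLACEHOLDER",
        is_secure=False,
        region=region,
        port=8773,                    # Eucalyptus default EC2 API port
        path="/services/Eucalyptus",  # Eucalyptus default EC2 API path
    )

    # The identical calls work against Amazon EC2 when the region, port,
    # and path overrides are dropped, which is what makes porting between
    # the private cloud and a commercial provider straightforward.
    images = conn.get_all_images()
    reservation = conn.run_instances(
        image_id=images[0].id,        # a machine image registered in the cloud
        instance_type="m1.small",
    )
    print(reservation.instances[0].id)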
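Similarly, the Hadoop installation let users express computations in the MapReduce style and stage data in HDFS. The word-count sketch below uses Hadoop Streaming with Python scripts for the map and reduce phases; the script names are illustrative, not part of the Magellan configuration.

    #!/usr/bin/env python
    # mapper.py: reads text on stdin, emits one "word<TAB>1" pair per word.
    import sys

    for line in sys.stdin:
        for word in line.split():
            print("%s\t%d" % (word, 1))

    #!/usr/bin/env python
    # reducer.py: sums the counts for each word; Hadoop Streaming delivers
    # the mapper output grouped and sorted by key.
    import sys

    current_word, count = None, 0
    for line in sys.stdin:
        word, value = line.rstrip("\n").split("\t", 1)
        if word != current_word:
            if current_word is not None:
                print("%s\t%d" % (current_word, count))
            current_word, count = word, 0
        count += int(value)
    if current_word is not None:
        print("%s\t%d" % (current_word, count))

A job built from these scripts would be submitted with the standard streaming jar, e.g. hadoop jar hadoop-streaming.jar -input /user/alice/books -output /user/alice/counts -mapper mapper.py -reducer reducer.py -file mapper.py -file reducer.py, where the HDFS paths are placeholders.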
5.1 Hardware<br />
The Magellan testbed hardware was architected to facilitate exploring a variety of usage models and understanding the impact of various design choices. As a result, the testbed incorporated a diverse collection of hardware resources, including compute nodes, large-memory nodes, GPU servers, and various storage technologies. In the fall of 2011, Magellan will be connected to the 100 Gb network planned for deployment by the DOE-SC-funded Advanced Networking Initiative (ANI). A portion of the Magellan resources will remain available after the cloud research is completed in order to support the ANI research projects.
Both Argonne and NERSC deployed compute clusters based on IBM's iDataPlex solution. This solution is targeted towards large-scale deployments and emphasizes energy efficiency, density, and serviceability. Configurations for the iDataPlex systems are similar at both sites. Each compute node has dual quad-core 2.66 GHz Intel Nehalem processors, 24 GB of memory, a local SATA drive, 40 Gb InfiniBand (4X QDR), and 1 Gb Ethernet with IPMI. The system provides a high-performance InfiniBand network, which is often used in HPC-oriented clusters but is not yet common in mainstream commercial cloud systems. Since the network has such a large influence on the performance of many HPC and mid-range applications,
the ability to explore the range of networking options from native InfiniBand to virtualized Ethernet was an important capability of the testbed.