Magellan Final Report - Office of Science - U.S. Department of Energy
Chapter 11
Application Experiences
A diverse set of scientific applications have used the Magellan cloud testbed, resulting in significant scientific
discoveries while helping to evaluate the use of cloud computing for science. Early adopters of cloud computing
have typically been scientific applications that are largely data parallel, since they are good candidates
for cloud computing. These applications are primarily throughput-oriented (i.e., there is no tight coupling
between tasks); the data requirements can be large but are well constrained; and some of these applications
have complex software pipelines, and thus can benefit from customized environments. Cloud computing
systems typically provide greater flexibility to customize the environment when compared with traditional
supercomputers and shared clusters, so these applications are particularly well suited to clouds.
We outline select scientific case studies that have used Magellan successfully over the course of the project.
The diversity of Magellan testbed setups enabled scientific users to explore a variety of cloud computing technologies
and services, including bare-metal provisioning, virtual machines (VMs), and Hadoop. The Magellan
staff worked closely with the users to reduce the learning curve associated with these new technologies. In
Chapter 9 we outlined some of the performance studies from our applications. In this chapter, we focus on
the advantages of using cloud environments, and we identify gaps and challenges in current cloud solutions.
We discuss case studies in three areas of cloud usage: (a) bare-metal provisioning (Section 11.1), (b)
virtualized resources (Section 11.2), and (c) Hadoop (Section 11.3). We discuss the suitability of cloud models
for various applications and how design decisions were implemented in response to cloud characteristics.
In Section 11.4 we discuss gaps and challenges in using current cloud solutions for scientific applications,
including feedback that we collected from the user community through a final survey.
11.1 Bare-Metal Provisioning Case Studies<br />
Many scientific groups need dedicated access to computing resources for a specific period of time, often
requiring custom software environments. Virtual environments have been popularized by cloud computing
technologies as a way to address these needs. However, virtual environments do not always work for scientific
applications due to performance considerations (Chapter 9) or the difficulties with creating and maintaining
images (Section 11.4). In this section, we highlight a set of diverse use cases that illustrate alternate
provisioning models for applications that need on-demand access to resources.
11.1.1 JGI Hardware Provisioning<br />
Hardware as a Service (HaaS) offers one potential model for supporting the DOE’s scientific computing
needs. This model was explored early with Magellan when a facility issue at the Joint Genome Institute
(JGI) led to their need for rapid access to additional resources. NERSC responded to this by allocating 120
nodes of Magellan to JGI.