
Key Findings

The goal of the Magellan project is to determine the appropriate role of cloud computing in addressing the computing needs of scientists funded by the DOE Office of Science. During the course of the Magellan project, we have evaluated various aspects of cloud computing infrastructure and technologies for use by scientific applications from various domains. Our evaluation methodology covered several dimensions: cloud models such as Infrastructure as a Service (IaaS) and Platform as a Service (PaaS), virtual software stacks, MapReduce and its open source implementation (Hadoop), and resource provider and user perspectives. Specifically, Magellan was charged with answering the following research questions:

• Are the open source cloud software stacks ready for DOE HPC science?

• Can DOE cyber security requirements be met within a cloud?

• Are the new cloud programming models useful for scientific computing?

• Can DOE HPC applications run efficiently in the cloud? What applications are suitable for clouds?

• How usable are cloud environments for scientific applications?

• When is it cost effective to run DOE HPC science in a cloud?

We summarize our findings here:

Finding 1. Scientific applications have special requirements that call for solutions tailored to these needs.

Cloud computing has developed in the context of enterprise and web applications that have vastly different requirements compared to scientific applications. Scientific applications often rely on access to large legacy data sets and pre-tuned application software libraries. These applications today run in HPC centers with low-latency interconnects and rely on parallel file systems. While these applications could benefit from cloud features such as customized environments and rapid elasticity, these need to be in concert with the other capabilities currently available to them in supercomputing centers. In addition, the cost model for scientific users is based on account allocations rather than a fungible commodity such as dollars, and the business model of scientific processes leads to an open-ended need for resources. These differences in the cost and business model for scientific computing necessitate a different approach from the enterprise model that cloud services cater to today. Coupled with the unique software and specialized hardware requirements, this points to the need for clouds designed and operated specifically for scientific users. These requirements could be met at current DOE HPC centers. Private science clouds should be considered only if it is found that these requirements cannot be met by additional services at HPC centers.

Finding 2. Scientific applications with minimal communication and I/O are best suited for clouds.

We have used a range of application benchmarks and micro-benchmarks to understand the performance of scientific applications. Performance of tightly coupled applications running on virtualized clouds using commodity networks can be significantly lower than on clusters optimized for these workloads. This can be true even at mid-range computing scales. For example, we observed slowdowns of about 50x for PARATEC on Amazon EC2 instances compared to Magellan bare-metal (non-virtualized) at 64 cores and about 7x

