Magellan Final Report - Office of Science - U.S. Department of Energy
cloud environments. Similarly, I/O-intensive applications take a substantial performance hit when run inside virtual environments.
• Cloud programming models such as MapReduce and the resulting ecosystem show promise for addressing the needs of many data-intensive and high-throughput scientific applications. However, current tools have gaps when applied to scientific applications. The MapReduce model emphasizes the data locality and fault tolerance that are important in large systems. Thus there is a need for tools that provide MapReduce implementations tuned for scientific applications.
• Current cloud tools do not provide an out-of-the-box solution to address application needs. Significant design and programming effort is required to manage the data and workflows in these environments. Virtual machine environments require users to configure and create their software images with all necessary packages. Scientific groups will also need to maintain these images with security patches and application updates. A number of performance, reliability, and portability challenges exist with cloud images that users must consider carefully, and there are limited user-side tools available today to manage cloud environments.
• One way clouds can achieve cost efficiency is through consolidation of resources and higher average utilization. DOE Centers already consolidate workloads from different scientific domains and have high average utilization, typically greater than 90%. Even with conservative cost analysis, we show how two DOE Centers are more cost-efficient than private clouds.
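The MapReduce model mentioned in the bullets above can be reduced to three phases: a map step that emits key–value pairs, a shuffle that groups values by key, and a reduce step that aggregates each group. The following is a minimal, single-process sketch of that pattern; the example job (counting sequence reads per chromosome) is hypothetical, and real frameworks such as Hadoop layer data locality and fault tolerance on top of this same structure.

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Apply the mapper to every input record, collecting (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs

def shuffle_phase(pairs):
    """Group values by key, mimicking the framework's shuffle/sort step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    """Apply the reducer to each key's list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}

# Hypothetical scientific-style job: count sequence reads per chromosome.
reads = [("chr1", 151), ("chr2", 150), ("chr1", 149)]

def mapper(read):
    chrom, _length = read
    yield (chrom, 1)

def reducer(key, values):
    return sum(values)

counts = reduce_phase(shuffle_phase(map_phase(reads, mapper)), reducer)
# counts == {"chr1": 2, "chr2": 1}
```

A scientific MapReduce implementation, as the bullet suggests, would replace the mapper and reducer with domain operations (e.g., alignment or filtering) while keeping the same phase structure.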
As noted in the final point, a key benefit of clouds is the consolidation of resources. This typically leads to higher utilization, improved operational efficiency, and lower acquisition cost through increased purchasing power. Looking across the scientific computing landscape within DOE, there is a variety of models for how scientists access computing resources, covering the full range of consolidation and utilization scales. At one end of the spectrum is the small group or departmental cluster. These systems are often under-utilized and represent the best opportunity to achieve better efficiency. Many of the DOE National Laboratories have already made efforts to consolidate these resources into institutional clusters operating under a variety of business models (institutionally funded, buy-in/condo, etc.). In many ways, these systems act as private clouds tuned for scientific applications and effectively achieve many of the benefits of cloud computing. DOE HPC centers provide the next level of consolidation, since these facilities serve users from many institutions and scientific domains. This level of consolidation is one reason why many of the DOE HPC centers operate at high levels of utilization.
Clouds have certain features that are attractive for scientific groups needing support for on-demand access to resources, sudden surges in resource needs, customized environments, periodic predictable resource needs (e.g., monthly processing of genome data, nightly processing of telescope data), or unpredictable events such as computing for disaster recovery. Cloud services essentially provide a differentiated service model that can cater to these diverse needs, allowing users to get a virtual private cluster with a certain guaranteed level of service. Clouds are also attractive for high-throughput and data-intensive workloads that do not fit within current-day scheduling and allocation policies at supercomputing centers. DOE labs and centers should consider adopting and integrating features of cloud computing into their operations in order to support more diverse workloads and further enable scientific discovery. This includes not only mechanisms to support more customized environments but also methods of providing more on-demand access to cycles. This could be achieved by: a) maintaining idle hardware, at additional cost, to satisfy potential future requests; b) sharing cores/nodes, typically at a performance cost to the user; or c) utilizing different scheduling policies such as preemption. Providing these capabilities would address many of the motivations that lead scientists to consider cloud computing while still preserving the benefits of typical HPC systems, which are already optimized for scientific applications.
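Option c) above, preemption, can be illustrated with a toy policy: when an on-demand job arrives and all slots are busy, it may suspend the lowest-priority running job rather than wait in the queue. The sketch below is an illustrative assumption, not any DOE center's actual scheduler; the class and job names are hypothetical.

```python
import heapq

class PreemptiveScheduler:
    """Toy sketch of preemption-based scheduling for on-demand access."""

    def __init__(self, slots):
        self.slots = slots        # number of available nodes/slots
        self.running = []         # min-heap of (priority, job); lowest priority first
        self.suspended = []       # victims of preemption (would be checkpointed)

    def submit(self, job, priority):
        if len(self.running) < self.slots:
            heapq.heappush(self.running, (priority, job))
            return "started"
        lowest_prio, lowest_job = self.running[0]
        if priority > lowest_prio:
            # Preempt: suspend the lowest-priority job to free its slot.
            heapq.heappop(self.running)
            self.suspended.append(lowest_job)
            heapq.heappush(self.running, (priority, job))
            return "started (preempted %s)" % lowest_job
        return "queued"

sched = PreemptiveScheduler(slots=1)
sched.submit("batch-sim", priority=1)                  # fills the only slot
result = sched.submit("telescope-run", priority=5)     # on-demand arrival
# result == "started (preempted batch-sim)"
```

In practice the trade-off is the one the text names: preemption gives on-demand users fast access, but the suspended batch job pays the cost in lost or checkpointed progress.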
Cloud computing is essentially a business model that emphasizes on-demand access to resources and cost savings through consolidation of resources. Overall, whether cloud computing is suitable for a certain application, science group, or community depends on a number of factors. This was noted in NIST's draft document on cloud computing, "Cloud Computing Synopsis and Recommendations", which stated,