Magellan Final Report - Office of Science - U.S. Department of Energy
expand its compute capacity. Around the world, scientists were submitting additional E. coli genomes to
the RAST servers to be annotated as word got out that DNA sequences were available. The ALCF Magellan
staff made additional backend computing instances available to keep up with the increased demand for
annotation. Such on-demand changes to handle the load and provide fast turnaround were made possible
by the Magellan cloud tools. This project was unique in that it used both bare-metal provisioning, to
leverage native hardware performance (provided by the Argonne bcfg2 and Heckle tools), and ALCF's
OpenStack cloud, for populating the databases.

Once the genomes were annotated, a comprehensive genome content analysis was performed, which required
building molecular phylogenies (evolutionary trees) for each of the proteins in these new strains. These
datasets enabled a detailed comparison to the nearly two hundred strains of E. coli that are in the public
databases. This comparative analysis was quickly published on the internet and made available to the
community. Overall, work that normally would have required many months was completed in less than
three days by leveraging the on-demand computing capability of the ALCF Magellan cloud. This effort
demonstrated the potential value of using cloud computing resources to quickly expand the capacity of
computational servers in response to urgent increases in demand.

11.2 Virtual Machine Case Studies

In our second set of case studies, we outline some of the applications that used the Eucalyptus and
OpenStack clouds at both sites to harness a set of virtual machines (VMs) for their application needs.
Applications from diverse domains have used Magellan's cloud computing infrastructure. Many of these
applications are data-parallel, since such applications are the most suitable to run in cloud environments.
We outline the suitability, gaps, and challenges of virtual cloud environments for these select applications.

11.2.1 STAR

Figure 11.3: A plot of the status of processing near-real-time data from the STAR experiment using Magellan
at NERSC over a three-day period. The yellow circles show the number of instances running; the green
triangles reflect the total load average across the instances; and the red squares plot the number of running
processing tasks. (Image courtesy of Jan Balewski, STAR Collaboration.)

STAR is a nuclear physics experiment that studies fundamental properties of nuclear matter from the