
Magellan Final Report - Office of Science - U.S. Department of Energy



expand its compute capacity. Around the world, scientists were submitting additional E. coli genomes to the RAST servers for annotation as word got out that DNA sequences were available. The ALCF Magellan staff made additional backend computing instances available to keep up with the increased demand for annotation. These on-demand changes, which handled the load and provided fast turnaround, were made possible by the Magellan cloud tools. This project was unique in that it used both bare-metal provisioning (provided by the Argonne bcfg2 and Heckle tools) to leverage native hardware performance and ALCF’s OpenStack cloud to populate the databases.

Once the genomes were annotated, a comprehensive genome content analysis was performed, which required building molecular phylogenies (evolutionary trees) for each of the proteins in these new strains. These datasets enabled a detailed comparison to the nearly two hundred strains of E. coli in the public databases. This comparative analysis was quickly published on the internet and made available to the community. Overall, work that normally would have required many months was completed in less than three days by leveraging the on-demand computing capability of the ALCF Magellan cloud. This effort demonstrated the potential value of using cloud computing resources to quickly expand the capacity of computational servers in response to urgent increases in demand.

11.2 Virtual Machine Case Studies

In our second set of case studies, we outline some of the applications that used the Eucalyptus and OpenStack clouds at both sites to harness a set of virtual machines (VMs) for their application needs. A number of applications from diverse domains have used Magellan’s cloud computing infrastructure. A number of these applications tend to be data-parallel, since such applications are the most suitable to run in cloud environments. We outline the suitability, gaps, and challenges of virtual cloud environments for these select applications.

11.2.1 STAR

Figure 11.3: A plot of the status of processing near-real-time data from the STAR experiment using Magellan at NERSC over a three-day period. The yellow circles show the number of instances running; the green triangles reflect the total load average across the instances; and the red squares plot the number of running processing tasks. (Image courtesy of Jan Balewski, STAR Collaboration.)

STAR is a nuclear physics experiment that studies fundamental properties of nuclear matter from the
