
2010 Best Practices Competition IT & Informatics HPC

IT Informatics - Cambridge Healthtech Institute




Published Resources for the Life Sciences

250 First Avenue, Suite 300, Needham, MA 02494 | phone: 781-972-5400 | fax: 781-972-5425

UPPMAX’s users traditionally come from research areas such as physics, chemistry, and computer science. Lately, however, the number of Life Sciences users has increased dramatically. This is mainly due to the technical advances, affordability, and increased deployment of next-generation sequencing (NGS) machines.

By 2008 it had become apparent to Swedish researchers that the tsunami of data from NGS systems created a problem that individual research grants could not solve. In many cases, Life Sciences research teams were trying to manage the problem themselves. However, due to the sheer volume of data, they wasted a lot of time copying data between systems, waiting for others to complete their computing before they could start their own, and often writing custom code to manage jobs that would typically max out system resources. In short, the teams often spent as much time solving computing challenges as they did on scientific research!

It was for these reasons that, in 2008, a national consortium of life sciences researchers was formed to address the challenges presented by this massive increase in bioinformatics data. These researchers would normally compete for resources and research funding. However, it had become apparent that a centralized facility was required. The computation and data storage requirements of NGS data created a workload that, at peak processing times and for long-term data archiving, had to be handled by a larger facility.

The consortium therefore submitted an application to SNIC and the Knut and Alice Wallenberg Foundation to fund a centralized life sciences compute and storage facility to be hosted at UPPMAX. The consortium's united conviction was that a sufficient compute and storage facility would ultimately strengthen their attempts to combat disease.

The application was successful, with the Knut and Alice Wallenberg Foundation noting that the consortium’s collaborative effort was a major advantage.

And so, the “UPPmax NEXt generation sequence Cluster & Storage (UPPNEX)” project was formed.

Today, a 150-node (1,200-core) compute cluster from HP, with InfiniBand as the interconnect, is in production with half a petabyte (500 TB) of Panasas parallel storage. The solution passed a one-month acceptance period on the first attempt and entered production in October 2009.

The objectives of the UPPNEX solution were to provide Life Sciences researchers throughout Sweden with:

1. Sufficient high-performance computing resources to cover their regular and peak project requirements
