2010 Best Practices Competition IT & Informatics HPC
IT Informatics - Cambridge Healthtech Institute
Published Resources for the Life Sciences<br />
250 First Avenue, Suite 300, Needham, MA 02494 | phone: 781‐972‐5400 | fax: 781‐972‐5425<br />
UPPMAX’s users traditionally come from research areas such as physics, chemistry, and computer<br />
science. Lately, however, the number of Life Sciences users has increased dramatically. This is<br />
mainly due to technical advances in, and the greater affordability and deployment of, next‐generation<br />
sequencing (NGS) machines.<br />
By 2008 it had become apparent to Swedish researchers that the tsunami of data from NGS<br />
systems created a problem that individual research grants could not solve. In many cases, Life<br />
Sciences research teams were trying to manage the problem themselves. However, due to the<br />
sheer volume of data, they wasted a lot of time copying data between systems, waiting for others<br />
to complete their computing before they could start their own ‐ and often writing custom code to<br />
manage jobs that would typically max‐out system resources. In short, the teams often spent as<br />
much time solving computing challenges as they did on scientific research!<br />
It was for these reasons that, in 2008, a national consortium of life sciences researchers was formed<br />
to address the challenges presented by this massive increase in bioinformatics data. These<br />
researchers would normally compete for resources and research funding. However, it had become<br />
apparent that a centralized facility was required. The computation and data storage requirements<br />
of NGS data created a workload that, at peak processing times and for long‐term data archiving, had<br />
to be handled by a larger facility.<br />
The consortium therefore submitted an application to SNIC and the Knut and Alice Wallenberg<br />
Foundation to fund a centralized life sciences compute and storage facility to be hosted at UPPMAX.<br />
The consortium's united conviction was that a sufficient compute and storage facility would<br />
ultimately strengthen their attempts to combat disease.<br />
The application was successful, with the Knut and Alice Wallenberg Foundation noting that the<br />
consortium’s collaborative effort was a major advantage.<br />
And so, the “UPPmax NEXt generation sequence Cluster & Storage (UPPNEX)” project was formed.<br />
Today, a 150‐node (1,200‐core) compute cluster from HP with an InfiniBand interconnect is in<br />
production with half a petabyte (500 TB) of Panasas parallel storage. The solution passed a one‐month<br />
acceptance period at the first attempt and entered production in October 2009.<br />
The objectives of the UPPNEX solution were to provide Life Sciences Researchers throughout<br />
Sweden with:<br />
1. Sufficient high‐performance computing resources to cover their regular and peak project<br />
requirements