2010 Best Practices Competition IT & Informatics HPC
IT Informatics - Cambridge Healthtech Institute
[ROI Expected or Achieved]
Figure 4: March 2010 NextGen sequencing infrastructure pipeline
Highly scalable IT infrastructure supporting high-throughput NextGen sequencing data processing and
analysis
1. High-speed, shared file transfer infrastructure that enables TGen scientists to participate in large-scale
collaborations involving NextGen sequencing-based research
2. Improved data management procedures resulting in more cost-effective use of storage and
other infrastructure resources
3. Efficient scientific data processing workflow, including computational tools that can be leveraged
to expedite research
4. Robust HPC infrastructure capable of supporting large-scale NextGen sequencing projects
As a result of these benefits, TGen is better positioned to compete for large-scale grants and
contracts involving NextGen sequencing technology.
[Conclusions]
Despite resource limitations, infrastructure constraints, and a relatively short time frame for the large-scale
sequencing data analysis, TGen has successfully aligned approximately 270 gigabases out of 550
gigabases processed against the human genome.
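
To put the two figures above in proportion, the aligned share can be worked out directly. The short sketch below is illustrative only: the constant and function names are hypothetical, and the inputs are the approximate values quoted in the text, so the result is approximate as well.

```python
# Approximate figures quoted in the text, in gigabases (Gb).
ALIGNED_GB = 270
PROCESSED_GB = 550


def aligned_fraction(aligned_gb: float, processed_gb: float) -> float:
    """Return the fraction of processed bases that aligned to the reference."""
    if processed_gb <= 0:
        raise ValueError("processed_gb must be positive")
    return aligned_gb / processed_gb


fraction = aligned_fraction(ALIGNED_GB, PROCESSED_GB)
print(f"Aligned fraction: {fraction:.1%}")  # prints "Aligned fraction: 49.1%"
```

In other words, roughly half of the processed bases were aligned against the human genome.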
Throughout 2009, several new research groups at TGen incorporated NextGen sequencing technologies
into their research; consequently, the number of bioinformatics personnel carrying out NextGen data
analysis is increasing. Concurrently, the number of sequencers at TGen has grown from two to seven (two
SOLEXA and five SOLiD). TGen expects to add six more SOLiD sequencers in early 2010. The
throughput of each sequencer at TGen has more than doubled relative to early 2009, and this trend is
expected to continue or even accelerate. Large volumes of data generated by external collaborators and