
2010 Best Practices Competition IT & Informatics HPC

IT Informatics - Cambridge Healthtech Institute


[ROI Expected or Achieved]

Figure 4. March 2010 NextGen sequencing infrastructure pipeline

Highly scalable IT infrastructure supporting high-throughput NextGen sequencing data processing and analysis

1. High-speed, shared file transfer infrastructure that enables TGen scientists to participate in large-scale collaborations involving NextGen sequencing-based research
2. Improved data management procedures resulting in more cost-effective use of storage and other infrastructure resources
3. Efficient scientific data processing workflow, including computational tools that can be leveraged to expedite research
4. Robust HPC infrastructure capable of supporting large-scale NextGen sequencing projects

As a result of the above benefits, TGen is better positioned to compete for large-scale grants and contracts involving NextGen sequencing technology.

[Conclusions]

In spite of resource limitations, infrastructure constraints, and a relatively short time to carry out the large-scale sequencing data analysis, TGen has successfully aligned approximately 270 gigabases, out of 550 gigabases processed, against the human genome.

Throughout 2009, several new research groups at TGen incorporated NextGen sequencing technologies into their research; consequently, the number of bioinformatics personnel carrying out NextGen data analysis is increasing. Concurrently, the number of sequencers at TGen has grown from two to seven (two SOLEXA and five SOLiD). TGen expects to add six more SOLiD sequencers in early 2010. The throughput of each sequencer at TGen has more than doubled relative to early 2009, and this trend is expected to continue or even accelerate. Large volumes of data generated by external collaborators and
