
SFAF2016 Meeting Guide Final 3


11th Annual Sequencing, Finishing, and Analysis in the Future Meeting

TINK: A NOVEL EUKARYOTIC EVIDENCE BASED PAN-TRANSCRIPTOME GENERATION PIPELINE

Wednesday, 1st June 18:30 La Fonda NM Room (1st floor) Poster (PS-1a.03)

Chandler Roe, Jason Travis, Nathan Hicks, Elizabeth Driebe, David Engelthaler, Paul Keim

TGen North

While next-generation sequencing has become an increasingly easy laboratory procedure, eukaryotic genome annotation is still a challenging bioinformatic task. High-throughput mRNA sequencing (RNA-Seq) platforms allow for a variety of applications, such as novel transcript and isoform discovery, expression estimation, alternative splicing analysis, and exploration of non-model-organism transcriptomes. However, the required genome assembly and annotation is a complicated and time-consuming process that requires multiple steps and command-line skills. Our pipeline, TINK, generates an evidence-based pan-transcriptome reference to be used for RNA-Seq analysis. It provides a rapid, all-encompassing, one-time analysis that allows for discovery of unique transcripts. The pipeline combines ab initio gene prediction using AUGUSTUS, protein homology prediction using AAT, and de novo RNA-Seq assemblies using both PASA and Trinity. These results are weighted and combined using EvidenceModeler to create an individual genome annotation for each sequenced sample. TINK then compiles, clusters, and de-replicates these annotations to create a novel pan-transcriptome reference. We have used this technique to explore differential expression and identify novel transcripts in the fungal pathogen Cryptococcus gattii. This pathogen has been characterized into four types, I-IV, within which subgroups exist. To capture transcripts unique to one subtype of C. gattii, as well as differing expression levels, multiple analyses would otherwise need to be performed using a different reference each time, which is both computationally expensive and time-consuming. TINK provided a reference that allowed a single analysis of these data, greatly reducing time and resources.
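
The abstract does not say how TINK clusters and de-replicates the per-sample annotations, so the following is only a minimal Python sketch of the general idea, not the pipeline's actual method. It assumes that two transcripts belong to the same cluster only when their sequences are identical (a real pipeline would more likely use a similarity threshold). The function names, the `pan_` identifiers, and the toy sequences are all hypothetical.

```python
def build_pan_transcriptome(per_sample):
    """Merge per-sample transcript sets into one de-replicated reference.

    per_sample maps sample name -> {transcript_id: nucleotide sequence}.
    ASSUMPTION: transcripts cluster together only on exact sequence identity
    (case-insensitive); this stands in for whatever clustering TINK uses.
    """
    clusters = {}  # sequence -> list of (sample, transcript_id)
    for sample, transcripts in per_sample.items():
        for transcript_id, sequence in transcripts.items():
            clusters.setdefault(sequence.upper(), []).append((sample, transcript_id))

    reference = {}  # hypothetical pan ID -> representative sequence
    members = {}    # hypothetical pan ID -> originating (sample, transcript_id) pairs
    for n, (sequence, origin) in enumerate(clusters.items(), start=1):
        pan_id = f"pan_{n:04d}"
        reference[pan_id] = sequence
        members[pan_id] = origin
    return reference, members


def unique_to_one_sample(members):
    """Return {pan ID: sample} for clusters drawn from exactly one sample."""
    unique = {}
    for pan_id, origin in members.items():
        samples = {sample for sample, _ in origin}
        if len(samples) == 1:
            unique[pan_id] = next(iter(samples))
    return unique


# Toy input: two samples that share one transcript and each carry a private one.
per_sample = {
    "sample_A": {"t1": "ATGAAA", "t2": "ATGCCC"},
    "sample_B": {"t1": "atgaaa", "t9": "ATGGGG"},
}
reference, members = build_pan_transcriptome(per_sample)
private = unique_to_one_sample(members)
```

With a single merged reference of this kind, reads from every sample can be quantified against the same set of identifiers, and transcripts found in only one sample remain visible through the membership table.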
