18.09.2015 Views

Abstracts

ngsfinalprogram


Poster <strong>Abstracts</strong><br />

the results are presented. Analysis results are<br />

shown in a highly accessible manner, allowing<br />

the user to gain a quick overview as well<br />

as permitting deep analysis. The performance<br />

of the PAIPline was benchmarked on real and<br />

artificial datasets of known compositions and<br />

compared to competing tools. The results and<br />

discussed features show that the presented approach<br />

is a viable strategy for the identification<br />

of pathogen sequences in NGS datasets.<br />

10<br />

SEPARATION OF FOREGROUND AND<br />

BACKGROUND READS IN MIXED NGS<br />

DATASETS<br />

S. Tausch, A. Nitsche, B. Renard, P. Dabrowski;<br />

Robert Koch Institute, Berlin, GERMANY.<br />

NGS is a valuable technology for rapid and in-depth<br />

analysis of clinical samples, as it allows<br />

sequencing of a pathogen’s whole genome<br />

directly from patient material within as little<br />

as 26 hours. However, the follow-up analysis<br />

is severely slowed down by the abundance of<br />

reads originating from the host. Thus, in order<br />

to exploit the full potential of the technology<br />

for rapid diagnostics, a method for rapid in<br />

silico removal of host reads is necessary. Commonly,<br />

a mapping-based approach is used to<br />

separate reads: either reads mapping to a background<br />

reference or reads not mapping to a<br />

foreground reference are discarded. However,<br />

while the former approach is highly specific<br />

in discarding only true background reads and<br />

the latter is highly sensitive in only keeping<br />

foreground reads, neither offers a good balance.<br />

Hence we have aimed at developing a<br />

novel tool specifically geared towards both<br />

specific and sensitive separation of foreground<br />

and background reads. In order to determine<br />

whether a read belongs to the foreground or<br />

the background, we train Markov chains of<br />

order k, with k from 4 to 12, on user-provided sets<br />

of foreground and background reference sequences,<br />

where each state is a k-mer and each<br />

transition is one of the four possible<br />

bases A, C, G and T. We then calculate the<br />

difference of log likelihoods of each transition<br />

observed within a read with regard to<br />

the foreground and the background Markov<br />

chains. This difference is then used as a score<br />

for the separation of reads, with scores smaller<br />

than 0 indicating a background read and scores<br />

larger than 0 indicating a foreground read.<br />
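The scoring scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the add-one pseudocount, the uniform fallback for k-mers never seen in training, and the summation of per-transition differences into one read score are all assumptions made for illustration.

```python
import math

BASES = "ACGT"
# Assumption: a k-mer never seen in training gets a uniform transition distribution.
UNSEEN_LOGP = math.log(1.0 / len(BASES))


def train_markov_chain(sequences, k, pseudocount=1.0):
    """Estimate log P(next base | preceding k-mer) from reference sequences."""
    counts = {}
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k):
            state, nxt = seq[i:i + k], seq[i + k]
            # Skip windows containing ambiguous characters such as N.
            if all(c in BASES for c in state + nxt):
                counts.setdefault(state, dict.fromkeys(BASES, 0.0))[nxt] += 1
    model = {}
    for state, row in counts.items():
        total = sum(row.values()) + pseudocount * len(BASES)
        model[state] = {b: math.log((row[b] + pseudocount) / total) for b in BASES}
    return model


def score_read(read, fg_model, bg_model, k):
    """Sum, over all transitions in the read, log P_fg minus log P_bg."""
    read = read.upper()
    score = 0.0
    for i in range(len(read) - k):
        state, nxt = read[i:i + k], read[i + k]
        if nxt not in BASES:
            continue
        fg = fg_model.get(state, {}).get(nxt, UNSEEN_LOGP)
        bg = bg_model.get(state, {}).get(nxt, UNSEEN_LOGP)
        score += fg - bg
    return score


def classify(read, fg_model, bg_model, k):
    """Positive score: foreground; negative score: background."""
    score = score_read(read, fg_model, bg_model, k)
    if score > 0:
        return "foreground"
    if score < 0:
        return "background"
    return "undecided"  # the abstract does not say how a score of exactly 0 is handled


# Toy references, chosen only to make the two models easy to tell apart.
K = 4
fg_model = train_markov_chain(["ATATATATATATATATATAT"], K)
bg_model = train_markov_chain(["GCGCGCGCGCGCGCGCGCGC"], K)
print(classify("ATATATATAT", fg_model, bg_model, K))
print(classify("GCGCGCGCGC", fg_model, bg_model, K))
```

In this sketch a single order k is used for both models; how the tool chooses k within the range 4 to 12 is not stated in the abstract.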

We have tested our tool on several datasets,<br />

including Cowpoxvirus sequenced from a<br />

human host. In all cases, our tool was faster<br />

than any competing tool, including Kraken and<br />

mapping via bowtie2, achieving speeds of up to<br />

10 megabases/second using 4 CPUs.<br />

At the same time, we consistently achieved<br />

the best F-Score of all tested tools. Our tool is<br />

developed in Python and Java and available for<br />

download from http://sourceforge.net/projects/rambok/.<br />

We have developed a freely available,<br />

easy-to-use and rapid tool that is both highly sensitive<br />

and specific for the separation of foreground<br />

and background reads in mixed NGS datasets.<br />

We believe that this will be highly useful as an<br />

initial filtering step for anyone analyzing viral<br />

sequences via NGS.<br />
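For readers unfamiliar with the F-score used in the comparison above, the following sketch shows how it can be computed for a read classifier from reads of known origin. Treating foreground reads as the positive class and using beta = 1 (the F1 score) are assumptions; the abstract does not state which variant was reported, and the function name is hypothetical.

```python
def f_score(true_labels, predicted_labels, positive="foreground", beta=1.0):
    """F-beta score: weighted harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy example with five reads of known origin (made-up labels).
truth = ["foreground", "foreground", "foreground", "background", "background"]
calls = ["foreground", "foreground", "background", "foreground", "background"]
print(f_score(truth, calls))
```

Because the score combines precision and recall, a tool cannot rank well by being only specific or only sensitive, which matches the balance the abstract aims for.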

11<br />

A RAPID AND SCALABLE SINGLE<br />

NUCLEOTIDE POLYMORPHISM DISCOVERY<br />

AND VALIDATION PIPELINE FOR OUTBREAK<br />

INVESTIGATION OF BACTERIAL PATHOGENS<br />

B. Rusconi<sup>1</sup>, A. L. Rodriguez<sup>2</sup>, S. S. Koenig<sup>1</sup>,<br />

M. Eppinger<sup>1</sup>;<br />

<sup>1</sup>University of Texas at San Antonio - South<br />

Texas Center For Emerging Infectious Diseases<br />

(STCEID), San Antonio, TX, <sup>2</sup>University of<br />

Texas at San Antonio - Computational Biology<br />

Initiative, San Antonio, TX.<br />

Background: Assuring a timely and effective<br />

response in the control of bacterial outbreaks<br />

is challenging, as discriminatory power becomes<br />

of particular importance to distinguish<br />

outbreak isolates that form tight clonal complexes<br />

with only a few genetic polymorphisms.<br />

The increase of throughput and concomitant<br />

ASM Conference on Rapid Next-Generation Sequencing and Bioinformatic<br />

Pipelines for Enhanced Molecular Epidemiologic Investigation of Pathogens<br />

