Oral Presentation <strong>Abstracts</strong><br />
all of them belonged to B. melitensis biovar<br />
2 str. 63/9. A neighbor-joining tree analysis<br />
identified one of the isolates as an outlier. Furthermore,<br />
variations (SNPs and indels) were<br />
spread all over the genome; but 138 SNPs<br />
were common among the 14 isolates, supporting<br />
the same ancestral origin. In addition,<br />
SNPs (2 - 478) unique to each isolate were<br />
also identified, which divided the B. melitensis<br />
biovar 2 into two major variant groups. In<br />
conclusion, this study suggests that biovar 2 is<br />
the most prevalent biovar of B. melitensis in<br />
Kuwait. Furthermore, at least two major variant<br />
groups exist within biovar 2. Supported<br />
by Kuwait University Research Sector grant<br />
SRUL02/13.<br />
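The split between SNPs common to all isolates and SNPs unique to each isolate can be illustrated with plain set operations. This is a minimal sketch, not the authors' pipeline: the function name `partition_snps`, the `(contig, position, alternate allele)` tuple representation, and the toy isolate names are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

# Hypothetical SNP representation: (contig, position, alternate allele).
Snp = Tuple[str, int, str]


def partition_snps(
    snps_by_isolate: Dict[str, Set[Snp]],
) -> Tuple[Set[Snp], Dict[str, Set[Snp]]]:
    """Return (SNPs shared by every isolate, SNPs private to each isolate)."""
    if not snps_by_isolate:
        return set(), {}

    # SNPs present in all isolates, consistent with a shared ancestor.
    common = set.intersection(*snps_by_isolate.values())

    # SNPs seen in exactly one isolate.
    unique: Dict[str, Set[Snp]] = {}
    for name, snps in snps_by_isolate.items():
        others = set().union(
            *(s for n, s in snps_by_isolate.items() if n != name)
        )
        unique[name] = snps - others
    return common, unique


# Toy data, invented for the example only.
calls = {
    "iso1": {("chr1", 10, "A"), ("chr1", 20, "T")},
    "iso2": {("chr1", 10, "A"), ("chr2", 5, "G")},
    "iso3": {("chr1", 10, "A"), ("chr1", 20, "T")},
}
common, unique = partition_snps(calls)
```

In this toy input, one SNP is shared by all three isolates and only `iso2` carries a private SNP; real variant calls would come from a reference-based SNP caller rather than hand-written sets.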
S6:4
MICROBIAL GENOMIC TAXONOMY AT<br />
GENBANK<br />
S. Federhen;<br />
NCBI, Bethesda, MD.<br />
Incorrectly identified genomes at GenBank<br />
are a problem for users of the data. Some<br />
genomes are submitted with incorrect species<br />
identifications. Others were correctly identified<br />
when they were submitted but should now<br />
be updated based on a subsequent taxonomic<br />
publication, for example the description of a<br />
new species. GenBank has traditionally relied<br />
on the submitters to provide the correct<br />
taxonomic identifications for their sequence<br />
submissions. Two developments have combined<br />
to change this situation in the domain<br />
of microbial genomes. First, the curation of<br />
type material in the NCBI taxonomy database<br />
allows us to flag sequences from type material in the<br />
nucleotide and genome domains of Entrez.<br />
Second, current sequencing technology makes<br />
it fast and easy to generate microbial genomes.<br />
It has been clear for some time that the current<br />
paradigm of species delimitation by 16S rRNA<br />
sequence and DNA-DNA hybridization (DDH)<br />
would eventually be replaced with a model<br />
based on whole genome analysis. We present<br />
a proposal to find and correct misidentified<br />
genomes based on average nucleotide identity<br />
(ANI) from type and proxytype. Sequences<br />
from type are reliably identified (by definition)<br />
once we have verified that they are free from<br />
contamination and are actually from the strain<br />
with which they are annotated. All other identifications<br />
are a matter of opinion, and will be<br />
subject to verification. We have genomes from<br />
type (both finished and WGS) for 4000 species,<br />
including 3500 bacteria. This represents<br />
25% of bacterial species with validly published<br />
names. The other 75% of bacterial species<br />
will generally have an assortment of short sequences<br />
from type in GenBank - at least a 16S<br />
sequence, but often more. These sequences are<br />
used to probe our existing genomes and predict<br />
where the genome from type will appear once<br />
we do get one. In many cases we can designate<br />
a proxy for the missing type from among<br />
the genomes that we do have - we call these<br />
‘proxytype’ genomes. Taken together, these<br />
genomes from type and proxytype represent a<br />
scaffold of reliably identified sequences that<br />
we can use in conjunction with some simple<br />
genome-wide comparison measures to validate<br />
the identifications in our other genomes.<br />
Once we have identified genomes that need<br />
taxonomic updates, we plan to correct the entries,<br />
add a structured comment detailing the<br />
evidence for the update, and notify the submitters<br />
of the change. This represents a significant<br />
change in policy for GenBank - a new genomic<br />
paradigm for validating taxonomic identifications,<br />
some new types of analysis, as well as<br />
a shift in the boundary for database-driven<br />
source feature updates. We convened a workshop<br />
to present the proposal, with representation<br />
from a broad spectrum of the bacterial<br />
taxonomic community (GenBank genomic<br />
taxonomy workshop, 12-13 May 2015). This<br />
group unanimously endorsed our genomic approach<br />
to validating taxonomic identifications<br />
in genomes at GenBank.<br />
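The validation step described above can be sketched as a comparison of a genome's declared species against its best ANI match among genomes from type and proxytype. This is a minimal illustration under stated assumptions, not NCBI's implementation: ANI values are taken as already computed by some external tool, the names `check_identification` and `Verdict` are hypothetical, and the 96% cutoff is an illustrative assumption, since the abstract does not state a threshold.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Assumed species-level ANI cutoff (percent); not specified in the abstract.
SPECIES_ANI_THRESHOLD = 96.0


@dataclass
class Verdict:
    declared: str
    best_match: Optional[str]
    best_ani: float
    status: str


def check_identification(
    declared_species: str,
    ani_to_types: Dict[str, float],
    threshold: float = SPECIES_ANI_THRESHOLD,
) -> Verdict:
    """Compare a declared species name with the best ANI hit among type genomes.

    ani_to_types maps a species name to the ANI (percent) between the query
    genome and that species' genome from type or proxytype.
    """
    if not ani_to_types:
        return Verdict(declared_species, None, 0.0, "unverifiable")

    best_match, best_ani = max(ani_to_types.items(), key=lambda kv: kv[1])

    if best_ani < threshold:
        status = "no type match"   # nothing close enough to decide
    elif best_match == declared_species:
        status = "confirmed"       # declared name agrees with the type genome
    else:
        status = "misidentified"   # candidate for a taxonomic update
    return Verdict(declared_species, best_match, best_ani, status)


# Placeholder species names and ANI values, invented for the example.
ok = check_identification("Species A", {"Species A": 98.7, "Species B": 92.1})
bad = check_identification("Species A", {"Species A": 91.0, "Species B": 98.2})
far = check_identification("Species A", {"Species A": 85.0, "Species B": 84.0})
```

A genome flagged as `misidentified` here would correspond to an entry that needs a taxonomic update; the `no type match` case covers genomes whose species has no sufficiently close genome from type or proxytype yet.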
ASM Conference on Rapid Next-Generation Sequencing and Bioinformatic<br />
Pipelines for Enhanced Molecular Epidemiologic Investigation of Pathogens<br />