Computational tools and Interoperability in Comparative ... - CBS
HIGHLIGHT www.rsc.org/molecularbiosystems | Molecular BioSystems
The genome BLASTatlas—a GeneWiz extension for visualization of whole-genome homology
Peter F. Hallin, Tim T. Binnewies* and David W. Ussery
DOI: 10.1039/b717118h
The development of fast and inexpensive methods for sequencing bacterial genomes
has led to a wealth of data, often with many genomes being sequenced of the same
species or closely related organisms. Thus, there is a need for visualization methods that
will allow easy comparison of many sequenced genomes to a defined reference strain.
The BLASTatlas is one such tool that is useful for mapping and visualizing whole
genome homology of genes and proteins within a reference strain compared to other
strains or species of one or more prokaryotic organisms. We provide examples of
BLASTatlases, including the Clostridium tetani plasmid p88, where homologues for toxin
genes can be easily visualized in other sequenced Clostridium genomes, and for a
Clostridium botulinum genome, compared to 14 other Clostridium genomes. DNA
structural information is also included in the atlas to visualize the DNA chromosomal
context of regions. Additional information can be added to these plots, and as an
example we have added circles showing the probability of the DNA helix opening up
under superhelical tension. The tool is SOAP compliant and WSDL (web services
description language) files are located on our website
(http://www.cbs.dtu.dk/ws/BLASTatlas), where programming examples are available in Perl.
By providing an interoperable method to carry out whole genome visualization of homology,
this service offers bioinformaticians as well as biologists an easy-to-adopt workflow
that can be directly called from the programming language of the user, hence
enabling automation of repeated tasks. This tool can be relevant in many pangenomic
as well as in metagenomic studies, by giving a quick overview of clusters of
insertion sites, genomic islands and overall homology between a reference
sequence and a data set.
Center for Biological Sequence Analysis, Department of Systems Biology, The
Technical University of Denmark, 2800 Lyngby, Denmark.
E-mail: pfh@cbs.dtu.dk; tim@cbs.dtu.dk; dave@cbs.dtu.dk
Background<br />
It has been more than 10 years since the sequencing of the first bacterial genome
(ref. 1, US patent number 6,528,289), and currently sequence data are available for
more than a thousand sequenced genomes.
With so many genomes sequenced, multiple genome sequences now exist for
several bacterial species; for example, at the time of writing, 10 different
Escherichia coli genomes have been fully sequenced and published, and draft
sequences for another 31 genomes are available, adding
Peter F. Hallin was born in Odense, Denmark, and is currently a PhD student at CBS,
DTU. Tim T. Binnewies grew up in Kiel, Germany, and obtained his PhD from the
Technical University of Denmark; he is currently working for Roche Diagnostics AG in
Switzerland. David W. Ussery was born and raised in Springdale, Arkansas. Since 1998,
he has been leader of the Comparative Genomics group at CBS.
This journal is © The Royal Society of Chemistry 2008 | Mol. BioSyst., 2008, 4, 363–371