29.07.2013 Views

Computational tools and Interoperability in Comparative ... - CBS

HIGHLIGHT www.rsc.org/molecularbiosystems | Molecular BioSystems

The genome BLASTatlas: a GeneWiz extension for visualization of whole-genome homology

Peter F. Hallin, Tim T. Binnewies* and David W. Ussery

DOI: 10.1039/b717118h

The development of fast and inexpensive methods for sequencing bacterial genomes has led to a wealth of data, often with many genomes being sequenced of the same species or closely related organisms. Thus, there is a need for visualization methods that will allow easy comparison of many sequenced genomes to a defined reference strain. The BLASTatlas is one such tool that is useful for mapping and visualizing whole genome homology of genes and proteins within a reference strain compared to other strains or species of one or more prokaryotic organisms. We provide examples of BLASTatlases, including the Clostridium tetani plasmid p88, where homologues for toxin genes can be easily visualized in other sequenced Clostridium genomes, and for a Clostridium botulinum genome, compared to 14 other Clostridium genomes. DNA structural information is also included in the atlas to visualize the DNA chromosomal context of regions. Additional information can be added to these plots, and as an example we have added circles showing the probability of the DNA helix opening up under superhelical tension. The tool is SOAP compliant and WSDL (web services description language) files are located on our website (http://www.cbs.dtu.dk/ws/BLASTatlas), where programming examples are available in Perl. By providing an interoperable method to carry out whole genome visualization of homology, this service offers bioinformaticians as well as biologists an easy-to-adopt workflow that can be directly called from the programming language of the user, hence enabling automation of repeated tasks. This tool can be relevant in many pangenomic as well as in metagenomic studies, by giving a quick overview of clusters of insertion sites, genomic islands and overall homology between a reference sequence and a data set.
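Because the service is described by WSDL and accessed over SOAP, a client in any language ultimately sends an XML envelope. As a rough illustration only, the following Python sketch builds a SOAP 1.1 envelope with the standard library. The operation name `runAtlas`, the parameter names, and the namespace `urn:example:blastatlas` are placeholders invented for this sketch and are not taken from the service's WSDL; the real names would have to be read from the WSDL files on the website. No network request is made.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical service namespace: an assumption for this sketch,
# NOT the namespace declared in the real BLASTatlas WSDL.
SERVICE_NS = "urn:example:blastatlas"


def build_soap_request(operation, params):
    """Return a SOAP 1.1 envelope (as a string) for one operation call.

    `operation` and the keys of `params` are illustrative names only.
    """
    ET.register_namespace("soapenv", SOAP_ENV_NS)
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    # The operation element carries one child element per parameter.
    op = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        child = ET.SubElement(op, f"{{{SERVICE_NS}}}{name}")
        child.text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Usage: a made-up request naming a reference and a number of query genomes.
request_xml = build_soap_request("runAtlas", {"reference": "p88", "queries": 14})
print(request_xml)
```

In practice a WSDL-aware client library would generate such envelopes from the published service description, so a hand-built request like this is mainly useful for seeing what travels over the wire.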

Center for Biological Sequence Analysis, Department of Systems Biology, The Technical University of Denmark, 2800 Lyngby, Denmark. E-mail: pfh@cbs.dtu.dk. E-mail: tim@cbs.dtu.dk. E-mail: dave@cbs.dtu.dk

Background<br />

It has been more than 10 years since the sequencing of the first bacterial genome (ref. 1, US patent number 6,528,289), and currently sequence data are available for more than a thousand sequenced genomes. With so many genome sequences, for several bacterial species multiple genome sequences exist; for example, at the time of writing, 10 different Escherichia coli genomes have been fully sequenced and published, and draft sequences for another 31 genomes are available, adding

Peter F. Hallin was born in Odense, Denmark, and is currently a PhD student at CBS, DTU. Tim T. Binnewies grew up in Kiel, Germany, and obtained his PhD from the Technical University of Denmark; he is currently working for Roche Diagnostics AG in Switzerland. David W. Ussery was born and raised in Springdale, Arkansas. Since 1998, he has been leader of the Comparative Genomics group at CBS.

This journal is © The Royal Society of Chemistry 2008. Mol. BioSyst., 2008, 4, 363–371
