Computational tools and Interoperability in Comparative ... - CBS
Tools for Comparison of Bacterial Genomes 74<br />
Table 1<br />
Methods for comparison of bacterial genomes<br />
Method URL References<br />
Length, %GC http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi Wheeler et al. (2007)<br />
Chromosome alignment (ACT) http://www.sanger.ac.uk/Software/ACT/, http://www.webact.org/WebACT/home Carver et al. (2005)<br />
Chromosome alignment (MUMMER) http://mummer.sourceforge.net Kurtz et al. (2004)<br />
Repeats – various http://www.cbs.dtu.dk/services/GenomeAtlas Ussery et al. (2004)<br />
Repeats – tetranucleotides http://www.megx.net/tetra Teeling et al. (2004)<br />
Repeats – short, tandem http://minisatellites.u-psud.fr/GPMS/default.php Denoeud and Vergnaud (2004)<br />
Repeats – VNTRs http://vntr.csie.ntu.edu.tw Chang et al. (2007)<br />
Replication Origins http://www.cbs.dtu.dk/services/GenomeAtlas Worning et al. (2006)<br />
Noncoding RNAs http://rfam.sanger.ac.uk Griffiths-Jones et al. (2005)<br />
rRNAs http://www.cbs.dtu.dk/services/RNAmmer Lagesen et al. (2007)<br />
Genome Atlas http://www.cbs.dtu.dk/services/GenomeAtlas Hallin and Ussery (2004)<br />
BLAST Atlas (zoomable) http://www.cbs.dtu.dk/services/gwBrowser UPDATE! Hallin and Ussery (2004)<br />
"Genome Properties" http://cmr.tigr.org/tigr-scripts/CMR/shared/GenomePropertiesHomePage.cgi Selengut et al. (2007)<br />
expressed as a numerical value, such as length, %GC, number of genes, etc. Such plots show<br />
the spread of the data and are made as follows: the values are sorted and divided into two equal<br />
parts, separated by the median, which is marked as a bar in the middle of the distribution. A<br />
box is drawn to cover the range containing the middle 50% of the data (excluding the first 25%<br />
and the last 25% of the data). The "whiskers" are the hatched lines connecting the lowest (left)<br />
and highest (right) values, with the exception of outlier points, which are shown as individual<br />
dots. Outliers are defined as data points lying more than 1.5 times the range of the box beyond either edge of the box.<br />
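The construction described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used to draw the figures in this chapter: the function name `box_plot_summary` is hypothetical, and the quartiles are taken as the medians of the lower and upper halves of the sorted data, which is one of several common conventions (plotting packages may use slightly different ones).

```python
from statistics import median


def box_plot_summary(values):
    """Return the quantities a box plot displays for a list of numbers."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")

    # Split the sorted data into two equal parts around the median
    # (for an odd count, the middle value belongs to neither half).
    half = n // 2
    lower, upper = data[:half], data[n - half:]

    q1, med, q3 = median(lower), median(data), median(upper)
    box_range = q3 - q1  # the box covers the middle 50% of the data

    # Points more than 1.5 box ranges beyond either box edge are outliers.
    low_fence = q1 - 1.5 * box_range
    high_fence = q3 + 1.5 * box_range
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]

    return {
        "median": med,
        "q1": q1,
        "q3": q3,
        # Whiskers reach the most extreme values that are not outliers.
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Made-up genome lengths in kb, for illustration only.
    print(box_plot_summary([160, 1800, 2100, 2400, 3000, 4600, 5200, 13000]))
```

Applied to one property at a time (length, %GC, number of genes), such a summary gives the numbers behind each box, whisker and outlier dot.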
The base composition of genomes, i.e., their %GC content (or %AT, since the two together make up<br />
100%), can also be compared, as shown in Fig. 1b. The GC content of a genome can range<br />
from 17% in C. ruddii to 75% in Anaeromyxobacter dehalogenans. The smallest genome is<br />
also the most AT rich, and many of the larger genomes are quite GC rich. It is not clear whether there<br />
is a biological force at play behind this correlation, although it has been observed that the<br />
ecological niche an organism occupies roughly correlates with both genome size and GC content<br />
(Foerstner et al., 2005; Musto et al., 2006).<br />
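As a minimal sketch of how the %GC of a sequence can be computed, assuming the genome is available as a plain string of bases: the function names `gc_percent` and `at_percent` are hypothetical, and ambiguous symbols such as N are simply ignored here, which is an assumption rather than a convention stated in the text.

```python
def gc_percent(sequence):
    """Return the %GC of a DNA sequence, counting only A, C, G and T."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at  # ambiguous bases (e.g. N) are left out of the total
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


def at_percent(sequence):
    """%AT is the complement of %GC; the two add up to 100%."""
    return 100.0 - gc_percent(sequence)


if __name__ == "__main__":
    # Toy sequence for illustration only, not a real genome.
    toy = "ATGCGCGTTAAT"
    print(gc_percent(toy), at_percent(toy))
```

For a whole genome, the same calculation would be applied to the concatenated sequence of all replicons, or to each replicon separately.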
In addition to the average GC content for a whole genome, local variation within a given<br />
genome can be examined, and this reveals two general trends for almost all bacterial genomes.<br />
First, on a more global, chromosomal level, a large region flanking the origin of DNA<br />