different applications, such as identifying phage insertion sites and loss of important genetic material. The method even scales down to individual nucleotides or amino acid residues. However, it cannot handle sequences (or parts thereof) that are not found in the reference genome. A good compromise is often to use the largest chromosome of a species as the reference; in addition, it can be useful to rebuild the maps using different reference genomes. Besides this limitation, because all coordinates are mapped back to the reference, the coordinates of the database genomes "get lost": only the best match is displayed, regardless of its chromosomal location in the database genome. Other aspects of genome homology, such as gene synteny, cannot be addressed effectively with this tool, although an additional circle can be used to plot gene order conservation along the chromosome. Currently, we see the BLASTatlas as an intermediate stage in the analysis of many genomes of similar species. Soon there will be a need to compare hundreds or thousands of genome sequences, which makes the development of new methods for comparing such large numbers of genomes ever more important.

Acknowledgements

The authors would like to thank Hans Henrik Stærfeldt for assistance with server-side programs and Kristoffer Rapacki for assistance with web services data types. The work was supported by a grant from the European Union through the EMBRACE Network of Excellence, contract number LSHG-CT-2004-512092, and a grant from the Danish Center for Scientific Computing (DCSC).

Mol. BioSyst., 2008, 4, 363–371. This journal is © The Royal Society of Chemistry 2008