Data integration in microbial genomics ... - Jacobs University
5. Megx.net
ronmental parameters can be calculated. Sequence entries have been
curated to comply with the proposed minimal standards for genomes
and metagenomes (MIGS/MIMS) of the Genomic Standards Consortium.
Access to data is facilitated by Web Services. The updated
megx.net portal offers microbial ecologists greatly enhanced database
content, and new features and tools for data analysis, all of which are
freely accessible from our webpage http://www.megx.net.
5.2 Introduction
Over recent years, molecular biology has undergone a paradigm shift,
moving from a single-experiment science to a high-throughput endeavour.
Although the genomic revolution is rooted in medicine and
biotechnology, it is currently the environmental sector, specifically the
marine sector, which delivers the greatest quantity of data. Marine ecosystems,
covering >70% of the Earth’s surface, host the majority of
biomass and contribute significantly to global organic matter and energy
cycling. Micro-organisms are known to be the ‘gatekeepers’ of
these processes, and insights into their lifestyle and fitness will enhance
our ability to monitor, model and predict future changes.
Recent developments in sequencing technology have made routine sequencing
of whole microbial communities from natural environments
possible. Prominent examples in the marine field are the ongoing
Global Ocean Sampling (GOS) campaign [Venter et al., 2004, Rusch
et al., 2007] and the Gordon and Betty Moore Foundation Marine Microbial
Genome Sequencing Project (http://www.moore.org/microgenome/).
Notably, the GOS resulted in a major input of new sequence data with
unprecedented functional diversity [Yooseph et al., 2007]. The resulting
flood of sequence data available in public databases is an extraordinary
resource with which to explore microbial diversity and metabolic
functions at the molecular level.
These large-scale sequencing projects bring new challenges to data
management and to software tools for assembly, gene prediction and annotation, all fundamental
steps in genomic analysis. Several new dedicated
database resources have recently emerged to tackle the current
need for large-scale metagenomic data management, namely CAM-