11.03.2014

Data integration in microbial genomics ... - Jacobs University

5. Megx.net

ronmental parameters can be calculated. Sequence entries have been curated to comply with the proposed minimal standards for genomes and metagenomes (MIGS/MIMS) of the Genomic Standards Consortium. Access to data is facilitated by Web Services. The updated megx.net portal offers microbial ecologists greatly enhanced database content, and new features and tools for data analysis, all of which are freely accessible from our webpage http://www.megx.net.

5.2 Introduction

Over the last few years, molecular biology has undergone a paradigm shift, moving from a single-experiment science to a high-throughput endeavour. Although the genomic revolution is rooted in medicine and biotechnology, it is currently the environmental sector, specifically the marine sector, that delivers the greatest quantity of data. Marine ecosystems, covering >70% of the Earth’s surface, host the majority of biomass and contribute significantly to global organic matter and energy cycling. Micro-organisms are known to be the ‘gatekeepers’ of these processes, and insights into their lifestyle and fitness will enhance our ability to monitor, model and predict future changes.

Recent developments in sequencing technology have made routine sequencing of whole microbial communities from natural environments possible. Prominent examples in the marine field are the ongoing Global Ocean Sampling (GOS) campaign [Venter et al., 2004, Rusch et al., 2007] and the Gordon and Betty Moore Foundation Marine Microbial Genome Sequencing Project (http://www.moore.org/microgenome/). Notably, the GOS resulted in a major input of new sequence data with unprecedented functional diversity [Yooseph et al., 2007]. The resulting flood of sequence data available in public databases is an extraordinary resource with which to explore microbial diversity and metabolic functions at the molecular level.

These large-scale sequencing projects bring new challenges to data management and to software tools for assembly, gene prediction and annotation, which are fundamental steps in genomic analysis. Several new dedicated database resources have recently emerged to tackle the current need for large-scale metagenomic data management, namely CAM-
