11.03.2014 Views

Data integration in microbial genomics ... - Jacobs University


8. Conclusion and outlook

nities to automatically process the data.

• Standards: The development and implementation of standards for contextual data have the potential to create a wide impact and to facilitate broad use of the data. This has to be done in a community effort and requires the adoption of the standards.

• Knowledge: Finally, efficient strategies for gaining knowledge that make use of the available contextual data need to be developed. An example of this has been given in chapter 6.

8.2 Projects on the horizon<br />

The GOS project started the age of large-scale environmental metagenomic data sets.

There are many follow-up projects with specific focuses on the horizon. The TARA ocean cruise (http://oceans.taraexpeditions.org/) and the Malaspina project (http://www.expedicionmalaspina.es/Malaspina/Main.do) are further exploring the oceans’ ecosystems. The Human Microbiome Project (http://commonfund.nih.gov/hmp/) is investigating the microbial diversity in humans. The Earth Microbiome Project (http://www.earthmicrobiome.org/) aims to comprehensively characterize the global microbial taxonomic and functional diversity.

There is no doubt that these projects will create vast amounts of sequence data. With the MIxS standards in place, it can be expected that contextual data will be recorded and publicly deposited along with these sequence data. Once contextualized, a dense network of data points will be created (schematically depicted as overlapping data clouds about “Organisms”, “Genes” and the “Environment” in figure 8.1). The denser this network becomes and the better these data are integrated through the use of contextual data, the greater the scope of possible analyses will be. In data analyses it will become easier to distinguish signal from noise. More and more statistically meaningful signals will be detected in the ever-growing integrated data set. In the years to come, the spotlight needs to be on the development of methods and strategies to detect these signals. To draw a hypothetical picture about where the contextual and sequence data integration
