Data integration in microbial genomics ... - Jacobs University
8. Conclusion and outlook
nities to automatically process the data.
• Standards: The development and implementation of standards for
contextual data have the potential to create a wide impact and
to facilitate broad use of the data. This has to be done in a
community effort and requires the adoption of the standards.
• Knowledge: Finally, efficient strategies for gaining knowledge that
make use of the available contextual data need to be developed.
An example of this was given in chapter 6.
8.2 Projects on the horizon<br />
The GOS project started the age of large-scale environmental metagenomic
data sets.
Many follow-up projects with specific focuses are on the horizon.
The TARA ocean cruise (http://oceans.taraexpeditions.org/) and the
Malaspina project (http://www.expedicionmalaspina.es/Malaspina/Main.do)
are further exploring the oceans’ ecosystems. The Human Microbiome
Project (http://commonfund.nih.gov/hmp/) is investigating the
microbial diversity in humans. The Earth Microbiome Project
(http://www.earthmicrobiome.org/) aims to comprehensively characterize the
global microbial taxonomic and functional diversity.
There is no doubt that these projects will create vast amounts of sequence
data. With the MIxS standards in place, it can be expected
that contextual data will be recorded and publicly deposited along with
these sequence data. Once these data are contextualized, a dense network
of data points will emerge (schematically depicted as overlapping data
clouds about “Organisms”, “Genes” and the “Environment” in figure
8.1). The denser this network becomes and the better these data are
integrated through the use of contextual data, the greater the
scope of possible analyses will be. In data analyses it will become easier to
distinguish signal from noise. More and more statistically meaningful
signals will be detected in the ever-growing integrated data set. In
the years to come, the spotlight needs to be on the development of
methods and strategies to detect these signals. To draw a hypothetical
picture of where the contextual and sequence data integration