Data integration in microbial genomics ... - Jacobs University
7. Summary and discussion
http://mixs.gensc.org/report/1 is a good entry point to suggest and
discuss improvements 'bottom-up'.
7.3 Megx.net: A unified view on the data<br />
The megx.net platform, with its online Genes Mapserver interface for
accessing georeferenced sequence data, provides users with a unified view
on the contextual and sequence data. The platform exemplifies the immediate
benefits of data integration. Usage statistics show that the
Geographic-Basic Local Alignment Search Tool (BLAST) service in particular
was accessed on average 750 times per month from March 2010 until
March 2011. When users get hits for their search sequence, they can
access a lot of additional information by following the links to SILVA
and ultimately to the INSDC entries. The complete list of hits can also
be downloaded. This shows that the data integration efforts in this
direction are useful to the community. Furthermore, the integration of
environmental parameters such as temperature, nitrate, phosphate,
salinity, silicate, dissolved oxygen, oxygen saturation, oxygen
utilization, chlorophyll and environmental stability in the Genes
Mapserver facilitates sequence data analysis in an environmental
context.
The integration of sequence data and environmental parameters by using
the (x, y, z, t) key-data tuple has proven to be very useful. This
key-data tuple truly is a minimal contextual data set that helps to link
sequence data to many other data sources. The megx.net platform
successfully demonstrates how the interpretability of sequence data is
increased through data integration.
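How such a key-data tuple can link a sequence record to environmental parameters may be sketched as follows. This is a minimal illustration and not the megx.net implementation: the names SpaceTimeKey, grid_cell and annotate, the one-degree grid, the reduction of t to a calendar month, the depth levels and all parameter values are assumptions made up for this example.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Tuple

Cell = Tuple[float, float, float, int]


@dataclass(frozen=True)
class SpaceTimeKey:
    """Hypothetical (x, y, z, t) key attached to a sequence record."""
    lon: float    # x, decimal degrees east
    lat: float    # y, decimal degrees north
    depth: float  # z, metres below the sea surface
    month: int    # t, reduced here to a calendar month (1-12)


def grid_cell(key: SpaceTimeKey, depth_levels: Sequence[float]) -> Cell:
    """Snap a key to the centre of an assumed 1-degree cell and the nearest depth level."""
    cell_lon = math.floor(key.lon) + 0.5
    cell_lat = math.floor(key.lat) + 0.5
    nearest_depth = min(depth_levels, key=lambda d: abs(d - key.depth))
    return (cell_lon, cell_lat, nearest_depth, key.month)


def annotate(
    key: SpaceTimeKey,
    depth_levels: Sequence[float],
    env_grid: Dict[Cell, Dict[str, float]],
) -> Optional[Dict[str, float]]:
    """Return the environmental parameters of the matching cell, or None if there is none."""
    return env_grid.get(grid_cell(key, depth_levels))


# Made-up gridded values, for illustration only (not taken from any real atlas).
DEPTH_LEVELS = (0.0, 10.0, 20.0, 30.0)
ENV_GRID = {
    (8.5, 54.5, 10.0, 3): {"temperature_c": 4.2, "salinity_psu": 33.1},
}

sample_key = SpaceTimeKey(lon=8.3, lat=54.1, depth=12.0, month=3)
print(annotate(sample_key, DEPTH_LEVELS, ENV_GRID))
```

The point of the sketch is that the sequence record carries nothing but the four key values; every further parameter is obtained by a lookup in an external, independently maintained data source.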
The data sources that are currently integrated in megx.net are, however,
only "the tip of the iceberg". The Global Ocean Sampling (GOS)
data set [Rusch et al., 2007] was the first big marine metagenome.
It is reasonable to expect many more, and it will require a lot of effort
to integrate all these upcoming data sets. Also, the World Ocean
Atlas (WOA) 2009 has been released, whereas megx.net is still using the
previous WOA 2005. Further scientific environmental data resources like
PANGAEA (http://www.pangaea.de/) could not only be linked, but in-