11.03.2014 Views

Data integration in microbial genomics ... - Jacobs University


7. Summary and discussion

http://mixs.gensc.org/report/1 is a good entry point to suggest and discuss improvements ‘bottom-up’.

7.3 Megx.net: A unified view on the data<br />

The megx.net platform, with its online Genes Mapserver interface for accessing georeferenced sequence data, provides users with a unified view on the contextual and sequence data. The platform exemplifies the immediate benefits of data integration. Usage statistics show that the Geographic-BLAST (Basic Local Alignment Search Tool) service in particular was accessed on average 750 times per month from March 2010 until March 2011. When users get hits for their search sequence, they can access a lot of additional information by following the links to SILVA and ultimately to the INSDC entries. A download of the complete list of hits is also possible. This shows that the data integration efforts in this direction are useful to the community. Furthermore, the integration of environmental parameters such as temperature, nitrate, phosphate, salinity, silicate, dissolved oxygen, oxygen saturation, oxygen utilization, chlorophyll and environmental stability in the Genes Mapserver facilitates sequence data analysis in an environmental context.

The integration of sequence data and environmental parameters by using the (x, y, z, t) key-data tuple has proven to be very useful. This key-data tuple truly is a minimal contextual data set that helps to link sequence data to many other data sources. The megx.net platform successfully demonstrates how the interpretability of sequence data is increased through data integration.
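As an illustration of how an (x, y, z, t) key can link a sequence sample to gridded environmental data, consider the minimal sketch below. It is not the megx.net implementation: the names `SampleKey`, `TOY_GRID` and `lookup_parameter`, the grid values, the monthly resolution and the depth scaling factor are all assumptions made for this example only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SampleKey:
    """Hypothetical (x, y, z, t) key attached to a sequence sample."""
    lon: float    # x: longitude in decimal degrees
    lat: float    # y: latitude in decimal degrees
    depth: float  # z: depth in metres below the surface
    when: date    # t: sampling date


# Toy monthly grid: (lon, lat, depth, month) -> temperature in deg C.
# The values are invented for illustration and are not taken from any atlas.
TOY_GRID = {
    (8.0, 54.0, 0.0, 3): 4.5,
    (8.0, 54.0, 0.0, 7): 16.0,
    (9.0, 54.0, 0.0, 3): 4.8,
    (8.0, 54.0, 10.0, 3): 4.4,
}


def lookup_parameter(key, grid):
    """Return the value of the grid cell nearest to the key in space,
    restricted to cells for the month of the sampling date.
    Returns None if the grid holds no cell for that month."""
    best_value = None
    best_distance = None
    for (lon, lat, depth, month), value in grid.items():
        if month != key.when.month:
            continue
        # Naive squared distance in degrees; depth is scaled by an
        # arbitrary factor (assumption) so that 100 m counts like 1 degree.
        distance = (
            (lon - key.lon) ** 2
            + (lat - key.lat) ** 2
            + ((depth - key.depth) / 100.0) ** 2
        )
        if best_distance is None or distance < best_distance:
            best_distance = distance
            best_value = value
    return best_value


sample = SampleKey(lon=8.2, lat=54.1, depth=2.0, when=date(2010, 3, 15))
temperature = lookup_parameter(sample, TOY_GRID)
```

The point of the sketch is that the four values of the key are sufficient to join a sample against any data source indexed by position, depth and time. A real system would use proper geodetic distances and interpolation rather than this nearest-cell lookup.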

The data sources currently integrated in megx.net are, however, only “the tip of the iceberg”. The Global Ocean Sampling (GOS) data set [Rusch et al., 2007] was the first large marine metagenome, and it is reasonable to expect many more. Integrating all these upcoming data sets will require a lot of effort. Also, the World Ocean Atlas (WOA) 2009 has been released, while megx.net is still using the previous WOA 2005. Further scientific environmental data resources like PANGAEA (http://www.pangaea.de/) could be not only linked, but in-
