11.03.2014 Views

Data integration in microbial genomics ... - Jacobs University


4. MIMARKS

Doug Wendel, Owen White, Andrew Whiteley, Andreas Wilke, Jennifer R Wortman, Tanya Yatsunenko and Frank Oliver Glöckner

Submitted to: Nature Biotechnology, accepted April 2011

Personal Contribution: Initial talk titled “Survey results: MInimal list of contextual data fields for ENvironmental Sequences (MIENS)” at the 6th meeting of the GSC at the EBI (Hinxton, UK) in October 2008, which was the starting point for the development of this standard, later renamed MIMARKS. Contributed suggestions for improving the data fields during implementation of this standard in the tools MetaBar and CDinFusion.

Relevance: Standards development for contextual data.<br />

4.1 Abstract

Here we present a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences: the minimum information about a marker gene sequence (MIMARKS). We also introduce a system for describing the environment from which a biological sample originates. The ‘environmental packages’ apply to any genome sequence of known origin and can be used in combination with MIMARKS and other GSC checklists. Finally, to establish a unified standard for describing sequence data and to provide a single point of entry for the scientific community to access and learn about GSC checklists, we present the minimum information about any (x) sequence (MIxS). Adoption of MIxS will enhance our ability to analyze natural genetic diversity documented by massive DNA sequencing efforts from myriad ecosystems in our ever-changing biosphere.

4.2 Introduction

Without specific guidelines, most genomic, metagenomic and marker gene sequences in databases are sparsely annotated with the information required to guide data integration, comparative studies and knowl-
