11.03.2014 Views

Data integration in microbial genomics ... - Jacobs University



2. MetaBar

the “MIENS culture (miens c)” and “metagenome (me)” packages, respectively, to record data specific to their study type (Figure 4). PM1 and PM2 receive their genome and metagenome sequences as FASTA files with automatically generated sequence identifiers in the header. The researchers enter these identifiers into the “seqID” field in the acquisition spreadsheet and export the data to a format suitable for submission to INSDC. With this mapping, the contextual datasets can easily be combined with one or more FASTA sequences using a suitable submission tool. The researchers then submit their metadata-enriched sequences from the EXAMPLE project to an INSDC database. MetaBar implements a neat trade-off between universality and specificity. The export functions ensure that the collected data can be publicly stored and shared with the scientific community.
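The seqID mapping described above can be illustrated with a minimal Python sketch. It is not MetaBar's export code or an INSDC submission tool. It assumes the acquisition spreadsheet has been saved as CSV with a “seqID” column, and that the sequence identifier is the first whitespace-delimited token of each FASTA header. The function names, the other column names ("barcode", "depth_m") and the sample identifiers are hypothetical.

```python
import csv
import io


def read_fasta(handle):
    """Yield (seq_id, sequence) pairs from a FASTA-formatted text handle."""
    seq_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            # Assumption: the identifier is the first token of the header line.
            tokens = line[1:].split()
            seq_id, chunks = (tokens[0] if tokens else ""), []
        elif seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def join_context(fasta_handle, sheet_handle, key="seqID"):
    """Pair each FASTA record with the spreadsheet row that shares its identifier."""
    rows = {row[key]: row for row in csv.DictReader(sheet_handle)}
    matched, unmatched = [], []
    for seq_id, sequence in read_fasta(fasta_handle):
        if seq_id in rows:
            matched.append({"seqID": seq_id, "sequence": sequence, "context": rows[seq_id]})
        else:
            # Sequences without a contextual record are reported, not dropped silently.
            unmatched.append(seq_id)
    return matched, unmatched


# Hypothetical example data: two sequences, one contextual record.
fasta = io.StringIO(">seq0001 example\nACGT\nTTGA\n>seq0002\nGGCC\n")
sheet = io.StringIO("seqID,barcode,depth_m\nseq0001,BC-17,10\n")
matched, unmatched = join_context(fasta, sheet)
```

Returning the unmatched identifiers separately makes it easy to spot sequences whose identifier was never entered in the spreadsheet before anything is submitted.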

Comparison of MetaBar and Handlebar<br />

The idea of uniquely identifying samples and storing data about these samples in databases is not new and is widely used in many applications and disciplines. However, tools able to capture the contextual data of environmental samples combined with barcode labeling are rare. To our knowledge, with the exception of MetaBar, the only open source tool using barcoding to identify georeferenced samples from the environment is Handlebar [Booth et al., 2007]. A tabular comparison of the programs’ general features can be found in Table 2.1.

Handlebar, as a lightweight LIMS, not only covers contextual data that are recorded during sampling, but also aims to document subsequent sample processing steps in the laboratory. In this respect, MetaBar is a simplification focusing only on the capture of contextual data in the field. MetaBar does not seek to replace well-established laboratory bookkeeping or professional LIM systems, but rather aims to complement this process to ensure that contextual data are electronically accessible. Nevertheless, users may choose to use the tool as a storage inventory manager or to store intermediate results of sample processing because it is possible to store additional data in the spreadsheets. It is important to note that coupling contextual data with sequence data before submission to the INSDC databases is a unique feature of MetaBar.

The barcodes are, by concept, solely used to link environmental sam-
