
Data integration in microbial genomics ... - Jacobs University



3.3. Results

Figure 3.2: CDinFusion web user interface. The CD are entered into the auto-generated web forms. Details about each parameter are accessible with the “more info” link. These details are retrieved using a web service accessing the GSC database and are therefore always up to date.

sequence obtained from seawater. Subsequently, the web forms were filled with all the CD available for this particular sequence (example Figure 3.2). After generating and downloading the output file, the CD-enriched FASTA file was imported into Sequin version 11.00. CDinFusion inserted the qualifiers specified by GenBank into the header line of the FASTA file. The tool placed the rest of the CD into a tab-delimited structured comment file. This file was loaded into Sequin with the “Advanced Table Readers” option in the “Annotate” menu. The CD appeared in the metadata section between the header and the feature table section. By selecting “Done”, the Sequin file was saved and the complete submission was prepared. The INSDC database entry for this submission can be accessed at [Accession number: JF681370].
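The split between header qualifiers and the structured comment file can be illustrated with a minimal Python sketch. It is not the CDinFusion implementation: the qualifier set `SOURCE_QUALIFIERS`, the function names, the `[key=value]` header layout, the `SeqID` column, and the example values are all assumptions chosen for illustration only.

```python
# Hypothetical subset of keys treated as GenBank source qualifiers (assumption).
SOURCE_QUALIFIERS = {"organism", "strain", "isolation_source", "country"}


def split_contextual_data(cd):
    """Split a CD dict into (header qualifiers, structured comment fields)."""
    qualifiers = {k: v for k, v in cd.items() if k in SOURCE_QUALIFIERS}
    comments = {k: v for k, v in cd.items() if k not in SOURCE_QUALIFIERS}
    return qualifiers, comments


def enrich_header(seq_id, qualifiers):
    """Build a FASTA header line with [key=value] modifiers (assumed layout)."""
    mods = " ".join(f"[{k}={v}]" for k, v in qualifiers.items())
    return f">{seq_id} {mods}".rstrip()


def structured_comment_table(seq_id, comments):
    """Build a tab-delimited table: one header row, one row for the sequence."""
    keys = list(comments)
    header = "\t".join(["SeqID"] + keys)
    row = "\t".join([seq_id] + [str(comments[k]) for k in keys])
    return header + "\n" + row + "\n"


if __name__ == "__main__":
    # Illustrative values only, not the CD of the submitted sequence.
    cd = {
        "organism": "uncultured bacterium",
        "isolation_source": "seawater",
        "depth": "10 m",
    }
    qualifiers, comments = split_contextual_data(cd)
    print(enrich_header("seq1", qualifiers))
    print(structured_comment_table("seq1", comments), end="")
```

Keeping the two outputs separate mirrors the workflow described above: the enriched header travels with the FASTA file, while the remaining parameters are loaded as a table in a second step.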

This use case exemplifies submission scenarios where a single sequence and its CD are to be submitted to the INSDC databases. Single sequences can, for example, be marker genes or genomes that consist of a single sequence or contig.

In the second use case, a permanent draft genome from a Rhodopirellula baltica strain along with its associated CD was prepared for submission. After the 6.9 Mb MultiFASTA file was uploaded, the user was
