Computational tools and Interoperability in Comparative ... - CBS


Table 3 A list of all strains/sample names and their accession numbers used in the metagenomic comparison. The list is sorted by sampling depth

| Source | Size (bp) | Origin | Accession/sample | Ref. | Depth |
| --- | --- | --- | --- | --- | --- |
| P. marinus str. MIT 9515 | 1 704 176 (1906 proteins) | Tropical Pacific | CP000552 | Unpublished | Surface |
| P. marinus str. MIT 9215 | 1 738 790 (1983 proteins) | Equatorial Pacific | CP000825 | Unpublished | Surface |
| P. marinus str. MED4 | 1 657 990 (1936 proteins) | Mediterranean Sea | BX548174 | 21 | 4 m |
| JGI_SMPL_HF10_10-07-02 | 7 482 668 (7842 contigs) | North Pacific Subtropical Gyre | — | 12 | 10 m |
| P. marinus str. NATL1A | 1 864 731 (2193 proteins) | North Atlantic | CP000553 | Unpublished | 30 m |
| P. marinus str. NATL2A | 1 842 899 (2163 proteins) | North Atlantic | CP000095 | Unpublished | 30 m |
| P. marinus str. AS9601 | 1 669 886 (1921 proteins) | Arabian Sea | CP000551 | Unpublished | 50 m |
| JGI_SMPL_HF70_10-07-02 | 10 828 386 (10 999 contigs) | North Pacific Subtropical Gyre | — | 12 | 70 m |
| P. marinus str. MIT 9211 | 1 688 963 (1855 proteins) | Equatorial Pacific | CP000878 | 21 | 83 m |
| P. marinus str. MIT 9301 | 1 641 879 (1907 proteins) | Sargasso Sea | CP000576 | Unpublished | 90 m |
| P. marinus str. MIT 9303 | 2 682 675 (2997 proteins) | Sargasso Sea | CP000554 | Unpublished | 100 m |
| P. marinus str. SS120 | 1 751 080 (1882 proteins) | Sargasso Sea | AE017126 | 22 | 120 m |
| JGI_SMPL_HF130_10-06-02 | 6 091 784 (6812 contigs) | North Pacific Subtropical Gyre | — | 12 | 130 m |
| P. marinus str. MIT 9312 | 1 709 204 (1962 proteins) | Equatorial Pacific | CP000111 | Unpublished | 135 m |
| P. marinus str. MIT 9313 | 2 410 873 (2273 proteins) | Gulf Stream | BX548175 | 21 | 135 m |
| JGI_SMPL_HF200_10-06-02 | 7 829 659 (8286 contigs) | North Pacific Subtropical Gyre | — | 12 | 200 m |
| JGI_SMPL_HF500_10-06-02 | 8 764 642 (9027 contigs) | North Pacific Subtropical Gyre | — | 12 | 500 m |
| JGI_SMPL_HF770_12-21-03 | 11 811 597 (11 479 contigs) | North Pacific Subtropical Gyre | — | 12 | 770 m |
| JGI_SMPL_HF4000_12-21-03 | 11 028 821 (11 229 contigs) | North Pacific Subtropical Gyre | — | 12 | 4000 m |

currently available sequences (2.7 Mb) and was therefore used as reference in this comparison. BLAST hits between the reference and the encoded proteins of all the P. marinus genomes included were generated with the BLASTp algorithm, whereas hits between the reference proteins and the DNA reads of the metagenomic samples were generated using the tBLASTn algorithm. tBLASTn was used to avoid the gene prediction step of the metagenomic samples and to allow a rough estimate of the coding potential of these samples.
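The choice between the two programs can be sketched as follows. This is a minimal illustration and not the authors' Perl/web-service code: it assumes the NCBI BLAST+ command-line programs `blastp` and `tblastn`, and the helper name `build_blast_command`, the file names, and the E-value cutoff are all hypothetical.

```python
from typing import List


def build_blast_command(query_proteins: str, target: str,
                        target_is_protein: bool,
                        evalue: float = 1e-5) -> List[str]:
    """Return a BLAST+ argument list for comparing reference proteins to a target.

    Protein targets (annotated genomes) are searched with blastp.
    Nucleotide targets (unannotated metagenomic reads) are searched with
    tblastn, which translates the target in all six reading frames, so no
    gene prediction is needed beforehand.
    The E-value default is an illustrative assumption, not a value from the paper.
    """
    program = "blastp" if target_is_protein else "tblastn"
    return [
        program,
        "-query", query_proteins,   # FASTA file of reference proteins
        "-subject", target,         # FASTA file of proteins or DNA reads
        "-evalue", str(evalue),
        "-outfmt", "6",             # tabular output, one hit per line
    ]


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the two cases.
    print(" ".join(build_blast_command("reference.faa", "MED4.faa", True)))
    print(" ".join(build_blast_command("reference.faa", "HF10_reads.fna", False)))
```

The function only assembles the argument list; passing it to `subprocess.run` would execute the search, provided BLAST+ is installed.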

All lanes are sorted according to the water depth at which the samples were collected (see Fig. 6). The Perl code for constructing this plot using web services is provided on the service homepage.
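The depth ordering of the lanes can be illustrated with a short sketch. It is not the Perl code referred to above. It assumes depth labels written as in Table 3 ("Surface" or a number followed by "m"), treats "Surface" as 0 m, and uses the hypothetical helper names `depth_in_metres` and `order_lanes`.

```python
import re
from typing import List, Tuple


def depth_in_metres(depth: str) -> float:
    """Convert a Table 3 depth label such as 'Surface' or '135 m' to metres."""
    if depth.strip().lower() == "surface":
        return 0.0  # assumption: 'Surface' is treated as a depth of 0 m
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*m\s*", depth)
    if match is None:
        raise ValueError(f"unrecognised depth label: {depth!r}")
    return float(match.group(1))


def order_lanes(samples: List[Tuple[str, str]]) -> List[str]:
    """Return sample names ordered from shallowest to deepest.

    sorted() is stable, so samples collected at the same depth keep
    their input order.
    """
    ordered = sorted(samples, key=lambda sample: depth_in_metres(sample[1]))
    return [name for name, _ in ordered]


# A few rows taken from Table 3, deliberately out of order.
samples = [
    ("JGI_SMPL_HF4000_12-21-03", "4000 m"),
    ("P. marinus str. MED4", "4 m"),
    ("P. marinus str. MIT 9515", "Surface"),
    ("JGI_SMPL_HF10_10-07-02", "10 m"),
    ("P. marinus str. SS120", "120 m"),
]
lanes = order_lanes(samples)
```

Here the first element of `lanes` is the shallowest sample, which would correspond to the outermost lane in a plot laid out like Fig. 6.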

Discussion<br />

The BLASTatlas method can assist biologists in finding regions along the chromosome which are conserved (or not). This information is useful for several

Fig. 6 BLASTatlas showing fully sequenced Prochlorococcus genomes (green) and the seven ALOHA metagenomic samples (blue). Outermost lanes represent samples closer to the ocean surface.

This journal is © The Royal Society of Chemistry 2008. Mol. BioSyst., 2008, 4, 363–371
