12.07.2015 Views

View - ResearchGate

or contracts put forth by the sequencing centers, and ask permission from the principal investigators before using the data.

4. In the author's experience, gene/protein identifiers are used in a sequence-specific context by the sequencing centers, meaning two or more sequences can possess the same identifiers. Users should ensure that all identifiers are unique, to prevent errors in the results.

5. It is always advisable to use a low E-value cutoff to avoid including possible false positives. For some genomes, especially those of higher eukaryotes, higher E-value cutoffs might be required to capture accurate sequence matches. An empirical survey of published literature reveals that authors usually trust and accept sequence matches with E-values of 10^-5.

6. Identification of correct functional links using fusion sequences is greatly affected by the presence of certain "promiscuous" domains (such as ATP-binding cassette domains). If possible, sequences with such domains should be identified and discarded during analysis, or placed in a separate low-confidence group. Other criteria for enhancing the value of the match can also serve to strengthen results. For instance, accepted matches that contain a high percentage of identical amino acids will certainly increase confidence in the results. Using a stringent BLAST E-value cutoff can also prove beneficial in many instances (see Note 5).
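The checks described in Notes 4 to 6 can be sketched in a few lines of Python. This is a minimal illustration and not the author's pipeline. It assumes sequences are in FASTA format, with the identifier taken as the first whitespace-delimited token of each header line, and that BLAST results are in the standard 12-column tabular layout (percent identity in column 3, E-value in column 11). The function names `duplicate_ids` and `filter_hits`, and the default cutoff of 1e-5, are hypothetical choices made for this example.

```python
from collections import Counter


def duplicate_ids(fasta_lines):
    """Return the sorted identifiers that appear on more than one FASTA header.

    Assumption: the identifier is the first whitespace-delimited token
    after the leading '>' of a header line.
    """
    ids = []
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:  # skip headers that carry no identifier at all
                ids.append(parts[0])
    counts = Counter(ids)
    return sorted(name for name, n in counts.items() if n > 1)


def filter_hits(blast_lines, max_evalue=1e-5, min_identity=0.0):
    """Keep tabular BLAST hits that pass both cutoffs.

    Assumption: 12-column tabular output, so fields[2] is percent
    identity and fields[10] is the E-value.
    """
    kept = []
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        identity = float(fields[2])
        evalue = float(fields[10])
        if evalue <= max_evalue and identity >= min_identity:
            kept.append(fields)
    return kept
```

A typical use would be to run `duplicate_ids` on each genome file before the search and resolve any identifiers it reports, then pass the search output through `filter_hits`. Raising `max_evalue` for a higher eukaryote, or setting `min_identity` to favor matches with many identical residues, corresponds to the trade-offs in Notes 5 and 6.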
