12.07.2015 Views

View - ResearchGate


Estimating Protein Function Using Protein–Protein Relationships

For each protein $i$ and its highest-scoring match in a genome $j$, $E_{ij}$ represents the BLAST expectation value of the match and $p_{ij}$ represents the transformation of $E_{ij}$, such that

$$p_{ij} = -\frac{1}{\log E_{ij}}$$

Introduction of logarithm-induced artifacts during this transformation is avoided by truncating values of $p_{ij} > 1$ to 1. When encoded in a computer program, the result of this procedure is a vector of $N$ $p_{ij}$ values, where $N$ is the number of completely sequenced genomes included in the reference database. Users are free to experiment with other ways of transforming BLAST E-values.

The following is a real-life example of the phylogenetic profile vector for the P. falciparum protein PFA0110w, created by comparing the query sequence against a database of 163 completely sequenced genomes.

>pfalciparum|Pfa3D7|pfal_chr1|PFA0110w|Annotation|Sanger
1.000 1.000 1.000 1.000 1.000 0.072 0.076 0.068 1.000
1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.067
1.000 1.000 0.079 0.070 0.082 1.000 1.000 0.072 0.070
0.070 0.070 0.079 0.086 1.000 1.000 1.000 0.072 0.072
1.000 1.000 1.000 0.084 0.080 0.076 0.069 1.000 1.000
1.000 1.000 0.072 0.061 0.082 0.072 0.072 0.072 0.072
0.059 1.000 0.061 0.084 0.071 0.080 0.080 0.080 0.080
0.072 0.080 0.072 1.000 1.000 0.080 0.070 1.000 0.084
0.084 0.087 0.062 1.000 0.084 0.086 1.000 1.000 0.058
0.062 1.000 1.000 1.000 0.072 0.068 0.069 1.000 1.000
0.054 1.000 1.000 0.076 0.062 1.000 1.000 1.000 1.000
0.086 0.082 1.000 0.079 0.065 1.000 1.000 0.087 1.000
0.076 0.058 1.000 1.000 0.068 0.068 0.068 1.000 1.000
0.068 0.080 0.080 1.000 1.000 0.072 1.000 1.000 1.000
1.000 1.000 1.000 0.079 0.079 0.079 1.000 1.000 1.000
1.000 1.000 0.067 0.079 1.000 1.000 0.076 0.070 1.000
1.000 1.000 0.079 0.061 0.067 1.000 1.000 1.000 1.000
1.000 1.000 0.024 0.072 0.061 1.000 0.069 0.062 0.000*
0.072 0.076

Scale: 1.000 (complete absence) → 0.000 (confident presence)

In this example, underlined scores represent archaeal genomes, whereas scores in italics represent eukaryotic genomes, and “*” indicates the transformed ij
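The transformation and truncation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program. The function names `transform_evalue` and `phylogenetic_profile` are hypothetical, and three choices are assumptions the text does not state: the logarithm is taken as base 10, a missing hit or an E-value of 1 or more is scored 1.000 (absence), and an E-value reported as 0 is scored 0.000.

```python
import math


def transform_evalue(e_value):
    """Map a BLAST E-value to p = -1 / log10(E), truncated to at most 1.

    Assumptions (not stated in the text): base-10 logarithm; no hit (None)
    or E >= 1 is treated as complete absence (1.0); E reported as 0 is
    treated as the limiting value 0.0.
    """
    if e_value is None or e_value >= 1.0:
        return 1.0  # no usable match: log(E) would be undefined or non-negative
    if e_value <= 0.0:
        return 0.0  # E-value rounded to zero: limit of -1/log10(E) as E -> 0
    p = -1.0 / math.log10(e_value)
    return min(p, 1.0)  # truncate values of p > 1 to 1


def phylogenetic_profile(best_evalues):
    """Build the N-element profile from the best E-value found in each genome."""
    return [transform_evalue(e) for e in best_evalues]


# Hypothetical best-hit E-values for one query protein against five genomes.
profile = phylogenetic_profile([1e-14, 0.5, None, 0.0, 3.0])
print(" ".join(f"{p:.3f}" for p in profile))
```

Each list element corresponds to one genome in the reference database, so a real profile would have one entry per genome, in a fixed genome order shared by all query proteins.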
