07.02.2013 Views

Bioinformatics Algorithms: Techniques and Applications

subject to

$$\sum_{D_{ij} \in P_{mn}} x_{ij} \geq 1, \quad \text{where } P_{mn} \in I.$$

To account for noise in the experimental data, a set of linear programs is constructed in a probabilistic fashion: the probability of including an LP constraint in Equation 21.12 equals the probability with which the corresponding protein–protein interaction is assumed to be correct. The LP-score for a domain pair Dij is then averaged over all of these linear programs. An additional randomization experiment is used to compute p-values and to prevent overprediction of interactions between frequently occurring domain pairs. Guimaraes et al. [26] demonstrated that the PE method outperforms the EM and DPEA methods (Fig. 21.11).
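The probabilistic construction described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `(domain_pairs, reliability)` input layout, and the `solve_lp` callable are all hypothetical, and the LP solver itself is left as a parameter. The sketch also assumes that a domain pair missing from a sampled program contributes a score of 0 to the average.

```python
import random
from collections import defaultdict


def sample_constraints(interactions, rng):
    """Keep each interaction's constraint with probability equal to its reliability.

    `interactions` is a list of (domain_pairs, reliability) tuples, where
    `domain_pairs` plays the role of the set of candidate domain pairs for
    one protein pair P_mn (hypothetical layout).
    """
    return [domain_pairs for domain_pairs, reliability in interactions
            if rng.random() < reliability]


def average_lp_scores(interactions, solve_lp, n_programs=100, seed=0):
    """Average the LP score of each domain pair over randomly built programs.

    `solve_lp` is an assumed callable: it takes the sampled constraints and
    returns a dict mapping each domain pair to its x_ij value.
    """
    rng = random.Random(seed)
    totals = defaultdict(float)
    for _ in range(n_programs):
        constraints = sample_constraints(interactions, rng)
        for pair, x in solve_lp(constraints).items():
            totals[pair] += x
    # Assumption: pairs absent from a program count as 0 in the average.
    return {pair: total / n_programs for pair, total in totals.items()}
```

In practice `solve_lp` would wrap a real linear-programming solver that minimizes the sum of the x_ij subject to the sampled constraints; any such solver could be plugged in without changing the sampling and averaging loop.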

GLOSSARY

Coevolution Coordinated evolution. It is generally agreed that proteins that interact with each other or have similar function undergo coordinated evolution.

Gene fusion A pair of genes in one genome is fused together into a single gene in another genome.

HMMer HMMer is a freely distributable implementation of profile HMM (hidden Markov model) software for protein sequence analysis. It uses profile HMMs to do sensitive database searching using statistical descriptions of a sequence family’s consensus.

iPfam iPfam is a resource that describes domain–domain interactions that are observed in PDB crystal structures.

Ortholog Two genes from two different species are said to be orthologs if they evolved directly from a single gene in the last common ancestor.

PDB The Protein Data Bank (PDB) is a central repository for 3D structural data of proteins and nucleic acids. The data, typically obtained by X-ray crystallography or NMR spectroscopy, are submitted by biologists and biochemists from around the world, released into the public domain, and can be accessed for free.

Pfam Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein domains and families.

Phylogenetic profile A phylogenetic profile for a protein is a vector of 1s and 0s representing the presence or absence of that protein in a reference set of organisms.

Distance matrix A matrix containing the evolutionary distances of organisms or proteins in a family.
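As a small illustration of the phylogenetic profile entry above, the following sketch builds the 0/1 vector from a set of organisms in which a protein is present. The function name and the example organisms are hypothetical and chosen only for illustration.

```python
def phylogenetic_profile(present_in, reference_organisms):
    """Return a list of 1s and 0s, one entry per reference organism.

    An entry is 1 if the protein is present in that organism and 0 otherwise.
    The order of the vector follows the order of `reference_organisms`.
    """
    return [1 if organism in present_in else 0
            for organism in reference_organisms]


# Hypothetical reference set and presence data, for illustration only.
reference = ["E. coli", "S. cerevisiae", "H. sapiens"]
profile = phylogenetic_profile({"E. coli", "H. sapiens"}, reference)
```

Two proteins can then be compared position by position, since both profiles are defined over the same ordered reference set.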

ACKNOWLEDGMENTS

This work was funded by the intramural research program of the National Library of Medicine, National Institutes of Health.
