12.07.2015 Views

From Protein Structure to Function with Bioinformatics.pdf




38 L.A. Kelley

(Lathrop 1999) of the program was developed that returns a good approximation quickly, but can return ever better results the longer it is permitted to run, eventually returning a global optimum alignment.

Yet another closely related approach is protein threading by linear programming (Xu et al. 2003). Linear programming is a general technique to solve complex problems given a set of ‘constraints’. In the case of threading, these constraints often involve the idea that one section of a query sequence aligned to a structure implies that downstream parts of the sequence must also be aligned to downstream parts of the structure (and the equivalent constraint on upstream parts). These types of constraint, being logical rather than continuous variables, can be cast as an integer programming problem. Such problems are often solved by ‘relaxing’ the integer problem to a continuous linear programming problem followed by the use of a branch-and-bound method.

This overview of some of the techniques that have been applied to threading illustrates how a diverse set of tools developed in physics, mathematics and computer science have all been focused on this one difficult problem over the last 15–20 years, yet no single method has demonstrated dominance in the field. Despite having very powerful techniques to optimally align a sequence to a structure given an energy function, it appears that it is the energy function itself where most of the weakness lies in terms of practical performance.

2.3 Remote Homology Detection Without Threading

The threading approach was originally devised to tackle the issue of detecting the compatibility of a sequence with a known structure. The finite number of folds in nature indicated that, given a decent energy function and alignment algorithm, such approaches would succeed where sequence-based approaches would fail. The sequence-based approaches require there to be some detectable sequence homology between a sequence of interest and a known structure, whereas the threading techniques in theory required none.

The early days of searching a database of sequences for potential homologues were dominated by BLAST and other similar approaches. They were based on the idea of using a generic scoring function such as the BLOSUM or PAM matrices which provide a probability of a mutational transition between one amino acid type and another based on a set of confidently aligned blocks of similar protein sequences. These were simple 20 × 20 lookup tables that gave a score for a match between any pair of amino acids in an alignment. Thus, in general, good scores would be awarded for aligning a hydrophobic residue to another hydrophobic residue (leucine aligned to valine for example) and poor scores were awarded for matching dissimilar residues (glutamate and tryptophan for example). Combining this scoring function with a standard dynamic programming algorithm permitted modest performance in detecting homologous relationships. If one were to search a database of sequences with known structures, and subsequently build a model
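The combination of a pairwise lookup score with a standard dynamic programming algorithm, as described above, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `toy_score` is a hypothetical stand-in for a real 20 × 20 matrix (its values are not BLOSUM or PAM entries), the hydrophobic set is a simplification, and the linear gap penalty of -4 is arbitrary. The recurrence is the textbook global alignment (Needleman-Wunsch) form, not necessarily the variant used by any particular search program.

```python
# Illustrative only: toy scores, NOT real BLOSUM/PAM values.
HYDROPHOBIC = set("AVLIMF")  # simplified, hypothetical grouping


def toy_score(a: str, b: str) -> int:
    """Score one aligned residue pair with a toy lookup rule."""
    if a == b:
        return 4   # identical residues
    if a in HYDROPHOBIC and b in HYDROPHOBIC:
        return 1   # similar residues, e.g. leucine (L) vs valine (V)
    return -3      # dissimilar residues, e.g. glutamate (E) vs tryptophan (W)


def global_alignment_score(seq1: str, seq2: str, gap: int = -4) -> int:
    """Best global alignment score via dynamic programming (linear gap cost)."""
    rows, cols = len(seq1) + 1, len(seq2) + 1
    dp = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        dp[i][0] = i * gap
    for j in range(1, cols):
        dp[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            dp[i][j] = max(
                dp[i - 1][j - 1] + toy_score(seq1[i - 1], seq2[j - 1]),  # pair residues
                dp[i - 1][j] + gap,  # gap in seq2
                dp[i][j - 1] + gap,  # gap in seq1
            )
    return dp[-1][-1]
```

With these toy values, `global_alignment_score("LLE", "LVE")` rewards the leucine-valine pairing rather than opening a gap. In practice the rule in `toy_score` would be replaced by a lookup into a real substitution matrix.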
