12.07.2015 Views

From Protein Structure to Function with Bioinformatics.pdf


34 L.A. Kelley

$$f_k^{ij}(l) = \frac{f(i,j,k,l)}{M_{ijk}}$$

and $f_k^{xx}(l)$ is the relative frequency of all pairs separated by $k$ residues in distance class $l$:

$$f_k^{xx}(l) = \frac{\sum_{i}^{R}\sum_{j}^{R} f(i,j,k,l)}{\sum_{i}^{R}\sum_{j}^{R}\sum_{k}^{N} f(i,j,k,l)}$$

Here R is the number of residue types and N is the number of sequence separation classes. The pair-potential for a given protein is the sum of the energies for all residue-residue contacts within the given separation parameters.

There are many sources of variation in the detail of how such potentials are calculated. For example, a force-field may simply be based on the distances between alpha carbons of the backbone, which may suffice for relatively crude recognition of the gross topology of a structure. One could add more atom-based interaction sites, possibly to better account for hydrogen-bonding. The framework of the Boltzmann relation is not limited to distances. One may add in angular dependence, or the packing angle between beta-strands. A force-field may have different contributions from residues separated by different distances along the sequence: i.e. one may use different functions for residues close in sequence (i,i + 3) and those further apart (i,i + n; n > 10) as mentioned above.

Clearly the power of a threading approach is essentially encapsulated in the power of the energy function. As a result much past and current research focuses on the development of ever more elaborate, and hopefully more powerful, empirical potentials.

2.2.2 Finding an Alignment

Given a potential function that can assign a score to a given protein model structure, one is faced with a difficult task: finding the alignment of a sequence onto a structure that minimises (or maximises) that potential function. If one were to ignore the fact that insertions and deletions in sequence occur in evolution, then one could implement a ‘gapless threading’ approach.
This involves simply sliding the sequence through the structure, sampling every gapless alignment and assigning it a score. This has the advantage of being computationally fast, yet suffers severely from disallowing gaps. An insertion or deletion of just one residue would cause a frameshift that would prevent the detection of an otherwise excellent alignment. So permitting gaps is crucial to take into account the nature of evolutionary variation. However, it is the allowance of these same gaps that turns a trivial problem into an NP-hard problem for which no fast (polynomial-time) solution is known. Exhaustive enumeration of all possible gapped alignments of a sequence and
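
The gapless sliding procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the chapter's implementation: the names `toy_pair_energy`, `score_placement` and `gapless_thread`, the hydrophobic residue set, the 8 Å cutoff and the minimum sequence separation of 3 are all hypothetical choices made for the example. A real method would substitute an empirical pair potential of the kind described earlier for the toy energy function.

```python
import math
from typing import Callable, List, Tuple

Coord = Tuple[float, float, float]
EnergyFn = Callable[[str, str, float], float]

# Assumption: a crude hydrophobic-contact reward standing in for an
# empirical, distance-dependent pair potential.
HYDROPHOBIC = set("AVLIMFWC")


def toy_pair_energy(res_a: str, res_b: str, distance: float) -> float:
    """Return -1.0 for two hydrophobic residues closer than 8 A, else 0.0."""
    if distance < 8.0 and res_a in HYDROPHOBIC and res_b in HYDROPHOBIC:
        return -1.0
    return 0.0


def score_placement(sequence: str, coords: List[Coord], offset: int,
                    energy: EnergyFn, min_sep: int = 3) -> float:
    """Sum pair energies when sequence[a] sits on template position offset + a."""
    total = 0.0
    n = len(sequence)
    for a in range(n):
        # Only pairs at least min_sep apart along the sequence contribute.
        for b in range(a + min_sep, n):
            d = math.dist(coords[offset + a], coords[offset + b])
            total += energy(sequence[a], sequence[b], d)
    return total


def gapless_thread(sequence: str, coords: List[Coord],
                   energy: EnergyFn = toy_pair_energy) -> Tuple[float, int]:
    """Slide the sequence along the template with no gaps.

    Returns (lowest energy, offset); ties go to the smaller offset.
    """
    n_offsets = len(coords) - len(sequence) + 1
    if n_offsets < 1:
        raise ValueError("sequence is longer than the template")
    scored = [(score_placement(sequence, coords, off, energy), off)
              for off in range(n_offsets)]
    return min(scored)


# Toy template of six positions (e.g. alpha-carbon coordinates in angstroms).
# Only template positions 1 and 4 are within 8 A of each other.
template = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0), (40.0, 0.0, 0.0),
            (60.0, 0.0, 0.0), (25.0, 0.0, 0.0), (80.0, 0.0, 0.0)]
best_energy, best_offset = gapless_thread("LAAL", template)
print(best_energy, best_offset)
```

With a template of L positions and a sequence of n residues there are only L - n + 1 placements to score, each costing on the order of n² pair evaluations, which is why the gapless search is fast. The sketch also shows the weakness noted above: the mapping between sequence and template positions is rigid, so a single inserted or deleted residue shifts every later residue onto the wrong position.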
