12.07.2015 Views

From Protein Structure to Function with Bioinformatics.pdf

2 Fold Recognition

2.4 Alignment Accuracy, Model Quality and Statistical Significance

Fold recognition can be divided into two problems (1) detection of an appropriate template and (2) alignment to that template. Clearly any useful method for template-based protein structure prediction must at minimum be able to detect appropriate templates. However, regardless of the quality of template detection, the quality of the alignment completely determines the quality of the resulting model. Errors in this alignment, despite using a good template, will still lead to poor models.

Until this point, with the exception of some of the SVM classifiers discussed in Section 3.3, we have assumed that a system capable of accurate detection of templates will also generate an accurate alignment. Although often a reasonable assumption, there will be many cases where this does not hold. Firstly, most of the systems described essentially rank templates by some alignment score. In other words, there will always be a ‘top scoring model’. Simply because one alignment is better scoring than all the others does not imply the alignment is error-free. Secondly, most of these methods are attempting to detect extremely remote signals of homology. As such, they may simply detect certain conserved motifs, or patches of similarity, interrupted by long stretches of sequence where similarity is undetectable and thus the alignment is essentially ‘noise’. This in turn leads to large errors in the resulting three-dimensional model.

For these reasons many groups have investigated techniques to improve alignment accuracy and predict the quality of the resulting model.
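
To make the notion of alignment accuracy concrete, the sketch below scores a predicted query-template alignment against a trusted reference alignment (for example, one derived from structural superposition). The measure used here, the fraction of reference residue pairs that the prediction reproduces, is one common convention and is an assumption of this example rather than a definition taken from the text; the function names and the toy sequences are likewise hypothetical.

```python
def aligned_pairs(query_row, template_row):
    """Return the set of (query_index, template_index) residue pairs.

    Both rows are gapped strings of equal length, with '-' marking a gap.
    Indices count residues (not alignment columns), starting at 0.
    """
    if len(query_row) != len(template_row):
        raise ValueError("alignment rows must have equal length")
    pairs = set()
    qi = ti = 0
    for q, t in zip(query_row, template_row):
        if q != "-" and t != "-":
            pairs.add((qi, ti))
        if q != "-":
            qi += 1
        if t != "-":
            ti += 1
    return pairs


def alignment_accuracy(predicted, reference):
    """Fraction of reference residue pairs reproduced by the prediction.

    Assumed measure for illustration; other definitions exist.
    """
    pred = aligned_pairs(*predicted)
    ref = aligned_pairs(*reference)
    if not ref:
        raise ValueError("reference alignment has no aligned pairs")
    return len(pred & ref) / len(ref)


# Toy example (invented sequences): the reference places one gap in each
# sequence, while the predicted alignment is ungapped.
reference = ("ACDEFG-HIK",
             "ACD-FGWHIK")
predicted = ("ACDEFGHIK",
             "ACDFGWHIK")

print(alignment_accuracy(predicted, reference))  # 0.75
```

In this toy case the ungapped prediction would still be the top-scoring (indeed the only) alignment offered, yet it misplaces the residues lying between the two reference gaps, so two of the eight reference pairs are lost.
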
There are three ways in which one can tackle the problem of alignment accuracy: (1) improve the algorithm for alignment generation directly (2) generate many alignments and develop a system to pick the best one and (3) build 3D models from many alignments and assess the resulting models.

2.4.1 Algorithms for Alignment Generation and Assessment

We have already seen how using evolutionary information in the form of profiles or HMMs and predicted secondary structure improves homologue detection and this is usually accompanied by a similar improvement in alignment accuracy (Elofsson 2002). Zachariah et al. (2005) demonstrated that a more elaborate model for gap initiation and extension during alignment by dynamic programming did not improve homologue detection, but did increase alignment accuracy significantly.

A successful approach in recent CASPs has been that of Venclovas and Margelevicius (2005). In this procedure, a set of sequences that bridge sequence space between the query sequence and template(s) are used to initiate additional PSI-BLAST searches against the nonredundant sequence database. Query–template sequence alignments are then extracted from search results and their
