12.07.2015 Views

From Protein Structure to Function with Bioinformatics.pdf


2.3.4 Consensus Approaches

Recent CASP experiments have demonstrated the dominance of consensus methods that combine the results of a number of fold recognition servers into a single prediction. These ‘meta-servers’ clearly outperform many of the individual methods they are built from, such as those described above: sequence-profile alignment, HMMs, profile-profile alignment and threading.

Some of the most popular techniques for combining multiple predictions in meta-servers include Pcons (Wallner and Elofsson 2005), 3D-Shotgun (Fischer 2003) and 3D-Jury (Ginalski et al. 2003). The simplest of these, yet still very powerful, is the 3D-Jury method. This involves comparing the three-dimensional models built by the individual servers by structurally aligning them. It then re-ranks the models based on their structural similarity to all the others in the pool. Thus if several relatively independent fold prediction systems have chosen similar templates and generated similar alignments, then these will be ranked more highly than more unusual models. The Pcons method combines this 3D-Jury methodology with a neural network trained to discriminate between models with and without protein-like features (similar to an empirical energy function used in threading). Finally, 3D-Shotgun calculates a 3D-Jury score for each residue in each model, and assembles a new model out of the most common or ‘popular’ pieces. This can lead to severe fragmentation in the model and, although some work has been geared towards repairing such flaws, the problem remains.

An extensive investigation of the source of the power of meta-servers was performed by Bennett-Lovsey et al. (2008), which concluded that a large part of this improvement is not improved recall of remote homologues per se, but instead an improvement in precision, i.e. the elimination of false positives. This occurs because when one combines many diverse structure prediction systems, the likelihood that they all make the same mistake is far smaller than the likelihood that they agree. Any peculiarity in a sequence that may cause one or two prediction methods to fail is unlikely to have the same effect on the majority of methods. Combining classifiers or predictive algorithms in ensembles to improve performance is an established research area shared between statistical pattern recognition and machine learning (Jain et al. 2000; Kuncheva and Whitaker 2003). Unfortunately, even after several decades of research, the theoretical groundwork of ensemble theory does not yet provide us with a recipe for creating optimal ensembles. As a result, the work on meta-servers is generally founded on trial and error development.

2.3.5 Traversing the Homology Network

We have seen with PSI-BLAST and intermediate sequence searching how combining a set of homologous relationships can lead to a rich and powerful search technique. Recent work has begun to explore this network of homologous relationships
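
As an illustration of the consensus re-ranking idea described for 3D-Jury in Section 2.3.4, the following minimal Python sketch ranks each model by its mean similarity to every other model in the pool. It is not the published 3D-Jury scoring scheme: the function name `consensus_rank`, the server names and the similarity values are all hypothetical, and the pairwise scores are assumed to come from some structural alignment step that is not shown here.

```python
from typing import Dict, List, Tuple


def consensus_rank(
    similarity: Dict[Tuple[str, str], float],
    models: List[str],
) -> List[Tuple[str, float]]:
    """Rank models by their mean similarity to all other models in the pool."""

    def sim(a: str, b: str) -> float:
        # Similarity is assumed symmetric, so accept either key order.
        # Missing pairs are treated as having no similarity (assumption).
        return similarity.get((a, b), similarity.get((b, a), 0.0))

    scores = []
    for model in models:
        others = [other for other in models if other != model]
        mean = sum(sim(model, o) for o in others) / len(others) if others else 0.0
        scores.append((model, mean))

    # Models that agree with many others rise to the top.
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical pool: three servers broadly agree, one produces an outlier.
models = ["server_A", "server_B", "server_C", "server_D"]
similarity = {
    ("server_A", "server_B"): 0.8,
    ("server_A", "server_C"): 0.7,
    ("server_B", "server_C"): 0.9,
    ("server_A", "server_D"): 0.1,
    ("server_B", "server_D"): 0.2,
    ("server_C", "server_D"): 0.1,
}

ranking = consensus_rank(similarity, models)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy pool the three mutually similar models are ranked above the outlier, which mirrors the argument in the text: a model supported by several relatively independent servers is preferred over an unusual one.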
