12.07.2015 Views

From Protein Structure to Function with Bioinformatics.pdf


10 Integrated Servers for Structure-Informed Function Prediction

Fig. 10.1 Schematic diagram of the sequence- and structure-based methods applied to any protein 3D structure submitted to the ProKnow function prediction server. The sequence-based methods are PSI-BLAST (Altschul et al. 1997) and PROSITE (Hulo et al. 2004). The structure-based methods are the Dali fold search (Holm and Sander 1998) and the RIGOR structural motif search (Kleywegt 1999). The last two methods use DIP, the Database of Interacting Proteins (Xenarios et al. 2002), and the Prolinks Database (Bowers et al. 2004) to identify any interesting functional linkages for each of the PSI-BLAST hits. The Gene Ontology (GO) functional annotations are obtained from all the results and combined using Bayesian weighting to arrive at a set of functional predictions and associated reliability estimates.

a cheat as ProKnow requires the user to first run the Dali fold-matching program (Holm and Sander 1998) before uploading the results, in FSSP format, to ProKnow. The matches obtained from Dali provide the first set of clues used by ProKnow about the protein's function.

Curiously, if only the sequence is submitted to ProKnow, then ProKnow does all the work itself: it identifies a fold compatible with the sequence and uses that for clues about function. To identify the most likely fold it uses the UCLA fold-recognition server. This has a several-step strategy. First it tries to match the sequence to PDB entries using BLAST. It then tries an iterative PSI-BLAST. If both fail, it uses a prediction of the protein's secondary structure, obtained from the PSIPRED server at University College London (Bryson et al. 2005).
This prediction is fed into the Sequence Derived Properties (SDP) program (Fischer and Eisenberg 1997), which tries to find a suitable fold. Finally, if even this gives nothing, a method called Directional Atomic Solvation EnergY (DASEY) (Mallick et al. 2002) is applied.

One thing that needs to be remembered when relying on the results of any fold-recognition, or threading, method is that these methods are something of a Black Art, and require careful interpretation. Occasionally they can give approximately the right answer, usually for small, single-domain proteins where a topologically near-correct model is obtained (Moult 2005); but, in general, accuracy varies widely. If the sequence is a very long one the chances of success are even smaller
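
The several-step fold-recognition strategy described above (BLAST against the PDB, then PSI-BLAST, then PSIPRED followed by SDP, and finally DASEY) is in essence a fallback cascade: each method is tried only if the previous ones found nothing. The Python sketch below illustrates that control flow only. It is not the UCLA server's code; the function names, the fold label and the toy rule inside `psipred_then_sdp` are all hypothetical stand-ins, since the real programs are external tools whose interfaces are not described here.

```python
from typing import Callable, Optional, Sequence, Tuple

# A step takes a protein sequence and returns a fold identifier,
# or None when it finds no convincing match.
FoldStep = Callable[[str], Optional[str]]


def recognise_fold(
    sequence: str, steps: Sequence[Tuple[str, FoldStep]]
) -> Optional[Tuple[str, str]]:
    """Try each step in order; return (step name, fold) for the first hit."""
    for name, step in steps:
        fold = step(sequence)
        if fold is not None:
            return name, fold
    return None  # every step failed


# Hypothetical stand-ins for the real programs. Each one simply pretends
# to succeed or fail so that the cascade can be exercised.
def blast_vs_pdb(sequence: str) -> Optional[str]:
    return None  # pretend: no PDB entry matched


def psi_blast(sequence: str) -> Optional[str]:
    return None  # pretend: iterative search also found nothing


def psipred_then_sdp(sequence: str) -> Optional[str]:
    # Stands for: predict secondary structure, then pass it to SDP.
    # The length cut-off is an arbitrary toy rule, not a real criterion.
    return "example-fold" if len(sequence) < 300 else None


def dasey(sequence: str) -> Optional[str]:
    return None  # pretend: last-resort method found nothing


CASCADE = [
    ("BLAST", blast_vs_pdb),
    ("PSI-BLAST", psi_blast),
    ("PSIPRED + SDP", psipred_then_sdp),
    ("DASEY", dasey),
]

print(recognise_fold("MKTAYIAKQR", CASCADE))
# prints ('PSIPRED + SDP', 'example-fold')
```

Returning the name of the step along with the fold keeps track of which method produced the answer, which matters given how widely the reliability of these methods varies.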
