12.07.2015 Views

From Protein Structure to Function with Bioinformatics.pdf



10 Integrated Servers for Structure-Informed Function Prediction 259

In general, ProKnow performs quite well. Its authors tested it on a nonredundant data set of proteins of known function and found that around 70% of the functional annotations were correct (Pal and Eisenberg 2005). Less specific predictions (e.g. hydrolase) tended to be more accurate than more specific ones (e.g. leucyl aminopeptidase). The prediction accuracy has been increased slightly by the recent inclusion of Prolinks, not present in the original version, and should improve more as the coverage of Prolinks increases.

10.3 ProFunc

The second integrated server described here is ProFunc (Laskowski et al. 2005b) at the European Bioinformatics Institute (EBI), http://www.ebi.ac.uk/profunc, developed as part of a collaboration with the Midwest Center for Structural Genomics (MCSG). ProFunc allows the user either to upload a protein structural model or to enter the PDB code of a structure already in the Protein Data Bank. In the latter case, if ProFunc has already been run on that PDB entry, the results will be displayed immediately.

When ProFunc runs, it applies a number of sequence- and structure-based methods to the structure, as shown in Fig. 10.4. A processor farm is used, with different methods sent to different processors. Several of the compute-intensive methods are themselves subdivided to run in parallel on multiple processors. Processing is usually complete within about an hour.

The results of each method are then summarized, with further details available for each method. However, the results are not combined in any sophisticated way, as is done in ProKnow.
Rather, there is a summary at the top of the results page showing the most commonly occurring GO terms and protein names, but this is meant only as a quick guide. The primary aim of the server is to present the results in an easily accessible format to enable researchers to interpret them, using their own expertise and knowledge of the protein in question.

Now, although ProFunc does apply a number of sequence-based methods, using several well-known search techniques such as FASTA and InterProScan (Quevillon et al. 2005), we will only describe the structure-based methods here as most of them are unique to this server.

10.3.1 ProFunc’s Structure-Based Methods

10.3.1.1 Fold-Matching

The first of the structure-based methods is a search against a representative subset of the PDB for structures with the same, or similar, overall fold as the target. The SSM (Secondary Structure Matching) program is used (Krissinel and Henrick 2004). It performs a fast graph-matching procedure to compare the secondary struc-
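
Returning to the results summary described earlier in this section: listing the most commonly occurring GO terms across the individual methods amounts to a simple tally. The following is a minimal sketch in Python, assuming each method's hits have already been reduced to a list of GO term identifiers. The `summarize_go_terms` helper, the method names, the example hits, and the choice to count each term once per method are all hypothetical illustrations, not ProFunc's actual code or output.

```python
from collections import Counter


def summarize_go_terms(hits_by_method, top_n=3):
    """Return the GO terms reported by the most methods (hypothetical helper)."""
    counts = Counter()
    for terms in hits_by_method.values():
        # Assumption: count each term once per method, so that one
        # verbose method cannot dominate the tally.
        counts.update(set(terms))
    # Sort by descending count, then by term ID for a deterministic order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical hits from three methods (illustrative only).
# GO:0016787 = hydrolase activity, GO:0008233 = peptidase activity,
# GO:0005737 = cytoplasm.
hits = {
    "sequence_search": ["GO:0016787", "GO:0008233"],
    "fold_match": ["GO:0016787"],
    "template_match": ["GO:0016787", "GO:0008233", "GO:0005737"],
}

summary = summarize_go_terms(hits)
for term, n_methods in summary:
    print(f"{term}: reported by {n_methods} method(s)")
```

A tally like this only ranks terms by how often they recur. It does not weigh the reliability of each method, which is consistent with the text's point that the summary is a quick guide rather than a combined prediction.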
