12.07.2015 Views

From Protein Structure to Function with Bioinformatics.pdf


11 Function Predictions of Structural Genomics Results

determine how successful the structure-based methods in ProFunc were at predicting function. These 93 proteins of known function were submitted to the ProFunc server, and the top-scoring matches from each method were retrieved and stored. The results were then backdated to the deposition date for each structure to ensure that what was being measured was how successful the server would have been had it been fully operational from the start. The resultant top hit for each method was then manually compared with the known function for each protein and a judgement made as to whether the prediction was correct.

The results from the study indicated that, of the methods available as part of the ProFunc server, the fold recognition and “reverse template” approaches were the most successful, with approximately 60% of the known functions identified correctly. Detailed investigation revealed that both of these methods often identify the same function by matching to the same protein, but cases could be found where one method succeeded where the other failed. This is because fold matching looks at the global similarity between the compared proteins, whereas the “reverse template” approach is a very local comparison. One major drawback identified in the study was its inability to address the question of how structure-based approaches compare against sequence-based approaches. However, this is a generic problem which has not been adequately addressed in the literature due to the immense difficulties involved in accurately rolling back the sequence databases, as well as the patterns and profiles derived from them, to a particular date.

In addition to the general comparison of methods, Watson et al. (2007) presented some specific examples of function predictions, some of which were subsequently verified. An example of successful structure-based function prediction was the BioH protein from Escherichia coli (Sanishvili et al. 2003). This protein was known to be involved in biotin synthesis but no biochemical function had been assigned to it. Analysis of the structure using ProFunc returned a highly significant match (with an RMSD of 0.28 Å) to an enzyme active-site template for the Ser-His-Asp catalytic triad of the lipases. Fold comparison using DALI indicated structural similarity to a large number of proteins with a variety of enzymatic functions, although the sequence identity of these hits was low, ranging from 15–25%. The closest matches included a bromoperoxidase (EC 1.11.1.10), an aminopeptidase (EC 3.4.11.5), two epoxide hydrolases (EC 3.3.2.3), two haloalkane dehalogenases (EC 3.8.1.5), and a lyase (EC 4.2.1.39). Only through extensive manual analysis of these enzymes and a literature review would it have become obvious that each of these contains a Ser-His-Asp catalytic triad in its active site. The enzyme active-site template search identified the presence of the catalytic triad instantly. Experimental characterisation of this protein revealed it to be a novel carboxylesterase acting on short acyl chain substrates (Sanishvili et al. 2003).

A further example illustrating how functional clues can be derived from the structure involves a hypothetical protein (IsdG) from Staphylococcus aureus. Sequence-based analysis by methods in ProFunc suggested a variety of functions, including antibiotic biosynthesis monooxygenase, cysteine peptidase, oxidoreductase, methyltransferase, epimerase, transportation, possible RNA binding, and others. When the structure was examined using the MSDfold/SSM service, the top
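
The BioH template match above is reported as an RMSD of 0.28 Å. As a reminder of what that figure measures, the following is a minimal sketch of an RMSD calculation between two sets of paired atoms. It assumes the two coordinate sets have already been superposed and that atoms are paired by position in the list; the function name and the coordinates are hypothetical and are not taken from ProFunc, whose own matching procedure is not described in this passage.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) points.

    Assumes the sets are already superposed and paired by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between each pair of atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates in Å for three paired atoms (illustration only)
template = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
query = [(0.1, 0.0, 0.0), (3.0, 0.2, 0.0), (0.0, 4.0, 0.2)]

value = rmsd(template, query)
print(f"RMSD = {value:.2f} Å")
```

A low value indicates that the matched atoms sit in nearly the same relative positions as those of the template, which is why a small RMSD over a catalytic triad is treated as a strong local match even when overall sequence identity is low.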
