12.07.2015 Views

From Protein Structure to Function with Bioinformatics.pdf


254 R.A. Laskowski

ucla.edu, and ProFunc from the European Bioinformatics Institute (EBI) at http://www.ebi.ac.uk/profunc. Both use sequence-based as well as structure-based predictions and are largely automated: one uploads a PDB-format file and waits patiently for the results.

To illustrate the two methods we use a fairly recently solved 3D structure as an example. It is the structure of a putative acetyltransferase from Vibrio cholerae solved in 2005 by the Midwest Center for Structural Genomics (MCSG). It was released by the PDB as entry 2fck on 28 February 2006 (Cuff et al. 2007). The function of this protein was only tentatively known at the time its structure was being solved; its sequence had over 50% identity to a ribosomal-protein-serine acetyltransferase and contained several sequence motifs characteristic of acetyltransferase activity. Once its structure was known, these tentative functional assignments were greatly strengthened as it revealed strong structural similarities, both global and local, to other – distantly related – acetyltransferases. The strongest similarities occurred at the putative binding site where coenzyme A (coA) is likely to bind. Some of these similarities will be illustrated below.

10.2 ProKnow

The first of the two integrated servers described here is ProKnow (Pal and Eisenberg 2005) at UCLA (http://proknow.mbi.ucla.edu). The current version, ProKnow 2.0, runs six principal prediction methods on any uploaded 3D structure (Fig. 10.1). In fact, the server can also accept just a protein sequence; in which case, one of the six methods is dropped. The six features examined include the protein’s overall fold, various 3D structural motifs (omitted for sequence-only submissions), sequence similarities, sequence motifs, and functional linkages from the Database of Interacting Proteins (DIP) and the Prolinks Database.
Each method may provide one or more functional clues, with varying degrees of confidence. These clues are weighted using Bayes’ theorem and combined to give the most likely overall function, expressed as GO terms and measures of confidence for each. A map showing the relationship between the top GO predictions is returned (Fig. 10.2), allowing the user to more confidently interpret the predictions. Also given are the detailed hits and their scores. The top results for our example structure, 2fck, are shown in Fig. 10.3. Here, essentially only one hit of significance was returned: N-acetyltransferase, which is very confidently predicted and agrees with the protein’s putative function.

10.2.1 Fold Matching

The first stage in ProKnow is the identification of other protein structures having the same, or most similar, fold to that of the query protein. This actually is a bit of
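
The Bayesian weighting of clues mentioned in Section 10.2 can be illustrated with a minimal sketch. It is not ProKnow's actual scoring scheme, which is not specified here. It assumes that the clues are conditionally independent and that each one can be summarised as a likelihood ratio, P(clue | function) / P(clue | not function). The function name combine_clues, the candidate terms, and all numbers are hypothetical and chosen only for illustration.

```python
from math import prod


def combine_clues(prior, likelihood_ratios):
    """Posterior probability of a function given several independent clues.

    prior: probability of the function before any clue is seen.
    likelihood_ratios: one value per clue,
        P(clue | function) / P(clue | not function).
    Assumption: clues are conditionally independent (a naive Bayes model).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    # Bayes' theorem in odds form: posterior odds = prior odds * product of ratios.
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical candidates with made-up priors and clue strengths.
candidates = {
    "N-acetyltransferase activity": {"prior": 0.01, "ratios": [20.0, 15.0, 8.0]},
    "kinase activity": {"prior": 0.02, "ratios": [0.5, 1.2]},
}

# Rank candidate functions by posterior probability, highest first.
ranked = sorted(
    ((term, combine_clues(c["prior"], c["ratios"])) for term, c in candidates.items()),
    key=lambda item: item[1],
    reverse=True,
)

for term, posterior in ranked:
    print(f"{term}: {posterior:.3f}")
```

Working in odds form keeps the update to a single multiplication per clue: several moderately strong clues that agree can lift a rare function to high confidence, whereas weak or contradictory clues leave it near its prior.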
