From Protein Structure to Function with Bioinformatics.pdf


206 E.C. Meng et al.

The Common Structural Cliques method was proposed for identifying functionally relevant atoms in structures that share a function but are evolutionarily unrelated or distantly related (Milik et al. 2003). Each protein is reduced to a graph that includes only representative atoms from each side chain. Sets of four atoms are extracted and compared to identify common structural cliques, or sets with equivalent interatomic distances and atom types in both structures. These are merged into larger sets of corresponding atoms. Notably, the resulting 3D motifs can include different weights for different interatomic distances, even weights of zero. This allows matching even when certain distances vary significantly due to flexibility. For example, a motif could include atoms in domains on either side of a hinge. Low or zero weights for interdomain distances allow a motif to recognize structures with different hinge conformations, while weights for intradomain distances can be kept high to specify precise geometric relationships within each domain. A limitation is that results from pairwise comparisons are not combined automatically.

DRESPAT (Detection of REcurring Sidechain PATterns) extracts a shared motif from a set of positive example structures (Wangikar et al. 2003). Each protein is reduced to a graph of one functional atom per residue, excluding residues with nonpolar side chains and disulphide-bonded cysteines.
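The reduction step just described for DRESPAT can be illustrated with a minimal sketch. It is not the authors' C++ implementation: the residue dictionaries, the function names, and the 2.5 Å sulphur-sulphur cutoff used to flag disulphide-bonded cysteines are assumptions made for illustration. The choice of functional atom per residue type and the set of residue types treated as nonpolar are not given in the text above, so both are left as caller-supplied parameters.

```python
import math


def disulphide_bonded(residues, max_ss_dist=2.5):
    """Return ids of cysteines whose SG atom lies within max_ss_dist
    (in angstroms, an assumed cutoff) of another cysteine's SG atom."""
    cys = [r for r in residues if r["name"] == "CYS" and "SG" in r["atoms"]]
    bonded = set()
    for i, a in enumerate(cys):
        for b in cys[i + 1:]:
            if math.dist(a["atoms"]["SG"], b["atoms"]["SG"]) <= max_ss_dist:
                bonded.update((a["id"], b["id"]))
    return bonded


def reduce_to_functional_atoms(residues, functional_atom, excluded):
    """Reduce a structure to one node per retained residue.

    residues        -- list of dicts with keys "id", "name", "atoms"
                       (atom name -> (x, y, z)); a hypothetical layout
    functional_atom -- residue name -> atom name (caller's assumption)
    excluded        -- residue names treated as nonpolar (caller's assumption)
    """
    bonded = disulphide_bonded(residues)
    nodes = []
    for r in residues:
        # Skip nonpolar residues and cysteines in disulphide bonds.
        if r["name"] in excluded or r["id"] in bonded:
            continue
        atom = functional_atom.get(r["name"])
        # Skip residue types with no functional atom defined or missing atoms.
        if atom is None or atom not in r["atoms"]:
            continue
        nodes.append((r["id"], r["name"], r["atoms"][atom]))
    return nodes
```

The resulting list of nodes is what a pattern search would then operate on; edges (interatomic distances) can be computed on demand with `math.dist`.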
Patterns of three or more residues are extracted and compared to patterns with the same residue types from the other structures, considering alpha- and beta-carbons in addition to the functional atoms and discarding matches with distance deviations and/or RMSD values greater than specified cutoffs. Other adjustable parameters are the pattern size (default three to six residues) and how many of the input structures must contain the pattern. Based on pattern occurrence in randomly chosen sets of structures, empirical relationships were derived for calculating the statistical significance of a discovered pattern from its size, the total number of input structures, and the number of input structures required to contain the pattern. Results were presented for nonredundant sets from 17 SCOP superfamilies. It was found that motifs of at least four residues derived from sets of five or more structures typically corresponded to functional sites. Too many additional patterns were obtained from only pairwise comparisons of evolutionarily related structures. DRESPAT is available from the authors as C++ code (Wangikar et al. 2003).

The funClust server (Ausiello et al. 2008) (Table 8.1) identifies 3D motifs shared by multiple input structures. Up to 20 structures can be used. The structures are filtered by sequence identity and then compared all-by-all with Query3D (Ausiello et al. 2005a). Query3D uses two points per residue, alpha-carbon and side chain centroid. Besides the maximum sequence identity, users can specify whether cutoffs in RMSD and side chain proximity should be low, medium, or high; whether buried or hydrophobic residues should be excluded; and whether similar residue types should be allowed to pair in addition to identical ones.
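As an illustration of the two-point residue representation attributed to Query3D above, the following minimal sketch computes an alpha-carbon position and a side chain centroid for a single residue. It is not Query3D's code: the function name, the input layout, the unweighted mean over non-backbone atoms, and the fallback to the alpha-carbon for glycine (which has no side chain heavy atoms) are all assumptions.

```python
# Backbone atom names in PDB convention; everything else counts as side chain.
BACKBONE = {"N", "CA", "C", "O", "OXT"}


def two_point_representation(atoms):
    """Return (alpha_carbon, side_chain_centroid) for one residue.

    atoms -- dict mapping atom name -> (x, y, z); heavy atoms only is assumed.
    """
    ca = atoms["CA"]
    side = [xyz for name, xyz in atoms.items() if name not in BACKBONE]
    if not side:
        # Assumed fallback for glycine: reuse the alpha-carbon position.
        return ca, ca
    n = len(side)
    centroid = tuple(sum(xyz[i] for xyz in side) / n for i in range(3))
    return ca, centroid
```

Representing each residue by these two points keeps comparisons cheap while still capturing both backbone position and the rough direction of the side chain.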
The server reports motifs of three or more residues found in three or more of the input structures.

The PAR-3D (Protein Active site Residues using 3-Dimensional structural motifs) server (Goyal et al. 2007) (Table 8.1) compares an uploaded structure to motifs for six classes of proteases, ten glycolytic pathway enzymes, and metal-binding sites of three or four residues (Goyal and Mande 2008). The motifs, each generated
