From Protein Structure to Function with Bioinformatics.pdf
folds; they shared a 3D motif of atoms from a four-residue backbone segment and three sequentially separated residues (Kobayashi and Go 1997). A similar approach was used to generate consensus binding-site motifs (Nebel et al. 2007). By the time of this study, many more structures had become available, and adenine mono-, di-, and tri-phosphate complexes were handled separately. Pairwise similarities were evaluated as the fraction of atoms near ligand that could be assigned to corresponding pairs. Structures were clustered using these similarity values, and outliers were removed. Within each cluster, only atoms common in all pairwise comparisons were retained, and their positions were averaged to generate a 3D motif. Finally, highly similar motifs were merged. The resulting 13 motifs are derived from 3 to 20 structures, contain 6 to 71 atoms, and correspond in most cases to some known classification of structure or function. The motif coordinates are available as supplementary information to the publication (Nebel et al. 2007).

Consensus templates were developed for porphyrin-binding sites using Nestor3D (Nebel 2006). Templates generated by this program can include atoms, pseudoatoms representing functional groups, and “solvent” (actually points on a grid to represent the cavity volume). Nestor3D includes a graphical interface and is available for download (Table 8.2).
Users must specify a list of PDB files and which atoms to fit to superimpose the structures; several other parameters are optionally adjustable.

An all-by-all comparison of 3,737 phosphate environments from protein-nucleotide complexes allowed classification into 476 tight clusters and ten broader groupings (Brakoulias and Jackson 2004). The clusters generally agreed with classifications based on global structure or function. An efficient clique detection method was used to find corresponding sets of atoms, so it was not necessary to use ligand atoms to superimpose the structures.

SOIPPA (Sequence Order-Independent Profile-Profile Alignment) finds shared patterns of local structure in pairwise comparisons (Xie and Bourne 2008). A protein structure is reduced to its alpha-carbons, each associated with a geometric potential value and a profile of allowed substitutions obtained from an automatically constructed sequence alignment. The geometric potential of an alpha-carbon is calculated from its distance to the surface and the arrangement of neighbouring alpha-carbons (Xie and Bourne 2007). A possible match between two structures starts with a pair of points with similar geometric potentials; neighbouring pairs can be added if their distances and surface normal angles are consistent. Each alpha-carbon pair is weighted by the similarity of their substitution profiles, and a maximum-weight common subgraph is found. The alignment score after superposition is a sum over pairs incorporating pair weight, coincidence in space, and angle between surface normals.
Statistical significance is estimated with a nonparametric model of the distribution of scores when the pattern is compared to a representative set of structures. SOIPPA was used to compare diverse adenine-binding structures and to search a representative set of structures for matches to known functional sites; it was better able to align binding sites and identify local similarities than were global sequence or structure comparisons. This work focused more on identifying relationships than on 3D motif discovery.
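
Both the clique detection used for phosphate environments and the maximum-weight common subgraph step in SOIPPA rest on the same idea: build a graph whose nodes are candidate point pairs and whose edges join pairs that are mutually consistent, then look for the best fully connected subset. The following is a minimal sketch of that idea, not the published implementation of either method. It assumes each point carries a single scalar "potential", ignores surface normals, and uses a plain Bron-Kerbosch enumeration rather than an efficient solver. The function names, tolerances, and uniform pair weights are hypothetical choices made for illustration.

```python
from itertools import combinations
from math import dist


def correspondence_graph(points_a, points_b, potentials_a, potentials_b,
                         potential_tol=0.5, distance_tol=0.5):
    """Build a graph of candidate point pairs (i, j).

    A node (i, j) pairs point i of structure A with point j of structure B
    when their potentials agree within potential_tol. Two nodes are linked
    when the A-side and B-side inter-point distances agree within
    distance_tol. Both tolerances are illustrative assumptions.
    """
    nodes = [(i, j)
             for i in range(len(points_a))
             for j in range(len(points_b))
             if abs(potentials_a[i] - potentials_b[j]) <= potential_tol]
    neighbours = {node: set() for node in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # each point may be used at most once
        d_a = dist(points_a[i], points_a[k])
        d_b = dist(points_b[j], points_b[l])
        if abs(d_a - d_b) <= distance_tol:
            neighbours[(i, j)].add((k, l))
            neighbours[(k, l)].add((i, j))
    return neighbours


def max_weight_clique(neighbours, weight):
    """Return (clique, total weight) by enumerating maximal cliques.

    Assumes all weights are positive, so the heaviest clique is maximal.
    Exponential in the worst case; suitable only for small examples.
    """
    best = {"weight": 0.0, "clique": []}

    def expand(clique, candidates, excluded, total):
        if not candidates and not excluded:
            if total > best["weight"]:
                best["weight"] = total
                best["clique"] = list(clique)
            return
        for node in list(candidates):
            expand(clique + [node],
                   candidates & neighbours[node],
                   excluded & neighbours[node],
                   total + weight[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand([], set(neighbours), set(), 0.0)
    return best["clique"], best["weight"]


# Toy data: B is A translated, listed in reverse order, plus one decoy point.
points_a = [(0, 0, 0), (3, 0, 0), (0, 4, 0), (0, 0, 5)]
potentials_a = [1.0, 2.0, 3.0, 4.0]
points_b = [(10, 10, 15), (10, 14, 10), (13, 10, 10), (10, 10, 10),
            (50, 50, 50)]
potentials_b = [4.0, 3.0, 2.0, 1.0, 2.0]

graph = correspondence_graph(points_a, points_b, potentials_a, potentials_b)
# Uniform weights stand in for substitution-profile similarity.
weights = {node: 1.0 for node in graph}
clique, total = max_weight_clique(graph, weights)
print(sorted(clique), total)
```

Because the pairing depends only on potentials and inter-point distances, the match is independent of the order in which residues appear in either structure, and no ligand atoms or prior superposition are required. In this toy case the decoy point has a compatible potential but no consistent distances, so it is left out of the best clique.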