12.07.2015 Views

From Protein Structure to Function with Bioinformatics.pdf


8 3D Motifs

To improve the signal for detecting functionally relevant motifs, residue conservation in sequence alignments and spatial clustering of the residues in a given motif are often considered as well.

8.2.2 Motif Description and Matching

Points in a 3D motif are either atoms or pseudoatoms derived directly from the atom positions of a structure. A side chain centroid, for example, is simply a pseudoatom at the average position of the atoms in the side chain. Up to a few points are used per residue in the motif, and the points are labelled with additional information such as atom type, residue type, or physicochemical characteristics.

When a structure is searched for matches to a 3D motif, qualitative rules govern which points in the structure are allowed to pair with which points in the motif, and geometric cutoffs determine which sets of points are sufficiently spatially similar to be considered a match, or hit. Match stringency also depends on the numbers of residues and points in the motif. There is a tradeoff between match stringency and tolerance: it may be desirable to allow for residue substitutions, conformational flexibility, and low structural resolution, but doing so will increase the number of biologically meaningless hits along with the hard-to-find meaningful ones. Including specific atom positions in a 3D motif emphasizes localized interactions such as hydrogen bonding, whereas using centroids of functional groups or side chains is more accommodating of flexibility and type substitutions (Fig. 8.1).
Representing side chain groups that are symmetrical, e.g., the aromatic ring in Phe, as a single point also precludes having to compare them in multiple ways (Oldfield 2002).

Searching can be computationally intensive, especially considering that thousands of structures may be compared to thousands of motifs. Three-dimensional motif searching has relied on the development of efficient algorithms, often involving one or more of the following:

● Geometric hashing. Hashing is a somewhat broad term for reducing complex data to a simpler form that can be compared more rapidly. Multiple values such as distances, angles, and residue types can be collapsed with a function into fewer numbers or even a single number. Other sets of values that reduce to the same result correspond to potentially matching substructures. Geometric hashing encodes spatial relationships among points (Fischer et al. 1994), but other kinds of information such as physicochemical descriptors can be included (Shulman-Peleg et al. 2004). Individual substructure matches that imply similar transformations (translations/rotations to superimpose the paired points) can be collated into larger groups before the more computationally intensive steps of transformation and scoring are performed (Pennec and Ayache 1998). Hashing or preprocessing the data takes time, but only needs to be done once per structure and can greatly speed up comparisons.
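
The hashing idea described above can be illustrated with a minimal Python sketch. It is a deliberately simplified pairwise-distance variant and is not the algorithm of any of the methods cited: each pair of labelled points is collapsed into a single key built from the two residue labels and a binned distance, the structure is indexed once, and motif pairs that reduce to the same key retrieve candidate matching pairs. The function names (`pair_key`, `build_table`, `candidate_pairs`), the 1 Å bin width, and the coordinates are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
import math

BIN_WIDTH = 1.0  # assumed distance bin width in angstroms (illustrative only)


def pair_key(label_a, label_b, distance, bin_width=BIN_WIDTH):
    """Collapse two labels and a distance into one hashable key.

    Labels are sorted so the key does not depend on point order.
    """
    first, second = sorted((label_a, label_b))
    return (first, second, int(distance // bin_width))


def build_table(points, bin_width=BIN_WIDTH):
    """Preprocess a structure once: index every point pair by its key.

    points: list of (label, (x, y, z)) tuples.
    Returns a dict mapping key -> list of (i, j) index pairs with i < j.
    """
    table = defaultdict(list)
    for (i, (la, ca)), (j, (lb, cb)) in combinations(enumerate(points), 2):
        table[pair_key(la, lb, math.dist(ca, cb), bin_width)].append((i, j))
    return table


def candidate_pairs(table, motif, bin_width=BIN_WIDTH):
    """For each motif point pair, return structure pairs with the same key."""
    hits = {}
    for (m, (la, ca)), (n, (lb, cb)) in combinations(enumerate(motif), 2):
        key = pair_key(la, lb, math.dist(ca, cb), bin_width)
        hits[(m, n)] = list(table.get(key, []))
    return hits


# Hypothetical structure: three residue centroids plus an unrelated point.
structure = [
    ("HIS", (0.0, 0.0, 0.0)),
    ("ASP", (3.5, 0.0, 0.0)),
    ("SER", (0.0, 6.5, 0.0)),
    ("GLY", (20.0, 20.0, 20.0)),
]

# Hypothetical motif: the same three-point geometry, translated and rotated.
motif = [
    ("HIS", (10.0, 10.0, 10.0)),
    ("ASP", (10.0, 13.5, 10.0)),
    ("SER", (16.5, 10.0, 10.0)),
]

table = build_table(structure)          # done once per structure
hits = candidate_pairs(table, motif)    # fast lookups per motif
for motif_pair, structure_pairs in hits.items():
    print(motif_pair, "->", structure_pairs)
```

Because the key depends only on labels and inter-point distances, it is unchanged by translation and rotation, so no superposition is needed at the lookup stage. A hard bin boundary can separate two nearly equal distances, so a more careful version would probably also query neighbouring bins; the candidate pairs would then still need to be assembled into full motif matches and checked by superposition and scoring.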
