12.07.2015 Views

From Protein Structure to Function with Bioinformatics.pdf


6 Function Diversity Within Folds and Superfamilies

assignments of individual domains can differ between the two databases because of the degree of subjectivity in each definition (i.e. which secondary structure elements are to be considered major), and of the protocols used to assign the domains (automated in CATH, mostly manual in SCOP).

FSSP – A purely objective definition of folds has been offered by FSSP (families of structurally similar proteins) (Holm and Sander 1996b). In FSSP, pairwise structural alignments were performed for a set of representative and non-redundant PDB structures using the structural alignment program DALI (Holm and Sander 1993). Hierarchical clustering was then applied to the scores obtained from these pairwise structural alignments, generating a so-called fold tree, from which fold families were automatically defined by cutting the tree at different levels of similarity.

6.2.1.3 Paradigm Shift

Structure is generally better conserved than sequence in evolution, and many proteins display common structural characteristics. As more and more three-dimensional protein structures were being solved in the mid-1990s, structural classification systems became necessary in order to make some sense out of the increasing amount of data. This led to the development of the above-mentioned hierarchical classifications of protein structures. The realisation that global structural motifs, such as the (β/α)8-barrels or the 4-helix bundles, were observed in proteins that were unrelated in sequence led to the notion of fold that we have just described. Until recently, folds have been understood as recurrent global structural motifs that incidentally act as practical divisions of the protein structure space.
Implicit in that view is the idea that fold space is discrete, in the sense that (a) each protein has a unique fold, which it will share with other related proteins, and which will distinguish it from most other unrelated proteins (though accounting for the existence of analogous folds, see Section 6.2.2.1); and (b) each fold is structurally different and constitutes an isolated structural group that does not overlap with the others (Kolodny et al. 2006).

But as more and more structural data becomes available, notably via structural genomics initiatives, the perception of the fold is changing in favour of a view of fold space that is continuous rather than discrete (Harrison et al. 2002). It is now becoming widely recognised that homologous proteins can actually adopt different folds (Grishin 2001; Kolodny et al. 2006), and that some proteins can adopt multiple, changeable folding motifs depending on time and conditions (Andreeva and Murzin 2006). This has consequences for the usability of the fold for function prediction; the main argument for using fold similarities when inferring function is that proteins sharing the same fold may often display remote homologies that would not be detectable otherwise, and that homologous proteins should in turn tend to perform related functions (Moult and Melamud 2000). Because the relationship between fold and homology is not clear, it follows that the relationship between fold and function is likely to be fuzzy as well. However, recent results obtained using the ensemble of currently available structural data in CATH suggest that the majority of folds are structurally coherent
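
The FSSP procedure described earlier (pairwise structural similarity scores, hierarchical clustering, then cutting the resulting tree at a chosen similarity level) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the structure labels A to E, the toy similarity scores (these are not DALI output), the function name `cluster_at_cutoff`, and the use of average linkage, which the text above does not specify. Merging stops once no pair of clusters reaches the cutoff, which corresponds to cutting the tree at that level.

```python
from itertools import combinations

# Hypothetical pairwise similarity scores for five structures (higher = more similar).
# These numbers are invented for illustration and are not real DALI scores.
TOY_SCORES = {
    ("A", "B"): 20.0, ("A", "C"): 8.0, ("B", "C"): 9.0,
    ("A", "D"): 2.0,  ("A", "E"): 1.0, ("B", "D"): 2.0,
    ("B", "E"): 1.0,  ("C", "D"): 3.0, ("C", "E"): 2.0,
    ("D", "E"): 15.0,
}
NAMES = ["A", "B", "C", "D", "E"]
# Key by unordered pair so lookups do not depend on argument order.
SIM = {frozenset(pair): score for pair, score in TOY_SCORES.items()}


def cluster_at_cutoff(names, similarity, cutoff):
    """Average-linkage agglomerative clustering on similarity scores.

    Repeatedly merges the two most similar clusters and stops as soon as
    the best available merge falls below `cutoff`. Returns the groups as
    a sorted list of sorted member lists.
    """
    clusters = [frozenset([name]) for name in names]

    def average_similarity(a, b):
        # Mean of all pairwise scores between members of the two clusters.
        total = sum(similarity[frozenset((x, y))] for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the pair of clusters with the highest average similarity.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_similarity(clusters[p[0]], clusters[p[1]]),
        )
        if average_similarity(clusters[i], clusters[j]) < cutoff:
            break  # no remaining pair is similar enough to merge
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    # Lower cutoffs give fewer, broader groups; higher cutoffs give more, narrower ones.
    for cutoff in (25.0, 10.0, 5.0, 1.0):
        print(cutoff, cluster_at_cutoff(NAMES, SIM, cutoff))
```

With these toy scores, a cutoff of 10 yields three groups (A with B, C alone, D with E), whereas a cutoff of 5 also merges C into the first group. This illustrates how the same tree gives coarser or finer families depending on where it is cut.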
