12.07.2015 Views

From Protein Structure to Function with Bioinformatics.pdf



146 B.H. Dessailly and C.A. Orengo

this general statement since there are no objective rules to decide which are the main elements of secondary structure to be considered for defining the fold (Grishin 2001).

One objective of this chapter is to describe how knowledge of relationships between proteins, such as sharing the same fold, helps in transferring functional annotations from well-characterised proteins to proteins of unknown function. As will be discussed further in Section 6.3, the main assumption made when transferring annotations between proteins is that evolutionarily related (i.e. homologous) proteins generally tend to share functional properties. But proteins adopting the same fold are not necessarily homologous. It has been argued that proteins can attain a given fold independently by convergent evolution, because only a limited number of folds are physically acceptable (Russell et al. 1997). For example, it is not clear whether all superfamilies of proteins that adopt the TIM-like (β/α)₈ barrel fold are evolutionarily related, as definitive evidence to that effect has not been found (Nagano et al. 2002).

6.2.1.2 Practical Definitions

Several databases have been set up to classify protein structures into a comprehensive framework of structural relationships. The practical definitions of a fold used in the most widely used of these databases are given below. As will emerge from the following definitions, the concept of fold is generally applied to domains rather than full-length proteins, but the definition of a domain can vary between databases.

CATH – The CATH database is a hierarchical classification of protein domain structures (Orengo et al. 1997; Greene et al. 2007). The highest level of classification assigns protein domains to three different classes based on their overall secondary structure content. Within CATH classes, protein domains are classified into different architectures that describe the orientation of secondary structures without considering their connectivity. Domains in a given architecture are further sub-classified into different topologies, depending on how secondary structures are connected to one another. It is this topology level that fits most closely to the general notion of a fold described above. In practice, assignment of domains to the topologies in CATH is performed automatically with the structural alignment program SSAP (Orengo and Taylor 1996) and empirically derived cut-offs.

SCOP – Like CATH, the Structural Classification of Proteins (SCOP) is a hierarchical classification of protein domain structures (Murzin et al. 1995; Andreeva et al. 2008), but the levels of classification differ between the two databases. As in CATH, the highest level of classification in SCOP is the structural class, but SCOP defines four different classes whereas CATH defines three. The next level of classification is the fold; two protein domains are assigned to the same fold if they share the same major elements of secondary structure arranged in a similar orientation, and with the same topological connections. This definition corresponds well to the definition of the topology level in CATH but, in practice,
