12.07.2015 Views

From Protein Structure to Function with Bioinformatics.pdf




3 Comparative Protein Structure Modelling

programs in the class include FASTA (Pearson 2000) and BLAST (Schaffer et al. 2001). To improve the sensitivity of sequence-based searches, evolutionary information can be incorporated in the form of a multiple sequence alignment (Altschul et al. 1997; Henikoff et al. 2000; Krogh et al. 1994; Marti-Renom et al. 2004; Rychlewski et al. 2000). These approaches begin by finding all sequences in a sequence database that are clearly related to the target and easily aligned with it. The multiple alignment of these sequences is the target sequence profile, which implicitly carries additional information about the location and pattern of evolutionarily conserved positions of the protein. The best-known program in this class is PSI-BLAST (Altschul et al. 1997), which implements a heuristic search algorithm for short motifs. A further step to increase the sensitivity of this approach is to pre-calculate sequence profiles for all the known structures and then use a pairwise dynamic programming algorithm to compare the target profile with each template profile. This has been implemented, among other programs, in COACH (Edgar and Sjolander 2004) and in FFAS03 (Jaroszewski et al. 1998, 2005). The construction of profile-based Hidden Markov Models (HMMs) is another sensitive way to locate universally conserved motifs among sequences (Karplus et al. 1998). A substantial improvement in HMM approaches was achieved by incorporating information about predicted secondary structural elements (Karchin et al. 2003; Karplus et al. 2005). Another development in this group of methods is the phylogenetic tree-driven HMM, which selects a different subset of sequences for profile HMM analysis at each node in the evolutionary tree (Edgar and Sjolander 2003).
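The profile idea described above can be illustrated with a small sketch: build per-column residue frequencies from a multiple alignment, then compare two such profiles with global dynamic programming. This is only an illustration under stated assumptions, not the scoring scheme of COACH, FFAS03 or PSI-BLAST. The function names, the dot-product column score, and the `pseudocount`, `shift` and `gap` values are all hypothetical choices made for the example.

```python
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def build_profile(msa, pseudocount=0.1):
    """Turn equal-length aligned sequences into per-column residue frequencies."""
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        # Gap characters and non-standard letters are simply ignored.
        counts = Counter(seq[col] for seq in msa if seq[col] in ALPHABET)
        total = sum(counts.values()) + pseudocount * len(ALPHABET)
        profile.append({aa: (counts[aa] + pseudocount) / total for aa in ALPHABET})
    return profile


def column_score(col_a, col_b, shift=0.1):
    """Dot product of two frequency columns, shifted so dissimilar columns score below zero."""
    return sum(col_a[aa] * col_b[aa] for aa in ALPHABET) - shift


def align_profiles(prof_a, prof_b, gap=-0.05):
    """Global (Needleman-Wunsch style) alignment of two profiles, linear gap penalty."""
    n, m = len(prof_a), len(prof_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = (
                (score[i - 1][j - 1] + column_score(prof_a[i - 1], prof_b[j - 1]), "diag"),
                (score[i - 1][j] + gap, "up"),
                (score[i][j - 1] + gap, "left"),
            )
            score[i][j], move[i][j] = max(options, key=lambda option: option[0])
    # Trace back from the corner, collecting the matched column pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if move[i][j] == "diag":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move[i][j] == "up":
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


if __name__ == "__main__":
    target = build_profile(["ACDE", "ACDE", "ACDE"])
    template = build_profile(["ACWDE", "ACWDE", "ACWDE"])
    total, matched = align_profiles(target, template)
    print(total, matched)
```

In this toy case the extra W column of the second profile is left unmatched, while the four shared columns are paired. Real methods differ mainly in how columns are scored (for example log-odds against background frequencies) and in how sequences are weighted when the profile is built.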
Locating sequence intermediates that are homologous to both sequences may also enhance the template searches (John and Sali 2004; Sauder et al. 2000). These more sensitive fold identification techniques are especially useful for finding significant structural relationships when sequence identity between the target and the template drops below 25%. More accurate sequence profiles and structural alignments can be constructed with consistency-based approaches such as T-Coffee (Moretti et al. 2007), PROMALS (and PROMALS3D for structures) (Pei and Grishin 2007; Pei et al. 2008), ProbCons (Do et al. 2005), etc. For recent reviews of multiple sequence alignments see (Edgar and Batzoglou 2006; Notredame 2007).

The second class of methods relies on pairwise comparison of a protein sequence and a protein structure; the target sequence is matched against a library of 3D profiles or threaded through a library of 3D folds. These methods are also called fold assignment, threading or 3D template matching (Bowie et al. 1991; Finkelstein and Reva 1991; Jaroszewski et al. 1998; Jones 1999; Shi et al. 2001; Sippl 1995). These methods, discussed in detail in Chapter 2, are especially useful when sequence profiles cannot be constructed because there are not enough known sequences that are clearly related to the target or potential templates.

Template search methods “outperform” the needs of comparative modelling in the sense that they are able to locate sequences that are so remotely related as to render construction of a reliable comparative model impossible. The reason for this is that sequence relationships are often established on short conserved segments, while a successful comparative modelling exercise requires an overall correct alignment for the entire modelled part of the protein. This is an important distinction
