12.07.2015 Views

From Protein Structure to Function with Bioinformatics.pdf

98 T. Nugent and D.T. Jones

2000), despite using a single generic scoring matrix, performs well at high sequence identities when tested against a benchmark data set of homologous membrane protein structures, while HMAP (Tang et al. 2003) can improve alignment significantly using a profile-profile based approach incorporating structural information.

4.6 Transmembrane Protein Topology Prediction

4.6.1 Alpha-Helical Proteins

As previously discussed, the severe under-representation of TM proteins in structural databases makes their study extremely difficult. Given the biological and pharmacological importance of TM proteins, an understanding of their topology (the total number of TM helices, their boundaries and in/out orientation relative to the membrane) is therefore an important target for theoretical prediction methods. A number of experimental methods, including glycosylation analysis, insertion tags, antibody studies and fusion protein constructs, allow the topological location of a region to be identified. However, such studies are time-consuming, often conflicting (Mao et al. 2003; Kyttälä et al. 2004), and also risk upsetting the natural topology by altering the protein sequence.

In the absence of structural data, bioinformatic strategies thus turn to sequence-based prediction methods. Long before the arrival of the first crystal structures, stretches of hydrophobic residues long enough to span the lipid bilayer were identified as TM-spanning helices. Early prediction methods by Kyte and Doolittle (1982) and Engelman et al. (1986), and later by Wimley and White (1996), relied on experimentally determined hydropathy indices to create a hydropathy plot for a protein.
This involved taking a sliding window of 19–21 residues and averaging the hydropathy scores within it, with peaks in the plots (regions of high hydrophobicity) corresponding to TM helices (Fig. 4.3).

With more sequences came the discovery that aromatic Trp and Tyr residues tend to cluster near the ends of the transmembrane segments (Wallin et al. 1997), possibly acting as physical buffers to stabilise TM helices within the lipid bilayer. More recent studies identified the appearance of sequence motifs, such as the GxxxG motif (Senes et al. 2000), within TM helices and also periodic patterns implicated in helix-helix packing and 3D structure (Samatey et al. 1995). However, perhaps the most important realisation was that positively charged residues tend to cluster on cytoplasmic loops: the ‘positive-inside’ rule of Gunnar von Heijne (von Heijne 1992). Combined with hydrophobicity-based prediction of TM helices, this led to early topology prediction methods such as TopPred (Claros and von Heijne 1994).

4.6.1.1 Machine Learning-Based Approaches

Despite their success, these early methods based on the physicochemical principle of a sliding window of hydrophobicity combined with the ‘positive-inside’ rule
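
The sliding-window hydropathy plot and the ‘positive-inside’ rule described in Section 4.6.1 can be illustrated with a minimal Python sketch. It assumes the published Kyte-Doolittle hydropathy scale and a 19-residue window. The cutoff of 1.6, the greedy peak picking, the function names and the toy sequence are illustrative assumptions; this is not the algorithm used by TopPred or any other published tool.

```python
# Kyte-Doolittle hydropathy indices (Kyte and Doolittle 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along seq.

    Raises KeyError for letters outside the 20 standard residues.
    """
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_helices(seq, window=19, threshold=1.6):
    """Greedy peak picking (an assumption made for this sketch).

    Repeatedly takes the highest-scoring window at or above the threshold,
    then masks every window that overlaps it. Returns sorted
    (start, end) pairs, 0-based with end exclusive.
    """
    profile = hydropathy_profile(seq, window)
    free = [True] * len(profile)
    helices = []
    while True:
        best = max(
            (i for i in range(len(profile)) if free[i]),
            key=profile.__getitem__,
            default=None,
        )
        if best is None or profile[best] < threshold:
            break
        helices.append((best, best + window))
        # Windows starting within window-1 positions of best overlap it.
        lo = max(0, best - window + 1)
        hi = min(len(profile), best + window)
        for j in range(lo, hi):
            free[j] = False
    return sorted(helices)


def n_terminus_side(seq, helices):
    """Apply the positive-inside rule to non-overlapping, sorted helices.

    Loops alternate sides of the membrane, so Lys/Arg counts are summed
    over even-numbered loops (same side as the N-terminus) and
    odd-numbered loops. The richer side is taken to be cytoplasmic.
    """
    bounds = [0] + [pos for helix in helices for pos in helix] + [len(seq)]
    loops = [seq[bounds[j]:bounds[j + 1]] for j in range(0, len(bounds), 2)]
    positives = [loop.count("K") + loop.count("R") for loop in loops]
    same_side = sum(positives[0::2])
    other_side = sum(positives[1::2])
    return "in" if same_side >= other_side else "out"


# Hypothetical toy sequence: a Lys/Arg-rich N-terminal loop, one
# 19-residue hydrophobic stretch, and a polar C-terminal loop.
toy = "MKRKS" + "LIVLLAVLIALLVIAGLLL" + "DGSNE"
helices = candidate_helices(toy)
print(helices, n_terminus_side(toy, helices))
```

On the toy sequence the single hydrophobic stretch is picked as the only candidate helix, and the Lys/Arg-rich N-terminal loop places the N-terminus on the cytoplasmic side. Real proteins need more care: a fixed cutoff both misses weakly hydrophobic helices and merges or splits closely spaced ones.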
