4 Multiple Sequence Alignment 4.1 Multiple sequence alignment

Grundlagen der Bioinformatik, SS’09, D. Huson, May 10, 2009

The distance-based SP-score of the profile alignment $A^*$ is:

\[ D_{sp}(A^*) = \sum_{i=1}^{L} \; \sum_{1 \le p \,\ldots} s(a^*_{pi}, a^*_{qi}) \;\ldots \]
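As a rough illustration of a sum-of-pairs score, the sketch below adds up a pairwise cost s(a, b) over every column and every pair of rows of an alignment. The index range of the pair sum is cut off in the text above, so the code assumes the usual sum over all pairs p < q. The names `sp_score` and `unit_distance` are hypothetical, and the unit cost is chosen only to keep the example small.

```python
from itertools import combinations


def sp_score(alignment, s):
    """Sum a pairwise cost s(a, b) over all columns and all row pairs p < q.

    `alignment` is a list of equal-length strings (rows of the alignment);
    `s` is any function that scores two aligned symbols.
    """
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all rows of an alignment must have the same length")

    total = 0
    for i in range(length):                                   # columns i = 1..L
        for p, q in combinations(range(len(alignment)), 2):   # pairs p < q
            total += s(alignment[p][i], alignment[q][i])
    return total


def unit_distance(a, b):
    # Hypothetical cost for illustration: 0 for identical symbols, 1 otherwise.
    return 0 if a == b else 1


example = ["AC-T", "ACGT", "A-GT"]
score = sp_score(example, unit_distance)
```

In this toy alignment only the two middle columns contain mismatching pairs, so only they contribute to the score. A real scoring scheme would replace `unit_distance` with a substitution matrix and gap costs.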
Sequences are aligned bottom-up along the guide tree, first aligning pairs of sequences, then sequences against profiles (sub-alignments) and then profiles against profiles.

Different algorithms use different methods to compute the guide tree.

4.9.4 Feng-Doolittle

A first progressive alignment algorithm was published in 1987 by Feng and Doolittle [1].

Algorithm 4.9.3
1. Calculate all $\binom{r}{2}$ pairwise alignment scores and convert them into distances.
2. Construct a rooted guide tree from the distance matrix using the “Fitch–Margoliash” algorithm.
3. Build a multiple alignment bottom-up along the guide tree and return the alignment of all sequences that is produced at the root of the tree.

The distance score used by Feng-Doolittle is:

\[ D = -\log S_{eff} = -\log \frac{S_{obs} - S_{rand}}{S_{max} - S_{rand}}, \]

where
• $S_{obs}$ is the observed similarity score for a pair of sequences,
• $S_{max}$ is the maximum possible score, and
• $S_{rand}$ is the expected score of an alignment of two random sequences of the same length and composition.

The “effective score” $S_{eff}$ can be viewed as a normalised percentage similarity.

The sequence-sequence alignments are conducted using the profile alignment approach.

4.9.5 CLUSTALW

CLUSTALW [2] is still one of the most popular programs for computing an MSA, although more recent methods such as T-Coffee or Muscle are designed to produce better alignments in practice.

Algorithm 4.9.4 (ClustalW progressive alignment)
1. Construct a distance matrix of all $\binom{r}{2}$ pairs by pairwise dynamic programming alignment followed by approximate conversion of similarity scores to evolutionary distances.
2. Construct a guide tree from the distance matrix using the Neighbor-Joining tree-building method.
3. Progressively align sequences at the nodes of the tree in order of decreasing similarity, using sequence-sequence, sequence-profile and profile-profile alignment.

[1] Feng, D.-F. & Doolittle, R.F. Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J. Mol. Evol. 25:351-360, 1987.
[2] Thompson, J.D., Higgins, D.G. & Gibson, T.J. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22:4673-4680, 1994.
Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F. & Higgins, D.G. The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Research, 25:4876-4882, 1997.
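The Feng-Doolittle distance from Section 4.9.4, and the first step of Algorithm 4.9.3 (one distance for each of the r(r-1)/2 pairs), can be sketched as follows. This is a minimal sketch under stated assumptions: the three scores per pair are taken as given inputs rather than computed by an aligner, and the names `feng_doolittle_distance`, `distance_matrix` and `pair_scores` are hypothetical.

```python
import math
from itertools import combinations


def feng_doolittle_distance(s_obs, s_max, s_rand):
    """Return D = -log S_eff with S_eff = (S_obs - S_rand) / (S_max - S_rand)."""
    if s_max <= s_rand:
        raise ValueError("S_max must exceed S_rand")
    s_eff = (s_obs - s_rand) / (s_max - s_rand)
    if s_eff <= 0:
        # The logarithm is undefined when the observed score does not
        # exceed the random expectation.
        raise ValueError("S_obs must exceed S_rand")
    return -math.log(s_eff)


def distance_matrix(names, pair_scores):
    """Build a symmetric distance matrix over all pairs of sequence names.

    `pair_scores[(a, b)]` is assumed to hold the tuple (S_obs, S_max, S_rand)
    for each pair in the order produced by itertools.combinations.
    """
    dist = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d = feng_doolittle_distance(*pair_scores[(a, b)])
        dist[a][b] = dist[b][a] = d
    return dist


# Made-up scores for three sequences, for illustration only.
names = ["x", "y", "z"]
pair_scores = {
    ("x", "y"): (80, 100, 20),
    ("x", "z"): (40, 100, 20),
    ("y", "z"): (60, 100, 20),
}
dist = distance_matrix(names, pair_scores)
```

A matrix like `dist` is what a tree-building method such as Fitch–Margoliash or Neighbor-Joining would take as input in step 2. Identical sequences give S_eff = 1 and therefore distance 0, and the distance grows as the observed score approaches the random expectation.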