4 Multiple Sequence Alignment 4.1 Multiple sequence alignment


Grundlagen der Bioinformatik, SS'09, D. Huson, May 10, 2009

For example, consider $A = \{\mathtt{CGCTTTA}, \mathtt{ACGTT}, \mathtt{GCTAG}\}$. Assume 0 for a match score and +1 for a mismatch, deletion and insertion. Then $D(A_1, A_2) = 4$, $D(A_1, A_3) = 4$ and $D(A_2, A_3) = 4$. Choose $c = A_1$ as center.

First align $A_1$ and $A_2$:

    -CGCTTTA
    ACG--TT-

Then align $A'_1$ and $A_3$:

    -CGCTTTA-
    --GC--TAG

Combine both to obtain the following alignment:

    -CGCTTTA-
    ACG--TT--
    --GC--TAG

Let $A^*_c$ denote the multiple alignment obtained by successively aligning all other sequences to the center sequence $A_c$.

Theorem 4.6.2 If the pairwise distances satisfy the triangle inequality, then $D(A^*_c) < 2 D_{SP}(A^*)$.

(Proof: see Gusfield, pg. 350)

4.7 Multiple alignment to a tree

Definition 4.7.1 (Phylogenetic alignment tree) Suppose we are given a set of sequences $A = \{A_1, A_2, \ldots, A_r\}$ and a tree $T_A = (V, E)$ whose leaves are labeled by $A$. Let each internal node $u$ of $T$ with children $v$ and $w$ be labeled with an ancestral sequence of the labels of $v$ and $w$. Then $T$ is called a phylogenetic alignment tree of $A$.

Example: Let $A = \{\mathtt{bog}, \mathtt{dog}, \mathtt{hag}, \mathtt{bad}\}$.

[Figure: left, tree T with the sequences bog, dog, hag, bad at its leaves; right, a phylogenetic alignment tree of A with the same leaves and internal nodes labeled hod, bog and had.]

Each edge in a phylogenetic alignment tree $T$ can be assigned a distance:

Definition 4.7.2 (edge distance) If $e = (U, V)$ is an edge that joins sequences $U$ and $V$, then we define the edge distance of $e$ as the edit distance $D_L(U, V)$. The distance $D(T)$ of a phylogenetic alignment $T$ is the sum of its edge distances.
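
The pairwise distances in the center-star example and the tree distance $D(T)$ of Definition 4.7.2 can be checked with a short Python sketch. It assumes the unit-cost scheme stated in the example (0 for a match, 1 for a mismatch, insertion or deletion). The function names `edit_distance`, `choose_center` and `tree_distance` are illustrative and do not come from the notes. The tree topology used at the end (root hod with children bog and had) is an assumption made only for illustration, since the notes give the internal labels in a figure that is not reproduced here.

```python
from itertools import combinations


def edit_distance(u: str, v: str) -> int:
    """Unit-cost edit distance: match 0; mismatch, insertion, deletion 1."""
    prev = list(range(len(v) + 1))  # distances from the empty prefix of u
    for i, a in enumerate(u, start=1):
        curr = [i]  # aligning u[:i] with the empty prefix of v costs i
        for j, b in enumerate(v, start=1):
            curr.append(min(
                prev[j] + 1,              # delete a from u
                curr[j - 1] + 1,          # insert b into u
                prev[j - 1] + (a != b),   # match (0) or mismatch (1)
            ))
        prev = curr
    return prev[-1]


def choose_center(seqs):
    """Index of the sequence with the smallest summed distance to the others.

    Ties are broken in favor of the first such sequence.
    """
    sums = [sum(edit_distance(s, t) for t in seqs) for s in seqs]
    return min(range(len(seqs)), key=sums.__getitem__)


def tree_distance(edges):
    """D(T): sum of edit distances over edges given as (U, V) label pairs."""
    return sum(edit_distance(u, v) for u, v in edges)


# Sequences from the center-star example.
A = ["CGCTTTA", "ACGTT", "GCTAG"]
pairwise = {
    (i + 1, j + 1): edit_distance(A[i], A[j])
    for i, j in combinations(range(len(A)), 2)
}
print(pairwise)              # all three pairwise distances
print(choose_center(A) + 1)  # 1-based index of the chosen center

# ASSUMED topology, for illustration only: root "hod" with internal
# children "bog" (leaves bog, dog) and "had" (leaves hag, bad).
edges = [
    ("hod", "bog"), ("hod", "had"),
    ("bog", "bog"), ("bog", "dog"),
    ("had", "hag"), ("had", "bad"),
]
print(tree_distance(edges))
```

For the three example sequences every pairwise distance is 4, so all candidates tie with a distance sum of 8 and the first one, $A_1$, is returned, consistent with the choice of center in the example. Passing edges as plain label pairs keeps `tree_distance` independent of any particular tree data structure.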
