22.01.2015 Views

Refined Buneman Trees


2.6 Splits

A partition of a finite set S into two non-empty parts U and V is called a split and is denoted σ = U|V. If |U| = 1 or |V| = 1, the split is called trivial. It is reasonable to represent a split as a bitvector or binary number, and by convention we shall say that for a bitvector A representing σ, $x_i$ ∈ U if and only if A[i] = 0. Splits are symmetric, so if w represents the split σ, then ¬w also represents σ. The set of splits on a set X is denoted σ(X). The size of σ(X) is the number of unique splits on X, so with n = |X|,

$$|\sigma(X)| = \frac{|P(X)| - 2}{2} = \frac{2^n - 2}{2} = 2^{n-1} - 1.$$

We deduct the two bipartitions where U = ∅ or V = ∅, and halve the remainder because U|V and V|U are the same split.
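As a small illustration of the bitvector convention and of the count of splits, the following Python sketch enumerates all splits of an n-element set. The helper names `canonical`, `all_splits` and `is_trivial` are hypothetical, introduced only for this example, and choosing the representative whose bit 0 is cleared is just one possible way to identify w with ¬w.

```python
def canonical(mask: int, n: int) -> int:
    """Return one fixed representative of the split encoded by an n-bit mask.

    Bit i equal to 0 means x_i is in U, bit i equal to 1 means x_i is in V.
    A mask and its complement encode the same split, so we (arbitrarily)
    keep the one whose bit 0 is cleared, i.e. the one with x_0 in U.
    """
    full = (1 << n) - 1
    return mask if (mask & 1) == 0 else mask ^ full


def all_splits(n: int) -> set[int]:
    """Enumerate every split of an n-element set as canonical bitmasks."""
    full = (1 << n) - 1
    # Masks 0 and `full` are skipped: they would leave V or U empty.
    return {canonical(mask, n) for mask in range(1, full)}


def is_trivial(mask: int, n: int) -> bool:
    """A split is trivial when one of its two sides has exactly one element."""
    ones = bin(mask).count("1")
    return ones == 1 or ones == n - 1
```

For n = 4 this yields 7 splits, in agreement with 2^(n−1) − 1, of which 4 are trivial.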

The set of quartets associated with a split σ = U|V on a set X is defined by q(σ) = {uu′|vv′ : u, u′ ∈ U ∧ v, v′ ∈ V}. Here u, u′ (and similarly v, v′) need not be distinct. The size of q(U|V) is $O(|X|^4)$: recall that an edge in a tree induces $O(|X|^4)$ quartets, and splits are equivalent to edges in this case.
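To make the definition of q(σ) concrete, here is a minimal Python sketch that lists the quartets of a split. It assumes that uu′ and u′u denote the same pair, so each side is an unordered pair with repetition allowed; the function name `quartets` and the tuple encoding of a quartet are choices made for this example only.

```python
from itertools import combinations_with_replacement


def quartets(U, V):
    """Return q(U|V) as a set of ((u, u2), (v, v2)) tuples.

    Each inner tuple is an unordered pair drawn from one side of the split;
    the two elements may coincide, as the definition allows.
    Elements are assumed to be sortable (e.g. strings).
    """
    left = list(combinations_with_replacement(sorted(U), 2))
    right = list(combinations_with_replacement(sorted(V), 2))
    return {(a, b) for a in left for b in right}
```

Under this encoding the number of quartets is |U|(|U|+1)/2 · |V|(|V|+1)/2, which is at most of order |X|^4.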

Definition 3 (Compatibility). Two splits A|B and C|D are said to be compatible if and only if one of A ∩ C, A ∩ D, B ∩ C or B ∩ D is empty.

Compatible sets of splits are the foundation for the algorithm presented in this thesis, and they are a natural tool when dealing with evolutionary trees. A set of splits is compatible if and only if all splits in the set are pairwise compatible. Consequently, any subset of a compatible set of splits is again compatible.
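The compatibility test of Definition 3 maps directly onto bitwise operations when splits are stored as bitvectors. The sketch below assumes the encoding "bit i = 0 means x_i is in the first part" and n-bit integer masks; the names `compatible` and `pairwise_compatible` are hypothetical.

```python
from itertools import combinations


def compatible(a: int, b: int, n: int) -> bool:
    """Check Definition 3 for two splits given as n-bit masks.

    For mask a, the set bits are one part and the cleared bits the other.
    The four intersections of the definition are the four bitwise ANDs
    between (a or its complement) and (b or its complement).
    """
    full = (1 << n) - 1
    not_a, not_b = a ^ full, b ^ full
    return (
        (a & b) == 0
        or (a & not_b) == 0
        or (not_a & b) == 0
        or (not_a & not_b) == 0
    )


def pairwise_compatible(splits, n: int) -> bool:
    """A set of splits is compatible iff every pair in it is compatible."""
    return all(compatible(a, b, n) for a, b in combinations(splits, 2))
```

On four elements, the splits x0 x1 | x2 x3 (mask `0b1100`) and x0 x2 | x1 x3 (mask `0b1010`) are incompatible, whereas x0 | x1 x2 x3 (mask `0b1110`) is compatible with both.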

There is a close connection between compatible sets of splits and evolutionary trees. Any edge e in an unrooted tree T splits the set of leaves of T into two non-empty parts. Let Σ(T) denote the set of splits associated with the edges of a tree T. Then Theorem 1 (from [SS03]) gives the relation between compatible sets of splits and evolutionary trees.

Theorem 1 (Splits-Equivalence Theorem). Let Σ be a collection of splits on X. Then there is an evolutionary tree T such that Σ = Σ(T) if and only if Σ is a compatible set of splits. If T exists, it is unique up to isomorphism.
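The set Σ(T) can be computed by removing each edge in turn and collecting the leaves on either side. The following Python sketch does this for a tree given as an adjacency dictionary. It assumes that the elements of X are exactly the degree-1 nodes of T; the function names are hypothetical, and the straightforward traversal per edge is chosen for clarity rather than speed.

```python
def leaves_behind(adj, start, blocked, leaf_set):
    """Leaves reachable from `start` without crossing the edge to `blocked`."""
    seen, stack = {start, blocked}, [start]
    found = set()
    while stack:
        node = stack.pop()
        if node in leaf_set:
            found.add(node)
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return frozenset(found)


def splits_of_tree(adj):
    """Return the splits of an unrooted tree, one per edge.

    adj maps each node to the set of its neighbours. Each split is a
    frozenset holding its two parts, so U|V and V|U coincide.
    """
    leaf_set = {v for v, nbrs in adj.items() if len(nbrs) == 1}
    splits = set()
    for u, nbrs in adj.items():
        for v in nbrs:
            side = leaves_behind(adj, v, u, leaf_set)
            splits.add(frozenset({side, frozenset(leaf_set - side)}))
    return splits
```

For the tree with leaves a, b attached to an inner node p and leaves c, d attached to an inner node q, joined by the edge p–q, this returns five splits: the four trivial ones and ab|cd.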

From now on we shall use the terms compatible set of splits/evolutionary tree and split/edge interchangeably. They are one and the same: Table 2.1 shows a compatible set of (weighted) splits, and Figure 2.6 shows the equivalent evolutionary tree. Recall the discussion of evolutionary trees versus phylogenetic trees; when working with a method such as the refined Buneman tree algorithm, which outputs compatible sets of splits that might or might not correspond to a fully resolved tree, it is important to be able to describe such a tree in a precise manner. When dealing with e.g. Neighbor-Joining, we can rely on the more regular phylogenetic trees, since the NJ method always resolves trees completely.

Lemma 1 is due to Dan Gusfield ([Gus91], section 1.2) and gives an important upper bound for the time required to go from compatible sets of splits to phylogenetic trees.

Lemma 1. An unrooted tree with n leaves can be constructed from its set of non-trivial splits in time O(kn), where k is the number of non-trivial splits.
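To illustrate what such a construction produces, the sketch below builds an edge list from a compatible set of non-trivial splits. It is not Gusfield's algorithm and does not attempt to meet the O(kn) bound: it orients every split away from one fixed leaf and attaches each resulting cluster to the smallest cluster containing it. The function name, the integer ids for inner nodes, and the requirement that leaf labels are strings are assumptions of this example.

```python
def tree_from_splits(leaves, splits):
    """Build an unrooted tree from pairwise compatible non-trivial splits.

    leaves: list of string labels; splits: iterable of (U, V) pairs of sets.
    Returns a list of edges; inner nodes are the integers 0, 1, 2, ...
    """
    anchor = leaves[0]
    everything = frozenset(leaves) - {anchor}
    # Orient each split: keep the side that does not contain the anchor leaf.
    clusters = {frozenset(V if anchor in U else U) for U, V in splits}
    clusters.discard(everything)  # would be the trivial split of the anchor
    # Largest first, so a containing cluster always precedes its sub-clusters.
    nodes = [everything] + sorted(clusters, key=len, reverse=True)

    def parent_of(cluster, upto):
        """Index of the smallest cluster among nodes[:upto] containing it."""
        best = 0
        for j in range(1, upto):
            if cluster < nodes[j] and len(nodes[j]) < len(nodes[best]):
                best = j
        return best

    edges = [(anchor, 0)]  # inner node 0 is the neighbour of the anchor leaf
    for i in range(1, len(nodes)):
        edges.append((parent_of(nodes[i], i), i))
    for leaf in leaves[1:]:
        edges.append((parent_of(frozenset([leaf]), len(nodes)), leaf))
    return edges
```

With leaves a, b, c, d, e and the splits ab|cde and abc|de, the result has three inner nodes and seven edges, with d and e hanging off the innermost node.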
