Refined Buneman Trees
2.6 Splits
A partition of a finite set S into two non-empty parts U and V is called a
split σ = U|V. If |U| = 1 or |V| = 1, the split is called trivial. It is convenient to
represent a split as a bitvector or binary number, and by convention we shall
say that for a bitvector A representing σ, x_i ∈ U if and only if A[i] = 0. Splits
are symmetric, so if w represents the split σ, then ¬w also represents σ. The set
of splits on a set X is denoted σ(X). The size of σ(X) is the number of unique
splits on X, so with n = |X|,
$$|\sigma(X)| = \frac{|P(X)| - 2}{2} = \frac{2^n - 2}{2} = 2^{n-1} - 1.$$
We exclude symmetric splits (hence the division by 2) and deduct the two
splits where U = ∅ or V = ∅.
The set of quartets associated with a split σ = U|V on a set X is defined
by q(σ) = {uu′|vv′ : u, u′ ∈ U ∧ v, v′ ∈ V}. Here u, u′ (and similarly v, v′) need
not be distinct. The size of q(U|V) is of order O(|X|^4); recall that an
edge in a tree induces O(|X|^4) quartets, and splits are equivalent to edges in
this case.
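As a small illustration of the definition of q(σ), the following sketch enumerates the quartets of a split given as two explicit sets. It assumes that the pairs uu′ and vv′ are unordered and may repeat an element, as stated above; the function name `quartets` is illustrative.

```python
from itertools import combinations_with_replacement


def quartets(U, V):
    """List the quartets uu'|vv' of the split U|V.

    Each quartet is a pair (pair from U, pair from V); the pairs are
    unordered and may contain the same element twice.
    """
    pairs_u = list(combinations_with_replacement(sorted(U), 2))
    pairs_v = list(combinations_with_replacement(sorted(V), 2))
    return [(pu, pv) for pu in pairs_u for pv in pairs_v]


qs = quartets({"a", "b"}, {"c", "d", "e"})
# 3 unordered pairs from U times 6 unordered pairs from V.
assert len(qs) == 18
```

The number of quartets is the product of two quadratic terms, one in |U| and one in |V|, which is where the O(|X|^4) bound comes from.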
Definition 3 (Compatibility). Two splits A|B and C|D are said to be compatible<br />
if and only if one of A ∩ C, A ∩ D, B ∩ C or B ∩ D is empty.<br />
Compatible sets of splits are the foundation for the algorithm presented in<br />
this thesis, and they are a perfect tool when dealing with evolutionary trees.<br />
A set of splits is compatible if and only if all splits in the set are pairwise<br />
compatible. And of course, any subset of a compatible set of splits is again<br />
compatible.<br />
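Definition 3 translates directly into bit operations. The sketch below assumes the bitvector convention from earlier in this section (bit i is 0 when x_i lies in the first part of the split); the names `compatible` and `is_compatible_set` are illustrative.

```python
from itertools import combinations


def compatible(a, b, n):
    """Check Definition 3 for two splits A|B and C|D on n elements."""
    full = (1 << n) - 1
    A, B = full & ~a, a  # elements with bit 0 / bit 1 in the first split
    C, D = full & ~b, b  # elements with bit 0 / bit 1 in the second split
    # Compatible iff at least one of the four intersections is empty.
    return any((x & y) == 0 for x in (A, B) for y in (C, D))


def is_compatible_set(splits, n):
    """A set of splits is compatible iff its splits are pairwise compatible."""
    return all(compatible(s, t, n) for s, t in combinations(splits, 2))


# On X = {x0, x1, x2, x3}: x2x3|x0x1 and x1x2x3|x0 are compatible,
# whereas x2x3|x0x1 and x1x3|x0x2 are not.
assert compatible(0b0011, 0b0001, 4)
assert not compatible(0b0011, 0b0101, 4)
```

The pairwise test performs a quadratic number of comparisons in the number of splits, which is adequate for illustration but is not claimed to be the approach used by the algorithm in this thesis.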
There is a close connection between compatible sets of splits and evolutionary<br />
trees. Any edge e in an unrooted tree T splits the set of leaves of T into two<br />
non-empty parts. Let Σ(T) denote the set of splits associated with the edges of
a tree T. Then Theorem 1 (from [SS03]) gives the relation between compatible
sets of splits and evolutionary trees.<br />
Theorem 1 (Splits-Equivalence Theorem). Let Σ be a collection of splits<br />
on X. Then there is an evolutionary tree T such that Σ = Σ(T) if and only if
Σ is a compatible set of splits. If T exists, it is unique up to isomorphism.
From now on we shall use the terms compatible set of splits/evolutionary
tree and split/edge interchangeably. They are one and the same: Table 2.1
shows a compatible set of (weighted) splits, and Figure 2.6 shows the equivalent
evolutionary tree. Recall the discussion of evolutionary trees versus phylogenetic
trees; when working with a method such as the refined Buneman tree algorithm,
which outputs compatible sets of splits that might or might not correspond
to a fully resolved tree, it is important to be able to describe such a tree in a precise
manner. When dealing with e.g. Neighbor-Joining, we can rely on the more
regular phylogenetic trees since the NJ method always resolves trees completely.
Lemma 1 is due to Dan Gusfield ([Gus91], section 1.2) and gives an important<br />
upper bound for the time required to go from compatible sets of splits to<br />
phylogenetic trees.<br />
Lemma 1. An unrooted tree with n leaves can be constructed from its set of<br />
non-trivial splits in time O(kn), where k is the number of non-trivial splits.
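To make the step from splits to a tree concrete, here is a naive sketch. It is not Gusfield's O(kn) algorithm and makes no attempt to reach that bound. It assumes that the leaves are numbered 0, …, n−1, that each non-trivial split is given as one of its two sides, and that the input is compatible (this is not checked). The function name `tree_from_splits` and the representation of internal nodes as leaf sets are illustrative choices.

```python
def tree_from_splits(n, splits):
    """Build the edge list of the unrooted tree for a compatible split set.

    Every node is named by a frozenset of leaves: the side of its split
    that does not contain the reference leaf 0. For compatible splits
    these sets are pairwise nested or disjoint, so each set hangs below
    the smallest set that strictly contains it.
    """
    leaves = frozenset(range(n))
    clusters = set()
    for side in splits:
        side = frozenset(side)
        # Keep the side that does not contain the reference leaf 0.
        clusters.add(leaves - side if 0 in side else side)
    # Trivial splits: one singleton per leaf other than 0 ...
    clusters.update(frozenset([i]) for i in range(1, n))
    # ... and the split {0} | rest, whose far side is the top cluster.
    root = leaves - {0}
    clusters.add(root)

    ordered = sorted(clusters, key=len, reverse=True)  # parents come first
    edges = [(frozenset([0]), root)]
    for i, c in enumerate(ordered):
        if c == root:
            continue
        # Smallest earlier cluster that strictly contains c.
        parent = min((p for p in ordered[:i] if c < p), key=len)
        edges.append((parent, c))
    return edges


# Splits 01|234 and 012|34 on five leaves, each given by one side.
edges = tree_from_splits(5, [{0, 1}, {3, 4}])
assert len(edges) == 5 + 2  # n leaf edges plus k internal edges
```

Each returned edge corresponds to exactly one split, trivial or non-trivial, in line with the identification of splits and edges made above.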