Refined Buneman Trees
Algorithm 2 The Neighbor-Joining algorithm
Require: δ is a distance matrix on n species X
Ensure: T is an unrooted binary tree with leaves from X
1: initialise T such that for each species $x_i \in X$ there is a leaf node $n_i \in T$
2: define a set of active nodes L containing the leaves of T
3: while $|L| > 2$ do
4:   find nodes $n_i, n_j \in L$ such that $D_{ij} = d(i,j) - (r_i + r_j)$ is minimal, where $r_i = \frac{1}{|L|-2} \sum_{k \in L} d(i,k)$
5:   remove nodes $n_i, n_j$ from L
6:   create a new node $n_k$ and set the distances from $n_k$ to the remaining nodes $n_m \in L$ such that $d(k,m) = \frac{1}{2}\left(d(i,m) + d(j,m) - d(i,j)\right)$
7:   add $n_k$ to L and T, and connect $n_k$ to $n_i, n_j$ such that $|e_{ik}| = \frac{1}{2}\left(d(i,j) + r_i - r_j\right)$ and $|e_{jk}| = \frac{1}{2}\left(d(i,j) - r_i + r_j\right)$
8: end while
9: join the last two nodes $n_i, n_j \in L$ such that $|e_{ij}| = d(i,j)$
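The steps of Algorithm 2 can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `neighbor_joining`, the dict-of-dicts representation of the distance matrix, and the `("internal", id)` labels for new nodes are all assumptions made for this example. Each iteration rescans every pair of active nodes, so the sketch runs in cubic time.

```python
from itertools import combinations


def neighbor_joining(dist):
    """Return the edges (u, v, length) of a Neighbor-Joining tree.

    `dist` is assumed to be a symmetric dict of dicts, dist[a][b], over
    the leaf labels (hypothetical input format for this sketch).
    """
    # Work on a copy so the caller's matrix is left untouched.
    d = {a: dict(row) for a, row in dist.items()}
    active = list(d)          # the set L of active nodes
    edges = []
    next_id = 0

    while len(active) > 2:
        size = len(active)
        # r_i = (1 / (|L| - 2)) * sum over k in L of d(i, k)
        r = {i: sum(d[i][k] for k in active if k != i) / (size - 2)
             for i in active}
        # Step 4: pick the pair minimising D_ij = d(i, j) - (r_i + r_j).
        i, j = min(combinations(active, 2),
                   key=lambda p: d[p[0]][p[1]] - (r[p[0]] + r[p[1]]))
        k = ("internal", next_id)   # label for the new node n_k
        next_id += 1
        # Step 7: lengths of the edges connecting n_k to n_i and n_j.
        edges.append((i, k, 0.5 * (d[i][j] + r[i] - r[j])))
        edges.append((j, k, 0.5 * (d[i][j] - r[i] + r[j])))
        # Steps 5 and 6: retire n_i, n_j and set distances from n_k.
        active = [x for x in active if x not in (i, j)]
        d[k] = {}
        for m in active:
            d[k][m] = d[m][k] = 0.5 * (d[i][m] + d[j][m] - d[i][j])
        active.append(k)

    # Step 9: join the last two active nodes.
    i, j = active
    edges.append((i, j, d[i][j]))
    return edges
```

For a distance matrix that is exactly additive on a four-leaf tree, for example leaf edges of length 1, 2, 3, 4 for a, b, c, d and an internal edge of length 5 separating {a, b} from {c, d}, the sketch returns five edges whose lengths match those of the generating tree.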
4.1.3 Quartet based methods<br />
Buneman trees ([Bun71]) are a new form¹ of distance method that relies on
quartets and splits rather than on repeatedly picking the closest pair. An algorithm with a
running time of $O(n^3)$ is available but is not given here (see [BB99],
section 3). Another method that relies on quartets is the $Q^*$ method proposed
by Berry & Gascuel ([BG00]), which runs in time $O(n^4)$.
The latter illustrates the problem for quartet based methods: a given set
of quartets does not necessarily support a tree, e.g. it might
contain contradictory constraints such as quartets that split the same species in
opposite ways. In the Buneman case, a set of splits that is guaranteed to be
tree-consistent is generated from the quartets; in the case of $Q^*$, a tree-consistent
set of quartets is found by pruning a larger set of favorable quartets. The
intuition is that by considering quartets, we are looking at the species in a global<br />
sense, determining which species should be separated from other species and<br />
trying to construct a tree under a large number of such (possibly conflicting)<br />
constraints. This is in many ways the opposite of the intuition behind the<br />
clustering methods mentioned earlier.<br />
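To make the notion of conflicting quartet constraints concrete, the following sketch scores quartets directly from a distance matrix. The scoring formula is the Buneman score as it is commonly defined in the Buneman tree literature (see [BB99]); it is not stated in this section, so treat it as an assumption here. The helper names `pair_dist`, `buneman_score`, and `buneman_index` are hypothetical, and the brute-force minimum over all quartets of a split is for illustration only; it is not the $O(n^3)$ algorithm referred to above.

```python
from itertools import product


def pair_dist(dist, x, y):
    """Distance lookup that treats d(x, x) as 0 (diagonal may be absent)."""
    return 0 if x == y else dist[x][y]


def buneman_score(dist, w, x, y, z):
    """Score of the quartet wx|yz.

    Positive when the distances favour separating {w, x} from {y, z},
    negative when one of the competing pairings fits better.
    """
    across = min(pair_dist(dist, w, y) + pair_dist(dist, x, z),
                 pair_dist(dist, w, z) + pair_dist(dist, x, y))
    within = pair_dist(dist, w, x) + pair_dist(dist, y, z)
    return 0.5 * (across - within)


def buneman_index(dist, side_u, side_v):
    """Minimum quartet score over all w, x in side_u and y, z in side_v.

    A split is kept only if this value is positive, i.e. if every
    quartet it induces supports it.
    """
    return min(buneman_score(dist, w, x, y, z)
               for w, x in product(side_u, repeat=2)
               for y, z in product(side_v, repeat=2))


# Hypothetical additive distances on four species: leaf edges of length
# 1, 2, 3, 4 for a, b, c, d and an internal edge of length 5 that
# separates {a, b} from {c, d}.
dist = {
    "a": {"b": 3, "c": 9, "d": 10},
    "b": {"a": 3, "c": 10, "d": 11},
    "c": {"a": 9, "b": 10, "d": 7},
    "d": {"a": 10, "b": 11, "c": 7},
}
supported = buneman_index(dist, ["a", "b"], ["c", "d"])
conflicting = buneman_index(dist, ["a", "c"], ["b", "d"])
```

On this example the split {a, b} | {c, d} receives a positive index, while the competing split {a, c} | {b, d} receives a negative one and would be discarded. This is the sense in which only "safe" splits survive.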
Another issue for both of these algorithms is that they produce only partially
resolved trees. Compared to the UPGMA and NJ methods, we might say that
these quartet based methods resolve only “safe” branches, whereas the clustering
methods resolve trees fully, regardless of the data. However, these methods are
perhaps too safe, since they might infer only a small fraction
of the edges in the evolutionary tree, depending on the data ([BG00], [BB99]). The
¹ “New form” compared to the clustering methods UPGMA and Neighbor-Joining.