22.01.2015 Views

Refined Buneman Trees


Algorithm 2 The Neighbor-Joining algorithm

Require: $\delta$ is a distance matrix on $n$ species $X$

Ensure: $T$ is an unrooted binary tree with leaves from $X$

1: initialise $T$ such that for each species $x_i \in X$ there is a leaf node $n_i \in T$

2: define a set of active nodes $L$ containing the leaves of $T$

3: while $|L| > 2$ do

4: find nodes $n_i, n_j \in L$ such that $D_{ij} = d(i,j) - (r_i + r_j)$ is minimal, where $r_i = \frac{1}{|L|-2} \sum_{k \in L} d(i,k)$

5: remove nodes $n_i, n_j$ from $L$

6: create a new node $n_k$ and set distances from $n_k$ to the remaining nodes $n_m \in L$ such that $d(k,m) = (d(i,m) + d(j,m) - d(i,j))/2$

7: add $n_k$ to $L$ and $T$, and connect $n_k$ to $n_i, n_j$ such that $|e_{ik}| = \frac{1}{2}(d(i,j) + r_i - r_j)$ and $|e_{jk}| = \frac{1}{2}(d(i,j) - r_i + r_j)$

8: end while

9: join the last two nodes $n_i, n_j \in L$ such that $|e_{ij}| = d(i,j)$
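The steps of Algorithm 2 can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `neighbor_joining`, the representation of distances as a dictionary keyed by `frozenset` pairs, and the labels `internal0`, `internal1`, ... for new nodes are all assumptions made for this example. Ties in the minimisation are broken by taking the first pair found.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Sketch of Algorithm 2.

    names: list of species labels (assumed not to start with "internal").
    dist:  dict mapping frozenset({a, b}) -> distance d(a, b).
    Returns the tree as a list of edges (node, node, length).
    """
    d = dict(dist)          # working copy; distances to new nodes are added here
    active = list(names)    # the set L of active nodes
    edges = []
    next_id = 0

    while len(active) > 2:
        size = len(active)
        # r_i = (sum over k in L of d(i, k)) / (|L| - 2); d(i, i) = 0 is skipped
        r = {i: sum(d[frozenset((i, k))] for k in active if k != i) / (size - 2)
             for i in active}
        # Step 4: pick the pair minimising D_ij = d(i, j) - (r_i + r_j)
        i, j = min(combinations(active, 2),
                   key=lambda p: d[frozenset(p)] - (r[p[0]] + r[p[1]]))
        dij = d[frozenset((i, j))]

        # Step 5: remove n_i and n_j from L
        active.remove(i)
        active.remove(j)

        # Step 6: new node n_k with distances to the remaining nodes
        k = f"internal{next_id}"
        next_id += 1
        for m in active:
            d[frozenset((k, m))] = (d[frozenset((i, m))]
                                    + d[frozenset((j, m))] - dij) / 2

        # Step 7: connect n_k to n_i and n_j, then add n_k to L
        edges.append((k, i, (dij + r[i] - r[j]) / 2))
        edges.append((k, j, (dij - r[i] + r[j]) / 2))
        active.append(k)

    # Step 9: join the last two nodes
    i, j = active
    edges.append((i, j, d[frozenset((i, j))]))
    return edges


# Hypothetical example: distances read off a tree with leaf edges
# a:1, b:2, c:4, d:5 and an internal edge of length 3 separating {a,b} from {c,d}.
example = {
    frozenset(("a", "b")): 3, frozenset(("a", "c")): 8, frozenset(("a", "d")): 9,
    frozenset(("b", "c")): 9, frozenset(("b", "d")): 10, frozenset(("c", "d")): 9,
}
for edge in neighbor_joining(["a", "b", "c", "d"], example):
    print(edge)
```

On this additive example the sketch recovers the five edge lengths of the tree the distances were taken from. The quadratic search in every round gives the usual cubic running time for this straightforward formulation.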

4.1.3 Quartet-based methods

Buneman trees ([Bun71]) are a new form¹ of distance method that relies on quartets and splits rather than just picking closest pairs. An algorithm with running time $O(n^3)$ is available but not given here (see [BB99], Section 3). Another method that relies on quartets is the $Q^*$ method proposed by Berry & Gascuel ([BG00]), which runs in time $O(n^4)$.

The latter illustrates the problem for quartet-based methods: a given set of quartets does not necessarily support a tree; for example, it might contain contradictory constraints, such as quartets that split the same species in opposite ways. In the Buneman case, the quartets are used to generate a set of splits that is guaranteed to be tree-consistent; in the case of $Q^*$, a tree-consistent set of quartets is found by weeding out quartets from a larger set of favorable quartets. The intuition is that by considering quartets we look at the species in a global sense, determining which species should be separated from which others and trying to construct a tree under a large number of such (possibly conflicting) constraints. This is in many ways the opposite of the intuition behind the clustering methods mentioned earlier.
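To make the notion of tree-consistency concrete, the sketch below checks a set of splits using the standard pairwise criterion: two splits A|B and C|D of the same species set are compatible when at least one of the intersections A∩C, A∩D, B∩C, B∩D is empty, and a set of splits fits on a single tree when all pairs are compatible. This criterion is not spelled out in the text above and is brought in here as an assumption for illustration; the function names `compatible` and `tree_consistent` and the five-species example are hypothetical.

```python
from itertools import combinations


def compatible(split1, split2):
    """Return True if splits A|B and C|D can appear in the same tree.

    Each split is a pair of frozensets partitioning the same species set.
    The splits are compatible when some side of one is disjoint from
    some side of the other.
    """
    (a, b), (c, d) = split1, split2
    return any(not (x & y) for x in (a, b) for y in (c, d))


def tree_consistent(splits):
    """Return True if every pair of splits in the collection is compatible."""
    return all(compatible(s, t) for s, t in combinations(splits, 2))


# Hypothetical example on the species set {a, b, c, d, e}.
ab = (frozenset("ab"), frozenset("cde"))
de = (frozenset("abc"), frozenset("de"))
ac = (frozenset("ac"), frozenset("bde"))

print(tree_consistent([ab, de]))      # ab|cde and abc|de fit on one tree
print(tree_consistent([ab, de, ac]))  # ac|bde separates a and b in the opposite way
```

In the second call, the split ac|bde conflicts with ab|cde because each side of one split intersects each side of the other. That is the kind of contradictory constraint described above.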

Another issue for both of these algorithms is that they produce only partially resolved trees. Compared to the UPGMA and NJ methods, we might say that these quartet-based methods resolve only “safe” branches, whereas the clustering methods resolve trees fully, regardless of the data. However, these methods are perhaps too safe, since they might infer only a small fraction of the edges in the evolutionary tree, depending on the data ([BG00], [BB99]). The

¹ “New form” compared to the clustering methods UPGMA and Neighbor-Joining.
