22.01.2015 Views

Refined Buneman Trees


and David Bryant ([BB99], section 3) states the complexity of computing the<br />

anchored <strong>Buneman</strong> tree.<br />

Lemma 2. Let δ be a distance measure on X. Then $B(\delta) = \bigcap_{x \in X} B_x(\delta)$.<br />

Lemma 3. $B_x(\delta)$ can be computed in time and space $O(n^2)$.<br />

2.9 <strong>Refined</strong> <strong>Buneman</strong> tree<br />

Given a split σ for a set of size n, let $m = |q(\sigma)|$ and let $q_1, \ldots, q_m$ be an ordering<br />

of the elements in q(σ) in non-decreasing order of their <strong>Buneman</strong> scores. Then<br />

the refined <strong>Buneman</strong> index of a split σ is defined as:<br />

$$\mu_\sigma(\delta) = \frac{1}{n-3} \sum_{i=1}^{n-3} \beta_{q_i} \tag{2.6}$$<br />

In other words, the refined <strong>Buneman</strong> index of a split is the average of<br />

the scores of its n − 3 lowest-scoring quartets. The choice of n − 3 is attributed to divine<br />

intervention: apparently, it is the choice that makes the proof in<br />

[MS99] work. The set of splits $RB(\delta) = \{\sigma : \mu_\sigma(\delta) > 0\}$ is a compatible set of<br />

splits ([MS99]). Thus the <strong>Refined</strong> <strong>Buneman</strong> Tree corresponding to a given<br />

dissimilarity measure δ is defined to be the weighted unrooted tree whose edges<br />

represent the splits σ ∈ RB(δ) and are weighted according to $\mu_\sigma(\delta)$.<br />
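The definition can be made concrete with a brute-force sketch. It is only an illustration, not the algorithm developed later in this work. It assumes the standard definition of the Buneman score of a quartet ab|cd, namely half of min(δ(a,c) + δ(b,d), δ(a,d) + δ(b,c)) − (δ(a,b) + δ(c,d)), and it assumes that q(σ) consists of all quartets with two elements taken from each side of the split. Neither is restated in this section. The function names and the matrix representation of δ are hypothetical.

```python
from itertools import combinations


def buneman_score(d, a, b, c, e):
    """Score of the quartet ab|ce under the assumed standard definition."""
    return 0.5 * (min(d[a][c] + d[b][e], d[a][e] + d[b][c]) - (d[a][b] + d[c][e]))


def refined_buneman_index(d, side_u, side_v):
    """Mean of the n - 3 smallest quartet scores of the split side_u | side_v."""
    n = len(side_u) + len(side_v)
    if len(side_u) < 2 or len(side_v) < 2:
        # Splits with a one-element side have no quartets; not handled here.
        raise ValueError("each side of the split needs at least two elements")
    # Enumerate q(sigma): one pair from each side of the split.
    scores = sorted(
        buneman_score(d, a, b, c, e)
        for a, b in combinations(side_u, 2)
        for c, e in combinations(side_v, 2)
    )
    return sum(scores[: n - 3]) / (n - 3)


# Tree-like distances on four taxa: 0 and 1 are neighbours, as are 2 and 3.
d = [
    [0, 2, 3, 3],
    [2, 0, 3, 3],
    [3, 3, 0, 2],
    [3, 3, 2, 0],
]
print(refined_buneman_index(d, [0, 1], [2, 3]))  # 1.0
print(refined_buneman_index(d, [0, 2], [1, 3]))  # -1.0
```

Enumerating every quartet is far too slow for real inputs; the sketch is meant only to show what (2.6) computes.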

When constructing the refined <strong>Buneman</strong> tree as described later in this work,<br />

we shall rely heavily on Lemma 4. The lemma is due to [MS99], and it is used<br />

to maintain compatibility in a set of splits overapproximating the set of refined<br />

<strong>Buneman</strong> splits.<br />

Lemma 4. Given two incompatible splits σ and σ′,<br />

$\mu_\sigma(\delta) \le 0 \lor \mu_{\sigma'}(\delta) \le 0$<br />

and one of the two splits whose index is non-positive can be identified in time O(n).<br />
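For intuition, the smallest case n = 4 can be checked by hand. The derivation below assumes the standard definition of the Buneman score of a quartet, which is not restated in this section. With n = 4 the average in (2.6) runs over a single quartet, so the index of a split with two elements on each side equals the score of its only quartet. Take X = {a, b, c, d} and the two incompatible splits ab|cd and ac|bd.

```latex
\begin{align*}
\mu_{ab|cd}(\delta) &= \beta_{ab|cd}
  = \tfrac12\bigl(\min(\delta_{ac}+\delta_{bd},\ \delta_{ad}+\delta_{bc}) - (\delta_{ab}+\delta_{cd})\bigr),\\
\mu_{ac|bd}(\delta) &= \beta_{ac|bd}
  = \tfrac12\bigl(\min(\delta_{ab}+\delta_{cd},\ \delta_{ad}+\delta_{bc}) - (\delta_{ac}+\delta_{bd})\bigr).
\end{align*}
% If the first index is positive, the second must be negative:
\begin{align*}
\mu_{ab|cd}(\delta) > 0
  &\implies \delta_{ab}+\delta_{cd} < \delta_{ac}+\delta_{bd}\\
  &\implies \min(\delta_{ab}+\delta_{cd},\ \delta_{ad}+\delta_{bc}) \le \delta_{ab}+\delta_{cd} < \delta_{ac}+\delta_{bd}\\
  &\implies \mu_{ac|bd}(\delta) < 0.
\end{align*}
```

So at most one of the two indices is positive, which agrees with the lemma in this special case. It is not a proof of the general statement; for that see [MS99].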

In the algorithm that computes the refined <strong>Buneman</strong> tree, we are going to<br />

construct a compatible set of splits which is an overapproximation of the set<br />

of refined <strong>Buneman</strong> splits. We shall do this for subsets of X of increasing size.<br />

To do this, we first bootstrap some compatible set of splits and then<br />

introduce a set of candidate splits to go into the overapproximation. However,<br />

to maintain compatibility in the set, we shall test the candidate splits against<br />

the existing splits to find pairs that are incompatible.<br />
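The pairwise test can be sketched as follows. This is a naive illustration under stated assumptions, not the procedure used in this work: it takes the usual definition that two splits U|V and U′|V′ are compatible when at least one of the four intersections U ∩ U′, U ∩ V′, V ∩ U′, V ∩ V′ is empty, and it represents a split as a pair of frozensets. The function names are hypothetical.

```python
def is_compatible(split_a, split_b):
    """True if at least one of the four side intersections is empty."""
    return any(not (p & q) for p in split_a for q in split_b)


def incompatible_pairs(existing, candidates):
    """List every (existing split, candidate split) pair that is incompatible."""
    return [
        (s, c)
        for c in candidates
        for s in existing
        if not is_compatible(s, c)
    ]


# Example on X = {0, 1, 2, 3, 4}.
existing = [(frozenset({0, 1}), frozenset({2, 3, 4}))]
candidates = [
    (frozenset({0, 2}), frozenset({1, 3, 4})),  # crosses the existing split
    (frozenset({3, 4}), frozenset({0, 1, 2})),  # fits alongside it
]
clashes = incompatible_pairs(existing, candidates)
print(len(clashes))  # 1
```

The sketch only reports the clashing pairs; deciding which split of each pair to discard is a separate step.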

Once we find a pair of incompatible splits, we shall use Lemma 4 to determine<br />

which one does not belong in the set. We are not concerned with testing if<br />

one of the splits in the pair actually belongs in the set of refined <strong>Buneman</strong><br />

splits. We are only interested in throwing away candidates that are clearly<br />

incompatible. We can allow ourselves this luxury since we are only creating<br />

an overapproximation of $RB(\delta|_{X_k})$, not the set itself, and this saves valuable<br />

