22.01.2015 Views

Refined Buneman Trees


the pseudo-code for the algorithm is given in Algorithm 4.
The algorithm is long and complicated, but it can be broken up into
four parts. Lines 1–3 initialize linked lists representing “global
matrix fronts” on matrices which contain diagonal quartets. Lines 4–17 populate
the matrix fronts. In lines 18–28 we search the matrices, finding the minimum
quartets that we need to calculate the refined Buneman indices for the edges
represented by the matrices. Finally, in lines 29–34 we report refined Buneman
splits. Before we explain the algorithm, we need to study the nature of diagonal
quartets.

5.2.1 Searching for minimum diagonals

We know from previous sections that a quartet induces two diagonal quartets.
Conversely, given a diagonal quartet we can identify its “parent” quartet. In
the pruning part of our algorithm, we are interested in finding the $n-3$ quartets
with minimum score induced by each edge in $T$. We will do this by searching
for diagonal quartets instead, but we need to ensure that we never identify the
same quartet twice (for example, by seeing both of its diagonal quartets).

We will use the convention that we only identify a quartet when we
see its minimum diagonal. If we see a diagonal quartet which is not a
minimum diagonal, we disregard it.

Another important property of diagonal quartets is this: if we fix $a$ and $c$
such that $a$ and $c$ lie on different sides of a fixed edge $e$, we can search for $b$ and
$d$ independently to minimize the score of the diagonal quartet $ab||cd$ induced by
$e$. By fixing $a$ and $c$ we can rewrite the score of a diagonal quartet as a sum
of two functions $f_{a,c}$ and $g_{a,c}$, such that $f_{a,c}$ depends only on $b$ and $g_{a,c}$
depends only on $d$. Clearly, such a sum takes its minimum only when $f_{a,c}$ and
$g_{a,c}$ are both minimal.

\[
\eta_{ab||cd} = \tfrac{1}{2}\left(\delta_{bc} - \delta_{ab} + \delta_{ad} - \delta_{dc}\right) = f_{a,c}(b) + g_{a,c}(d)
\]

where $f_{a,c}(b) = (\delta_{bc} - \delta_{ab})/2$ and $g_{a,c}(d) = (\delta_{ad} - \delta_{dc})/2$.
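The independence of $b$ and $d$ can be illustrated with a minimal Python sketch. It assumes the distances $\delta$ are stored in a symmetric matrix `delta` indexed by integer species labels, and that the candidate sets for $b$ and $d$ are handed in by the caller. The names `eta`, `min_diagonal`, `b_candidates` and `d_candidates`, and the example matrix, are hypothetical and chosen only for illustration.

```python
def eta(delta, a, b, c, d):
    """Score of the diagonal quartet ab||cd, following the formula above."""
    return 0.5 * (delta[b][c] - delta[a][b] + delta[a][d] - delta[d][c])


def min_diagonal(delta, a, c, b_candidates, d_candidates):
    """Minimise f_{a,c}(b) and g_{a,c}(d) separately, then add the two minima.

    Assumption: the caller supplies the admissible choices for b and d.
    Returns (score, b, d).
    """
    def f(b):
        return (delta[b][c] - delta[a][b]) / 2

    def g(d):
        return (delta[a][d] - delta[d][c]) / 2

    b = min(b_candidates, key=f)  # best b, found without looking at d
    d = min(d_candidates, key=g)  # best d, found without looking at b
    return f(b) + g(d), b, d


# Hypothetical symmetric distance matrix on six species labelled 0..5.
delta = [
    [0, 3, 7, 8, 6, 9],
    [3, 0, 6, 7, 5, 8],
    [7, 6, 0, 5, 9, 4],
    [8, 7, 5, 0, 10, 3],
    [6, 5, 9, 10, 0, 11],
    [9, 8, 4, 3, 11, 0],
]

score, b, d = min_diagonal(delta, a=0, c=2, b_candidates=[1, 4], d_candidates=[3, 5])
```

The two separate scans cost time linear in the number of candidates, whereas trying every pair $(b, d)$ would cost their product; both approaches return the same minimum score.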

Not only can we find the diagonal quartet with minimum score in this way;
we can also search for the “next minimum”, i.e. the diagonal quartet with
minimum score when discounting the actual minimum. We can do this in a
general way: say we have some diagonal quartet $ab_i||cd_j$ with score $\eta_{ab_i||cd_j}$.
Imagine we have considered all diagonal quartets with scores less than $\eta_{ab_i||cd_j}$,
and now we wish to consider $ab_i||cd_j$ and then find the next diagonal quartet
with minimum score.

The way to do this is to search for $b_{i+1}$ such that $\eta_{ab_{i+1}||cd_j} \geq \eta_{ab_i||cd_j}$ is the
minimum among all choices of $b_{i+1}$, and similarly for $d_{j+1}$ such that
$\eta_{ab_i||cd_{j+1}} \geq \eta_{ab_i||cd_j}$ is the smallest among all choices of $d_{j+1}$. One of those will be the next
minimum. Note that the indices refer to an ordering by increasing $f_{a,c}$ and $g_{a,c}$
respectively, not to an ordering as members of $X$.
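The stepping rule can be sketched in Python as follows. This is not the linked-list “matrix front” structure of Algorithm 4; it swaps in a binary heap as the front, which is an assumption made to keep the example short. The inputs `f_vals` and `g_vals` are hypothetical lists holding the values of $f_{a,c}$ and $g_{a,c}$ already sorted in increasing order, so index $i$ corresponds to $b_i$ and index $j$ to $d_j$.

```python
import heapq


def diagonals_in_order(f_vals, g_vals):
    """Yield (score, i, j) with score = f_vals[i] + g_vals[j], smallest first.

    Assumption: f_vals and g_vals are sorted in increasing order.
    After visiting (i, j), the successors (i+1, j) and (i, j+1) join the
    front; the heap then hands back the smallest unvisited score.
    """
    if not f_vals or not g_vals:
        return
    heap = [(f_vals[0] + g_vals[0], 0, 0)]
    seen = {(0, 0)}  # avoid pushing the same (i, j) twice
    while heap:
        score, i, j = heapq.heappop(heap)
        yield score, i, j
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(f_vals) and nj < len(g_vals) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (f_vals[ni] + g_vals[nj], ni, nj))


# Hypothetical sorted values of f_{a,c} and g_{a,c}.
f_vals = [0.0, 1.0, 2.5]
g_vals = [-1.0, 0.5]
ordered_scores = [s for s, _, _ in diagonals_in_order(f_vals, g_vals)]
```

Because the function is a generator, a caller that only needs the first few minima can stop early without enumerating every pair.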
