Refined Buneman Trees
4.4 Hybrid methods
All the methods we have described so far are one-sided: each explores only
one end of the trade-off between accuracy and speed. Hybrid methods also
exist, in which a fast method is combined with an accurate one and may yield
a practical method with sound biological meaning.
One such method, called Disc Covering, is described in [HNP+98]; it takes a
divide-and-conquer approach. From the abstract:
(The Disc-Covering Method) DCM obtains a decomposition of the
input dataset into small overlapping sets of closely related taxa,
reconstructs trees on these subsets (using a “base” phylogenetic
method of choice), and then combines the subtrees into one tree on
the entire set of taxa. Because the subproblems analyzed by DCM
are smaller, computationally expensive methods such as maximum
likelihood estimation can be used without incurring too much cost.
4.5 Accuracy of inferred trees
It is possible to evaluate the quality of, or confidence in, the branches of
the evolutionary trees found by our tree reconstruction methods. The
statistical methods for doing so are known as bootstrap tests, and they are
described in [NK00], with references to other articles.
The basic idea is to find some tree T using some method M, for some data
set D. Let's say D consists of n nucleotide sequences. Now we select n
sequences from D with replacement to form a new sample data set D′; notice
that the same sequence might occur several times in the new set, while some
sequences might not occur at all. We then use the new sample to infer a new
tree T′ by the same method M, and by comparing T and T′ we assign a count
of 1 to the branches in T which also occur in T′, and 0 to the rest. Repeating
this process many times (e.g. a thousand times) yields a statistic of how often
each branch occurs for different samples, and thus a reflection of how confident
we can be in that particular branch.
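
The resampling loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure from [NK00]: the names `bootstrap_support`, `infer_branches` and `toy_branches` are hypothetical, and `toy_branches` is only a stand-in for a real reconstruction method M. The resampled units are passed in as a plain list; the toy data uses alignment columns as the units, which is an assumption of this sketch rather than something the text prescribes.

```python
import random
from collections import Counter

TAXA = ("a", "b", "c", "d", "e")  # hypothetical taxon labels


def bootstrap_support(units, infer_branches, replicates=1000, seed=0):
    """Estimate how often each branch of the reference tree T reappears.

    units          -- the resampled items of the data set D (a list)
    infer_branches -- stand-in for method M (an assumption of this sketch):
                      maps a list of units to a set of hashable branches
    """
    rng = random.Random(seed)
    n = len(units)
    reference = infer_branches(units)            # branches of T
    counts = Counter()
    for _ in range(replicates):
        # Draw n units with replacement to form the sample D'.
        sample = [units[rng.randrange(n)] for _ in range(n)]
        resampled = infer_branches(sample)       # branches of T'
        # Count 1 for each branch of T that also occurs in T', 0 otherwise.
        for branch in reference & resampled:
            counts[branch] += 1
    return {branch: counts[branch] / replicates for branch in reference}


def toy_branches(columns, taxa=TAXA):
    """Toy stand-in for M: each two-state column proposes the split it induces."""
    branches = set()
    for col in columns:
        if len(set(col)) != 2:
            continue                             # uninformative column
        # Taxa sharing the first taxon's state form one side of the split.
        side = frozenset(t for t, ch in zip(taxa, col) if ch == col[0])
        if 2 <= len(side) <= len(taxa) - 2:      # keep non-trivial splits only
            branches.add(side)
    return branches


if __name__ == "__main__":
    columns = ["AAGGG"] * 3 + ["AAAGG", "AAAAA", "CCCCC"]
    for branch, support in bootstrap_support(columns, toy_branches).items():
        print(sorted(branch), round(support, 3))
```

In this toy data the split separating a and b from the rest is backed by three columns and the split separating a, b and c by only one, so the former should receive the higher support value. A real analysis would replace `toy_branches` with an actual tree reconstruction method and compare the branches of the resulting trees.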