13.07.2015 Views

computing the quartet distance between general trees



CHAPTER 4. QUARTIC TIME ALGORITHM

Figure 4.1: A center node C of the leaves a, b, c. The white subtrees make up the set of subtrees denoted T_rest. (Labels in the figure: a, b, c, T_a, T_b, T_c, C.)

the topologies found in the two trees, an array is computed, holding, at position i, the topology of the quartet containing a, b, c and the i'th leaf. Given two of those arrays, one for T and one for T', the topologies can be compared in linear time and the number of different topologies associated with the triplet (a,b,c) counted. With O(n^3) triplets and linear work per triplet, this gives an overall running time of O(n^4).

The number of differing quartet topologies is thus computed by processing all triplets (a,b,c). However, each quartet (a,b,c,d) is actually considered four times, once for each possible triplet composed from the four leaves. In other words, each of the four leaves will eventually act as the leaf x, described above. Consequently, the total number of differing quartets counted has to be divided by four to get the quartet distance. See Alg. 4.1 for an outline of the algorithm.

Regarding space consumption, this approach only needs memory for the tree data structure and the two arrays, which are all linear in the number of leaves. Thus, the algorithm uses O(n) space.

4.1 Implementation

The quartic algorithm is simple and easy to understand, and the same holds for its implementation.
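To illustrate how little code the outline needs, here is a minimal Python sketch of the procedure described above: one topology array per triplet and tree, a linear comparison of the two arrays, and a final division by four. It is not the code of Algorithm 4.1. The adjacency-dictionary tree representation, the function names (`make_tree`, `bfs_dist`, `component`, `topology_array`, `quartet_distance`) and the way the center is located (as the node minimising the summed breadth-first distances to a, b and c) are assumptions made for this example only.

```python
from collections import deque
from itertools import combinations


def make_tree(edges):
    """Build an undirected adjacency dictionary from a list of edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def bfs_dist(adj, src):
    """Edge-count distance from src to every node (one tree traversal)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def component(adj, start, blocked):
    """Nodes reachable from start without passing through blocked."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v != blocked and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def topology_array(adj, leaves, a, b, c):
    """Entry i names the leaf among a, b, c that leaves[i] pairs with.

    None stands for the star topology (the leaf lies in T_rest).
    The entries for a, b, c themselves are a, b, c in every tree,
    so they never contribute to a difference.
    """
    dists = [bfs_dist(adj, x) for x in (a, b, c)]
    # Assumption of this sketch: the center is the unique node with
    # the smallest total distance to a, b and c.
    center = min(adj, key=lambda v: sum(d[v] for d in dists))
    side = {}
    for x in (a, b, c):
        for v in component(adj, x, center):
            side[v] = x
    return [side.get(leaf) for leaf in leaves]


def quartet_distance(adj1, adj2, leaves):
    """Count quartets whose topology differs between the two trees."""
    differing = 0
    for a, b, c in combinations(leaves, 3):
        t1 = topology_array(adj1, leaves, a, b, c)
        t2 = topology_array(adj2, leaves, a, b, c)
        differing += sum(x != y for x, y in zip(t1, t2))
    # Every quartet was seen once per triplet it contains, i.e. four times.
    return differing // 4


# Two trees on the leaves a, b, c, d: ab|cd versus ac|bd.
T = make_tree([("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"), ("d", "v")])
T_prime = make_tree([("a", "u"), ("c", "u"), ("u", "v"), ("b", "v"), ("d", "v")])
print(quartet_distance(T, T_prime, ["a", "b", "c", "d"]))  # prints 1
```

Each triplet costs a constant number of linear traversals per tree, so the sketch stays within the O(n^4) bound, and apart from the two trees it only holds a few linear-size dictionaries and the two arrays at any time.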
The algorithm is a bit clumsy by nature, making a lot of traversals of the tree, but since it is the starting point for my investigation of quartet distance calculation, my focus will be on correctness rather than efficiency, and thus I will make no attempt to optimize the algorithm in any way. Having a solid foundation and reference point for comparison with the other implementations is important to gain confidence in the final results.

The main loop, line 2 of Algorithm 4.1, goes through all distinct triplets, which can be generated in various ways. The centers of a tree are found by first making three traversals
