13.07.2015 Views

Computing the quartet distance between general trees



Chapter 3

Experimental approach

Algorithms for computing the quartet distance should all produce the same simple result, namely a single number. Once confidence in one algorithm has been established, it is rather simple to verify another such algorithm and build the same level of trust in the correctness of its results. That is one reason for implementing several different algorithms alongside each other, and it is a strong motivation for including the quartic algorithm of Chap. 4, because it is simple and therefore easy to verify by hand. Thorough verification has indeed been practiced, with various small examples and unit tests.

Another way to gain trust is of course to rely on an independent implementation. This has also been done in the effort to cement the correctness of my work. A piece of software called QDist, described in Mailund and Pedersen [13], has been utilized. It is an implementation of the O(n log² n) algorithm presented in Brodal et al. [2], which works only on binary trees. Since there was no working software for general trees at hand, all results involving general trees had to be verified separately.

After implementing three algorithms using completely different approaches, in two different programming languages, my confidence in the correctness is very high.

Needless to say, correctness of the result is essential, but another all-important property is the running time, especially in this thesis, where running time is the quality measure by which the algorithms are compared. Experiments are used to determine the running time of each algorithm in order to verify whether the theoretically promised time bounds are correct, to assess the practical behaviour of the algorithms, which might differ from what is anticipated, and finally to compare the algorithms against one another. For these experiments, a wide range of test data has been used. This is explained in detail in Sec. 3.1. Other details about each experiment, e.g. which input trees have been used and how many times an experiment has been exe-
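To illustrate the idea of a simple reference implementation against which faster algorithms can be checked, the following is a minimal brute-force sketch in Python. It is not the quartic algorithm of Chap. 4 and not code from this thesis; it is only an assumed, easily hand-verifiable baseline. It assumes unrooted trees given as undirected edge lists, takes leaves to be the nodes of degree 1, and classifies each quartet by comparing the three pairwise sums of path lengths (the smallest sum identifies the split; three equal sums mean an unresolved star quartet). The names `tree_from_edges`, `leaf_distances`, `quartet_topology` and `quartet_distance` are hypothetical.

```python
from collections import deque
from itertools import combinations


def tree_from_edges(edges):
    """Build an adjacency map from an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def leaf_distances(adj):
    """Return {leaf: {node: number of edges on the path}} via BFS from each leaf."""
    dist = {}
    for leaf in (v for v, nbrs in adj.items() if len(nbrs) == 1):
        seen = {leaf: 0}
        queue = deque([leaf])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen[w] = seen[u] + 1
                    queue.append(w)
        dist[leaf] = seen
    return dist


def quartet_topology(dist, a, b, c, d):
    """Return 0 for ab|cd, 1 for ac|bd, 2 for ad|bc, or None for a star quartet."""
    sums = [
        dist[a][b] + dist[c][d],
        dist[a][c] + dist[b][d],
        dist[a][d] + dist[b][c],
    ]
    smallest = min(sums)
    # In a tree the two largest sums are always equal, so the minimum is
    # either unique (resolved quartet) or shared by all three (star quartet).
    return sums.index(smallest) if sums.count(smallest) == 1 else None


def quartet_distance(adj1, adj2):
    """Count the quartets whose topology differs between the two trees."""
    d1, d2 = leaf_distances(adj1), leaf_distances(adj2)
    if set(d1) != set(d2):
        raise ValueError("both trees must have the same leaf set")
    return sum(
        1
        for q in combinations(sorted(d1), 4)
        if quartet_topology(d1, *q) != quartet_topology(d2, *q)
    )


# Three trees on the leaves a, b, c, d (u, v, s are internal nodes).
t_ab_cd = tree_from_edges([("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"), ("d", "v")])
t_ac_bd = tree_from_edges([("a", "u"), ("c", "u"), ("u", "v"), ("b", "v"), ("d", "v")])
t_star = tree_from_edges([("a", "s"), ("b", "s"), ("c", "s"), ("d", "s")])

print(quartet_distance(t_ab_cd, t_ab_cd))  # 0
print(quartet_distance(t_ab_cd, t_ac_bd))  # 1
print(quartet_distance(t_ab_cd, t_star))   # 1
```

Because this sketch enumerates every set of four leaves, it is only practical for small trees, which is exactly where a hand-checkable baseline is useful: any faster implementation can be run on the same small inputs and its result compared with this number.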
