13.07.2015 Views

computing the quartet distance between general trees


CHAPTER 1. INTRODUCTION

has is called the degree of the node. An unrooted tree does not contain internal nodes of degree two. A tree where internal nodes are allowed to be polytomies, that is, they can have any degree equal to or greater than three, is called a general tree. General trees are often used to represent partly resolved relationships where the complete topology is not known and each species therefore cannot be represented by a distinct node. Sometimes branches are assigned a length, which adds the notion of time to the evolution.

The true evolutionary relation among a set of EUs is rarely known. Multiple methods for determining the exact relationship from biological data are available. They do not necessarily agree and might induce different trees. Some methods for inferring relationships will result in a large range of plausible tree reconstructions. Furthermore, multiple data sets, e.g. DNA sequences, describing a single species are often at hand. Thus, one method may yield a different solution for each data set used. Figure 1.1 is an illustration of two alternative relationships inferred for the Panthera (big cats).

[Figure 1.1, leaf labels as listed. First tree (Panthera): Clouded Leopard, Jaguar, Leopard, Lion, Snow Leopard, Tiger. Second tree (Panthera): Clouded Leopard, Tiger, Jaguar, Snow Leopard, Leopard, Lion.]

Figure 1.1: Two alternative views of the relationship between the Panthera (big cats). Note that one is a binary tree whereas the other includes a polytomy and is thus a general tree. Example from Davis et al. [9].

This disagreement between trees introduces the need for some means of assessing trees. One approach is to make a pairwise comparison of trees in an attempt to quantify the differences or similarities.

1.2 Measuring difference or similarity

Various methods for tree comparison have been defined, and each measure has certain properties and takes certain aspects of the trees into consideration. Some can only handle fully resolved trees while others are able to take branch lengths into account. Some metrics consider topological properties only. An example of the latter is the nearest–neighbor interchange metric, proposed by Waterman and Smith [20], defined as the fewest number of nearest–neighbor interchanges required to convert one tree into another. The metric only works for binary trees, and the problem of computing it has been shown to be NP-complete (see DasGupta et al. [8]). In this thesis, the focus will be on general trees. Here, an example is the Robinson–Foulds distance metric, proposed by Robinson and Foulds [15], and also known as the symmetric difference metric. It is defined as the
