07.02.2013 Views

Bioinformatics Algorithms: Techniques and Applications

SUPERTREES AND SUPERNETWORKS

By using the tree mappings introduced in Section 7.2.1, we can define methods for supertree inference based on the idea of retaining a largest set of taxa, obtained by removing those taxa that induce conflicts among all trees or contradictory rooted triples. These methods naturally extend to supertrees the notions of agreement and compatible subtree discussed in the previous section.

A complementary approach to computing a supertree requires that every taxon appearing in at least one input tree also appears in the output supertree, and that all information encoded in the input trees is present there. For this approach too, the notion of tree mapping (especially of tree refinement) is central to formally defining the idea of information preservation.

7.4.1 Models and Problems

The simplest and most general problem that arises in supertree inference is the construction of a compatible supertree.

PROBLEM 7.2 Compatible Supertree
Input: a set 𝒯 = {T1, ..., Tk} of phylogenetic trees.
Output: a tree T displaying all trees in 𝒯.

This formulation has the drawback that such a supertree is not guaranteed to exist, even though the problem looks quite easy to solve: we are looking for a tree T whose set of clusters contains those of the input trees. Moreover, such a supertree exists if and only if no two input clusters (possibly in different trees) overlap. Note that the problem is much harder on unrooted trees than on rooted trees: computing a compatible unrooted supertree displaying all input trees, if one exists, is not only NP-hard [35], but also cannot be done by any generic algorithm (even without time constraints) that is invariant with respect to permutations of the leaf labels [36].
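
The overlap criterion stated above can be sketched in a few lines of Python. This is only a minimal illustration, assuming each rooted input tree is already given as its set of clusters, with every cluster a frozenset of taxon names. The helper names `overlapping` and `find_overlap` are hypothetical and do not come from the text.

```python
from itertools import combinations


def overlapping(a, b):
    """True when clusters a and b share a taxon but neither contains the other."""
    common = a & b
    return bool(common) and common != a and common != b


def find_overlap(trees_clusters):
    """Return some overlapping pair of clusters pooled from all trees, or None.

    trees_clusters: iterable of sets of frozensets, one set per input tree
    (an assumed input format, chosen only for this sketch).
    """
    pooled = set().union(*trees_clusters)
    for a, b in combinations(pooled, 2):
        if overlapping(a, b):
            return a, b
    return None


# Hypothetical example data: clusters {a,b} and {b,c} overlap.
t1 = {frozenset("ab"), frozenset("abc")}
t2 = {frozenset("bc"), frozenset("abc")}
conflict = find_overlap([t1, t2])
```

When `find_overlap` returns `None`, the pooled clusters are pairwise nested or disjoint, which is the condition the text gives for a compatible supertree to exist. Otherwise the returned pair is a witness of the conflict.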

By requiring that the clusters of the supertree displaying all trees preserve some strict relationships between the clusters of the input trees, we obtain a variant of the Compatible Supertree problem that is related to the agreement subtree method.

PROBLEM 7.3 Total Agreement Supertree
Input: a set 𝒯 = {T1, ..., Tk} of phylogenetic trees, with Ti leaf-labeled over Λ(Ti).
Output: a phylogenetic tree T leaf-labeled over S = ∪i≤k Λ(Ti) such that each tree T|Λ(Ti) is homeomorphic to Ti.

Observe that in the Total Agreement Supertree problem the computed tree T satisfies C(T|Λ(Ti)) = C(Ti), whereas the output tree T′ of the Compatible Supertree problem only guarantees that C(Ti) is included in C(T′|Λ(Ti)).
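
The difference between the two conditions can be checked mechanically on small rooted trees. The sketch below is an illustration under stated assumptions, not the chapter's algorithm: trees are encoded as nested tuples with string leaves, the clusters of the restriction of T to the leaf labels of Ti are taken to be the nonempty intersections of the clusters of T with that label set, and all function names are hypothetical.

```python
def clusters(tree):
    """Clusters of a rooted tree given as nested tuples with string leaves."""
    found = set()

    def visit(node):
        if isinstance(node, str):
            below = frozenset([node])
        else:
            below = frozenset().union(*(visit(child) for child in node))
        found.add(below)
        return below

    visit(tree)
    return found


def label_set(tree):
    """All leaf labels of the tree (the union of its clusters)."""
    return frozenset().union(*clusters(tree))


def restricted_clusters(tree, labels):
    """Clusters of the tree restricted to the given labels."""
    labels = frozenset(labels)
    return {c & labels for c in clusters(tree) if c & labels}


def is_total_agreement(supertree, inputs):
    """Equality of cluster sets for every input tree."""
    return all(restricted_clusters(supertree, label_set(t)) == clusters(t)
               for t in inputs)


def displays_all(supertree, inputs):
    """Inclusion of each input tree's clusters in the restricted supertree."""
    return all(clusters(t) <= restricted_clusters(supertree, label_set(t))
               for t in inputs)


# Hypothetical example: a caterpillar supertree and two input trees.
T = ((("a", "b"), "c"), "d")
resolved = (("a", "b"), "c")   # same shape as T restricted to {a, b, c}
star = ("a", "b", "c")         # unresolved: T refines it
```

With these inputs, `resolved` satisfies both checks, while `star` satisfies only `displays_all`: the supertree contains the extra cluster {a, b}, so inclusion holds but equality does not.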
