

<strong>SplitsTree</strong>4 – a Java Framework<br />

for Phylogenetic Trees and<br />

Networks<br />

Daniel Huson<br />

www-ab.informatik.uni-tuebingen.de


Copyright (c) 2008 Daniel Huson.<br />

Permission is granted to copy, distribute and/or modify this document<br />

under the terms of the GNU Free Documentation License, Version 1.2<br />

or any later version published by the Free Software Foundation;<br />

with no Invariant Sections, no Front-Cover Texts, and no Back-Cover<br />

Texts. A copy of the license can be found at http://www.gnu.org/copyleft/fdl.html


Trees vs networks<br />

• Evolutionary relationships are usually<br />

represented by phylogenetic trees<br />

• But: real data contain different and/or<br />

conflicting signals, and thus do not always<br />

clearly support a unique tree<br />

• Enter phylogenetic networks, as either:<br />

• simply a visualization of conflicting data, or<br />

• a more complex model of evolution containing<br />

events such as recombination or hybridization


Trees vs Networks<br />

[Figure: Doolittle, 1999. Two panels, "Tree of life" and "Web of life". Each shows the domains Bacteria, Eukaryotes and Archaea, and the kingdoms Proteobacteria, Cyanobacteria, Animals, Fungi, Plants, Archezoa, Euryarchaeota and Crenarchaeota.]


Trees vs Networks<br />

[Figure: Neisseria phylogeny (Eddie Holmes, 1999), shown as a tree computed using Neighbor-Joining and as a network computed using split decomposition]


Trees and splits<br />

The split encoding Σ(T) of a tree T:<br />

[Figure: tree T with leaves G1 to G8 and an edge labeled e]<br />

The edge e defines the split G1, G3, G4, G6, G7 vs G2, G5, G8
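The split encoding can be sketched in a few lines of Java: deleting an edge of the tree separates the taxa into two sides, so one split is recorded per edge. This is a minimal sketch under assumptions. The class `SplitEncodingSketch`, its methods, and the representation (adjacency lists, one `BitSet` per split) are hypothetical and not taken from SplitsTree; the tree is assumed to have no nodes of degree two.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Illustrative sketch only; names are hypothetical, not SplitsTree API.
class SplitEncodingSketch {

    // Builds adjacency lists for an undirected tree with nodes 0..numNodes-1.
    static List<List<Integer>> adjacency(int numNodes, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int v = 0; v < numNodes; v++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    // Returns one split per edge. leafTaxon[v] is the taxon index (0..numTaxa-1)
    // of leaf v, or -1 if v is an inner node. Each split is stored as the side
    // that does not contain taxon 0.
    static List<BitSet> splitsOf(List<List<Integer>> adj, int[] leafTaxon, int numTaxa) {
        List<BitSet> splits = new ArrayList<>();
        collect(0, -1, adj, leafTaxon, numTaxa, splits);
        return splits;
    }

    // Depth-first traversal: returns the taxa found below v (seen from parent)
    // and records the split defined by the edge (parent, v).
    private static BitSet collect(int v, int parent, List<List<Integer>> adj,
                                  int[] leafTaxon, int numTaxa, List<BitSet> splits) {
        BitSet below = new BitSet(numTaxa);
        if (leafTaxon[v] >= 0) below.set(leafTaxon[v]);
        for (int w : adj.get(v)) {
            if (w != parent) below.or(collect(w, v, adj, leafTaxon, numTaxa, splits));
        }
        if (parent != -1) {
            BitSet side = (BitSet) below.clone();
            if (side.get(0)) side.flip(0, numTaxa); // normalise to the side without taxon 0
            splits.add(side);
        }
        return below;
    }

    public static void main(String[] args) {
        // Quartet tree ((a,b),(c,d)): nodes 0..3 are leaves with taxa 0..3, nodes 4 and 5 are inner.
        List<List<Integer>> adj = adjacency(6, new int[][]{{0, 4}, {1, 4}, {4, 5}, {2, 5}, {3, 5}});
        int[] leafTaxon = {0, 1, 2, 3, -1, -1};
        System.out.println(splitsOf(adj, leafTaxon, 4));
    }
}
```

Storing only the side without taxon 0 gives every split a unique representation, which makes two encodings easy to compare.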


Networks and splits<br />

Cut-set of parallel edges defines split {A,B} vs rest


Splits and splits graphs<br />

• Any given system of splits Σ can be represented by a splits graph G.<br />

Note that:<br />

• G is a tree iff Σ is compatible (e.g. Neighbor-Joining)<br />

• G is outer-labeled planar iff Σ is circular (e.g. Neighbor-Net, Bryant & Moulton 2002)<br />

• G is usually planar or only mildly non-planar if Σ is weakly compatible (e.g. Split Decomposition)<br />

• G is always a subgraph of the n-dim. hypercube (e.g. recoding of sequences, spectral analysis, median networks, consensus networks, Z-super networks)<br />

(Theory of splits worked out by Bandelt and Dress 1992)
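As an illustration of the "circular" condition, the sketch below tests whether every split has a side that forms one contiguous arc in a given circular ordering of the taxa. It is a sketch under assumptions: the class `CircularSplitsSketch` and its methods are hypothetical, and the code only checks a split system against an ordering that is supplied. It does not search for an ordering, which is the harder part of the problem.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Illustrative sketch only; names are hypothetical, not SplitsTree API.
class CircularSplitsSketch {

    // Convenience constructor for one side of a split.
    static BitSet side(int... taxa) {
        BitSet s = new BitSet();
        for (int t : taxa) s.set(t);
        return s;
    }

    // ordering lists the taxa 0..n-1 as read around a circle.
    // A proper split is an interval of the ordering iff membership changes
    // exactly twice while walking once around the circle.
    static boolean isInterval(BitSet side, int[] ordering) {
        int n = ordering.length;
        int changes = 0;
        for (int i = 0; i < n; i++) {
            boolean here = side.get(ordering[i]);
            boolean next = side.get(ordering[(i + 1) % n]);
            if (here != next) changes++;
        }
        return changes == 2; // 0 changes means an empty or full side, i.e. not a split
    }

    // True iff all splits are intervals of the given circular ordering.
    static boolean isCircular(List<BitSet> splits, int[] ordering) {
        for (BitSet s : splits) {
            if (!isInterval(s, ordering)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] ordering = {0, 1, 2, 3, 4};
        List<BitSet> splits = new ArrayList<>();
        splits.add(side(0, 4)); // wraps around the end of the ordering
        splits.add(side(1, 2));
        System.out.println(isCircular(splits, ordering));
    }
}
```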


<strong>SplitsTree</strong> 3.2<br />

Implements split<br />

decomposition and<br />

related methods<br />

First version developed<br />

with Rainer Wetzel in<br />

1995<br />

Current version 3.2 in<br />

C++ using Tcl/Tk<br />

Runs under Linux, Unix,<br />

Windows and MacOS


Design criteria for <strong>SplitsTree</strong>4<br />

• Must run on any machine with minimal<br />

installation requirements<br />

• GUI for interactive use, command-line for<br />

pipelines<br />

• Open system, decentralized plug-in concept<br />

• Based on splits, also including quartets etc.<br />

• Based on Nexus file format, with support<br />

for most common formats


Data flow in <strong>SplitsTree</strong><br />

[Figure: data-flow diagram. Data blocks: Taxa, Unaligned, Characters, Distances, Quartets, Trees, Splits and Graph, together with Assumptions, Bootstrap and Analysis.]<br />

• Taxa are represented e.g. by aligned sequences<br />

• Transform characters into distances, e.g. using Hamming distances<br />

• Transform distances into splits, e.g. using Neighbor-net<br />

• Transform splits into an unrooted or rooted graph or tree<br />

• Every connector represents a data transformation (plug-in)
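The characters-to-distances step can be illustrated with Hamming distances, taken here as the proportion of alignment columns in which two sequences differ. This is a minimal sketch under assumptions. The class `HammingSketch` is hypothetical and not SplitsTree code, sequences are plain strings of equal length, and gaps or ambiguity codes are compared like any other character, which a real implementation may well treat differently.

```java
// Illustrative sketch only; names are hypothetical, not SplitsTree API.
class HammingSketch {

    // Proportion of positions at which two aligned sequences differ.
    static double hamming(String a, String b) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("sequences must be aligned to equal length");
        }
        if (a.isEmpty()) return 0.0;
        int diff = 0;
        for (int k = 0; k < a.length(); k++) {
            if (a.charAt(k) != b.charAt(k)) diff++;
        }
        return (double) diff / a.length();
    }

    // Symmetric distance matrix with zero diagonal for a set of aligned sequences.
    static double[][] distances(String[] seqs) {
        int n = seqs.length;
        double[][] d = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                d[i][j] = hamming(seqs[i], seqs[j]);
                d[j][i] = d[i][j];
            }
        }
        return d;
    }

    public static void main(String[] args) {
        double[][] d = distances(new String[]{"ACGT", "ACGA", "TTTT"});
        for (double[] row : d) System.out.println(java.util.Arrays.toString(row));
    }
}
```

The resulting matrix is the kind of input a distances-to-splits or distances-to-tree transformation would consume.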


Writing a new transformation<br />

A new tree-building method “BestTree” is provided to<br />

<strong>SplitsTree</strong> as follows:<br />

public class BestTree implements Distances2Tree<br />

{<br />

// returns true, if BestTree is applicable<br />

public boolean isApplicable(Taxa t, Distances d)<br />

{ … }<br />

// applies BestTree and returns the tree<br />

public Tree apply(Taxa t, Distances d) { … }<br />

// communicating options to SplitsTree:<br />

int getOptionThreshold() { … }<br />

void setOptionThreshold(int t) { … }<br />

}


<strong>SplitsTree</strong> Windows


<strong>SplitsTree</strong> Windows<br />

[Figure: the Main Window and the Method Window, with the data blocks Taxa, Unaligned, Characters, Distances, Quartets, Trees and Splits]


<strong>SplitsTree</strong> Editor


Individual Gene Trees<br />

ITS00<br />

46 taxa


Individual Gene Trees<br />

ITS03<br />

40 taxa


Individual Gene Trees<br />

SSU00<br />

29 taxa


Individual Gene Trees<br />

SSU03<br />

40 taxa


Individual Gene Trees<br />

Gpd03<br />

40 taxa


Gene Trees as Super Network


Gene Trees as Super Network<br />

ITS00+<br />

ITS03


Gene Trees as Super Network<br />

ITS03+<br />

SSU00


Gene Trees as Super Network<br />

ITS00+<br />

ITS00+<br />

SSU03


Gene Trees as Super Network<br />

ITS00+<br />

ITS03+<br />

SSU03+<br />

Gpd03


Gene Trees as Super Network<br />

ITS00+<br />

ITS03+<br />

SSU00+<br />

SSU03+<br />

Gpd03


Exponential Explosion<br />

Methods like the consensus network,<br />

Z-super network or bootstrap-network<br />

only produce a polynomial number of<br />

splits<br />

The number of nodes and edges of the<br />

corresponding splits graph can grow<br />

exponentially…<br />

How to deal with this?
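To make the growth concrete, here is a small worked example. It assumes a construction in which k pairwise incompatible splits appear in the splits graph as a k-dimensional hypercube Q_k; the symbols k and Q_k are introduced here for illustration only.

```latex
\[
  |V(Q_k)| = 2^{k}, \qquad |E(Q_k)| = k \cdot 2^{k-1}
\]
\[
  k = 10: \quad |V(Q_{10})| = 1024, \qquad |E(Q_{10})| = 10 \cdot 512 = 5120
\]
```

Under that assumption, ten splits are already enough to produce more than a thousand nodes.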


Incompatibility Graph IG(Σ)<br />

Nodes: splits<br />

Edges: pairs of incompatible splits<br />

Note: A 3-cube in the<br />

splits graph corresponds<br />

to a 3-clique in IG(Σ)
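The incompatibility graph can be sketched directly from this definition. Two splits A|A' and B|B' are incompatible when all four intersections A∩B, A∩B', A'∩B and A'∩B' are non-empty. The class `IncompatibilityGraphSketch` and its methods are hypothetical names chosen for illustration, not SplitsTree code. Counting 3-cliques is included only because of the note about 3-cubes.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Illustrative sketch only; names are hypothetical, not SplitsTree API.
class IncompatibilityGraphSketch {

    // Convenience constructor for one side of a split.
    static BitSet side(int... taxa) {
        BitSet s = new BitSet();
        for (int t : taxa) s.set(t);
        return s;
    }

    // Two splits are incompatible iff all four pairwise intersections of their sides are non-empty.
    static boolean incompatible(BitSet a, BitSet b, int numTaxa) {
        BitSet aC = (BitSet) a.clone();
        aC.flip(0, numTaxa);
        BitSet bC = (BitSet) b.clone();
        bC.flip(0, numTaxa);
        return a.intersects(b) && a.intersects(bC) && aC.intersects(b) && aC.intersects(bC);
    }

    // adj[i][j] is true iff splits i and j are incompatible.
    static boolean[][] build(List<BitSet> splits, int numTaxa) {
        int m = splits.size();
        boolean[][] adj = new boolean[m][m];
        for (int i = 0; i < m; i++) {
            for (int j = i + 1; j < m; j++) {
                adj[i][j] = adj[j][i] = incompatible(splits.get(i), splits.get(j), numTaxa);
            }
        }
        return adj;
    }

    static int countEdges(boolean[][] adj) {
        int edges = 0;
        for (int i = 0; i < adj.length; i++)
            for (int j = i + 1; j < adj.length; j++)
                if (adj[i][j]) edges++;
        return edges;
    }

    // Number of 3-cliques, i.e. triples of pairwise incompatible splits.
    static int countTriangles(boolean[][] adj) {
        int triangles = 0;
        for (int i = 0; i < adj.length; i++)
            for (int j = i + 1; j < adj.length; j++)
                for (int k = j + 1; k < adj.length; k++)
                    if (adj[i][j] && adj[j][k] && adj[i][k]) triangles++;
        return triangles;
    }

    public static void main(String[] args) {
        // Taxa 0..7; the first three splits are pairwise incompatible, the last one is trivial.
        List<BitSet> splits = new ArrayList<>();
        splits.add(side(1, 3, 5, 7));
        splits.add(side(2, 3, 6, 7));
        splits.add(side(4, 5, 6, 7));
        splits.add(side(7));
        boolean[][] adj = build(splits, 8);
        System.out.println(countEdges(adj) + " edges, " + countTriangles(adj) + " triangles");
    }
}
```

A graph without edges means the splits are pairwise compatible, which is the case in which the splits graph is a tree.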


All splits of 50 Gene Trees on Archaea<br />

D=2<br />

D=3<br />

D=4<br />

D=5<br />

D=10


Hybridization Networks<br />

Currently developing algorithms for<br />

computing hybridization networks and<br />

ancestral recombination graphs


Summary<br />

<strong>SplitsTree</strong> provides a framework for<br />

phylogenetic analysis<br />

Extensibility based on plug-in design<br />

Built on splits, incorporates both tree and<br />

network methods<br />

Provides all popular distance-based tree<br />

building algorithms<br />

Provides network methods such as split<br />

decomposition, Neighbor-net, consensus<br />

networks and super networks.


Credits<br />

Authors: David Bryant and D.H.<br />

Additional programming:<br />

Tobias Dezulian, Markus Franz,<br />

Miguel Jette, Tobias Kloepper, and<br />

Michael Schröder
