<strong>SplitsTree</strong>4 – a Java Framework<br />
for Phylogenetic Trees and<br />
Networks<br />
Daniel Huson<br />
www-ab.informatik.uni-tuebingen.de
Copyright (c) 2008 Daniel Huson.<br />
Permission is granted to copy, distribute and/or modify this document<br />
under the terms of the GNU Free Documentation License, Version 1.2<br />
or any later version published by the Free Software Foundation;<br />
with no Invariant Sections, no Front-Cover Texts, and no Back-Cover<br />
Texts. A copy of the license can be found at http://www.gnu.org/copyleft/fdl.html
Trees vs networks<br />
• Evolutionary relationships are usually<br />
represented by phylogenetic trees<br />
• But: real data contain different and/or<br />
conflicting signals, and thus do not always<br />
clearly support a unique tree<br />
• Enter phylogenetic networks, as either:<br />
• simply a visualization of conflicting data, or<br />
• a more complex model of evolution containing<br />
events such as recombination or hybridization
Trees vs Networks<br />
[Figure: two diagrams after Doolittle, 1999, the "Tree of life" and the "Web of life". Each is labeled at the Domain level (Bacteria, Eukaryotes, Archaea) and at the Kingdom level (Proteobacteria, Cyanobacteria, Animals, Fungi, Plants, Archezoa, Euryarchaeota, Crenarchaeota).]
Trees vs Networks<br />
[Figure: Neisseria phylogeny (Eddie Holmes, 1999), shown once as computed using Neighbor-Joining and once as computed using split decomposition.]
Trees and splits<br />
The split encoding Σ(T) of a tree T:<br />
[Figure: a tree on the taxa G1, ..., G8 with one edge e highlighted.]<br />
The edge e defines the split G1, G3, G4, G6, G7 vs G2, G5, G8
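The idea that every edge of a tree defines a split can be sketched in a few lines of Java. This is only an illustration, not SplitsTree code: the class `SplitEncodingSketch`, its methods and the five-taxon example tree ((A,B),C,(D,E)) are all hypothetical. Each split is stored as the side that does not contain a chosen reference taxon.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class SplitEncodingSketch {
    // Undirected adjacency lists of an unrooted tree (node ids are arbitrary integers).
    private final Map<Integer, List<Integer>> adjacency = new HashMap<>();
    // Taxon label attached to each leaf node.
    private final Map<Integer, String> taxonAt = new HashMap<>();

    void addEdge(int u, int v) {
        adjacency.computeIfAbsent(u, k -> new ArrayList<>()).add(v);
        adjacency.computeIfAbsent(v, k -> new ArrayList<>()).add(u);
    }

    void label(int node, String taxon) {
        taxonAt.put(node, taxon);
    }

    // Taxa reachable from 'node' when the edge back to 'parent' is removed.
    private Set<String> taxaBehind(int node, int parent) {
        Set<String> taxa = new TreeSet<>();
        String taxon = taxonAt.get(node);
        if (taxon != null) taxa.add(taxon);
        for (int next : adjacency.get(node)) {
            if (next != parent) taxa.addAll(taxaBehind(next, node));
        }
        return taxa;
    }

    // One split per edge, each stored as the side that does not contain 'reference'.
    Set<Set<String>> splits(String reference) {
        Set<String> all = new TreeSet<>(taxonAt.values());
        Set<Set<String>> result = new HashSet<>();
        for (Map.Entry<Integer, List<Integer>> entry : adjacency.entrySet()) {
            int u = entry.getKey();
            for (int v : entry.getValue()) {
                if (u < v) { // visit each undirected edge once
                    Set<String> side = taxaBehind(v, u);
                    if (side.contains(reference)) {
                        Set<String> other = new TreeSet<>(all);
                        other.removeAll(side);
                        side = other;
                    }
                    result.add(side);
                }
            }
        }
        return result;
    }

    // Hypothetical five-taxon tree ((A,B),C,(D,E)); internal nodes are 0, 1 and 2.
    static SplitEncodingSketch example() {
        SplitEncodingSketch tree = new SplitEncodingSketch();
        int[][] edges = {{0, 3}, {0, 4}, {0, 1}, {1, 5}, {1, 2}, {2, 6}, {2, 7}};
        for (int[] e : edges) tree.addEdge(e[0], e[1]);
        String[] taxa = {"A", "B", "C", "D", "E"};
        for (int i = 0; i < taxa.length; i++) tree.label(3 + i, taxa[i]);
        return tree;
    }

    public static void main(String[] args) {
        for (Set<String> side : example().splits("A")) {
            System.out.println(side + " vs rest");
        }
    }
}
```

The example tree has seven edges and therefore seven splits: five trivial ones that separate a single taxon from the rest, and two non-trivial ones from the internal edges.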
Networks and splits<br />
Cut-set of parallel edges defines split {A,B} vs rest
Splits and splits graphs<br />
• Any given system of splits Σ can be<br />
represented by a splits graph G.<br />
Note that:<br />
• G is a tree iff Σ is compatible<br />
(e.g. Neighbor-Joining)<br />
• G is outer-labeled planar iff Σ is circular<br />
(e.g. Neighbor-Net, Bryant & Moulton 2002)<br />
• G is usually planar or only mildly non-planar if<br />
Σ is weakly compatible (e.g. Split Decomposition)<br />
• G is always a subgraph of the n-dim. hypercube<br />
(e.g. recoding of sequences, spectral analysis, median networks,<br />
consensus networks, Z-super networks)<br />
(Theory of splits worked out by Bandelt and Dress 1992)
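Of the properties listed above, circularity is easy to test once a cyclic ordering of the taxa is given: every split must have a side that forms one contiguous arc of the ordering. The sketch below checks this, assuming taxa are numbered 0..n-1 and each split is stored as a `BitSet` holding one of its sides. The class and method names are hypothetical. Note that it only verifies a given ordering; it does not search for one.

```java
import java.util.BitSet;
import java.util.List;

class CircularSplitsSketch {
    // Builds one side of a split from taxon indices.
    static BitSet side(int... taxa) {
        BitSet bits = new BitSet();
        for (int t : taxa) bits.set(t);
        return bits;
    }

    // A proper split is an arc of the cyclic ordering iff membership
    // changes exactly twice when walking once around the cycle.
    static boolean fitsOrdering(BitSet side, int[] ordering) {
        int changes = 0;
        for (int i = 0; i < ordering.length; i++) {
            boolean here = side.get(ordering[i]);
            boolean next = side.get(ordering[(i + 1) % ordering.length]);
            if (here != next) changes++;
        }
        return changes == 2;
    }

    // True if every split fits the given cyclic ordering.
    static boolean isCircular(List<BitSet> splits, int[] ordering) {
        for (BitSet s : splits) {
            if (!fitsOrdering(s, ordering)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] ordering = {0, 1, 2, 3, 4};
        System.out.println(fitsOrdering(side(0, 1), ordering)); // contiguous arc
        System.out.println(fitsOrdering(side(0, 2), ordering)); // interleaved
    }
}
```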
<strong>SplitsTree</strong> 3.2<br />
Implements split<br />
decomposition and<br />
related methods<br />
First version developed<br />
with Rainer Wetzel in<br />
1995<br />
Current version 3.2 in<br />
C++ using Tcl/Tk<br />
Runs under Linux, Unix,<br />
Windows and MacOS
Design criteria for <strong>SplitsTree</strong>4<br />
• Must run on any machine with minimal<br />
installation requirements<br />
• GUI for interactive use, command-line for<br />
pipelines<br />
• Open system, decentralized plug-in concept<br />
• Based on splits, also including quartets etc<br />
• Based on Nexus file format, with support<br />
for most common formats
Data flow in <strong>SplitsTree</strong><br />
[Diagram: the data blocks Taxa, Unaligned, Characters, Distances, Quartets, Trees, Splits and Graph, together with the blocks Assumptions, Bootstrap and Analysis, joined by connectors.]<br />
• Taxa are represented e.g. by aligned sequences<br />
• Transform characters into distances, e.g. using Hamming distances<br />
• Transform distances into splits, e.g. using Neighbor-Net<br />
• Transform splits into an unrooted or rooted graph or tree<br />
• Every connector represents a data transformation (plug-in)
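As an illustration of one connector in this data flow, the step from characters to distances can be sketched with Hamming distances. The class `HammingDistancesSketch` is hypothetical and not part of SplitsTree. It assumes equal-length aligned sequences and normalizes the number of differing sites by the alignment length. Gaps and ambiguity codes get no special treatment here.

```java
class HammingDistancesSketch {
    // Returns a symmetric matrix of normalized Hamming distances
    // (fraction of alignment columns in which two sequences differ).
    static double[][] hamming(String[] aligned) {
        int n = aligned.length;
        double[][] d = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                int length = aligned[i].length();
                if (length == 0 || aligned[j].length() != length) {
                    throw new IllegalArgumentException("sequences must be non-empty and of equal length");
                }
                int differing = 0;
                for (int site = 0; site < length; site++) {
                    if (aligned[i].charAt(site) != aligned[j].charAt(site)) differing++;
                }
                d[i][j] = d[j][i] = (double) differing / length;
            }
        }
        return d;
    }

    public static void main(String[] args) {
        // Toy alignment, made up for this example.
        double[][] d = hamming(new String[] {"ACGT", "ACGA", "TCGA"});
        for (double[] row : d) System.out.println(java.util.Arrays.toString(row));
    }
}
```

A distance matrix of this kind is the input that a distance-based method such as Neighbor-Net or Neighbor-Joining would take in the next step of the pipeline.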
Writing a new transformation<br />
A new tree-building method “BestTree” is provided to<br />
<strong>SplitsTree</strong> as follows:<br />
public class BestTree implements Distances2Tree<br />
{<br />
    // returns true, if BestTree is applicable<br />
    public boolean isApplicable(Taxa t, Distances d)<br />
    { … }<br />
    // applies BestTree and returns the tree<br />
    public Tree apply(Taxa t, Distances d) { … }<br />
    // communicating options to SplitsTree:<br />
    int getOptionThreshold() { … }<br />
    void setOptionThreshold(int t) { … }<br />
}
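One way a host program could pick up options from accessor pairs named `getOption…` / `setOption…` is Java reflection. The sketch below only illustrates that naming convention. It does not show how SplitsTree itself does this, and every class and method in it is a hypothetical stand-in. It assumes the accessors are declared `public`, because `Class.getMethods()` only returns public methods.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

class OptionDiscoverySketch {
    // Hypothetical stand-in for a plug-in; only the option accessors matter here.
    public static class BestTreeOptions {
        private int threshold = 10; // arbitrary default chosen for this example

        public int getOptionThreshold() { return threshold; }

        public void setOptionThreshold(int t) { threshold = t; }
    }

    // Collects option name -> current value from all public zero-argument getOption* methods.
    static Map<String, Object> readOptions(Object plugin) {
        Map<String, Object> options = new TreeMap<>();
        try {
            for (Method m : plugin.getClass().getMethods()) {
                if (m.getName().startsWith("getOption") && m.getParameterCount() == 0) {
                    String name = m.getName().substring("getOption".length());
                    options.put(name, m.invoke(plugin));
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return options;
    }

    // Sets an int-valued option through the matching setOption* method.
    static void setIntOption(Object plugin, String name, int value) {
        try {
            plugin.getClass().getMethod("setOption" + name, int.class).invoke(plugin, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        BestTreeOptions plugin = new BestTreeOptions();
        System.out.println(readOptions(plugin));
        setIntOption(plugin, "Threshold", 3);
        System.out.println(readOptions(plugin));
    }
}
```

With a convention like this, a GUI or command-line front end can list and set a plug-in's options without knowing the plug-in class at compile time.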
<strong>SplitsTree</strong> Windows<br />
[Screenshots: the Main Window and the Method Window, with the data blocks Taxa, Unaligned, Characters, Distances, Quartets, Trees and Splits.]
<strong>SplitsTree</strong> Editor
Individual Gene Trees<br />
• ITS00: 46 taxa<br />
• ITS03: 40 taxa<br />
• SSU00: 29 taxa<br />
• SSU03: 40 taxa<br />
• Gpd03: 40 taxa
Gene Trees as Super Network<br />
• ITS00 + ITS03<br />
• ITS03 + SSU00<br />
• ITS00 + ITS00 + SSU03<br />
• ITS00 + ITS03 + SSU03 + Gpd03<br />
• ITS00 + ITS03 + SSU00 + SSU03 + Gpd03
Exponential Explosion<br />
Methods like the consensus network,<br />
Z-super network or bootstrap-network<br />
only produce a polynomial number of<br />
splits<br />
The number of nodes and edges of the<br />
corresponding splits graph can grow<br />
exponentially…<br />
How to deal with this?
Incompatibility Graph IG(Σ)<br />
Nodes: splits<br />
Edges: pairs of incompatible splits<br />
Note: A 3-cube in the<br />
splits graph corresponds<br />
to a 3-clique in IG(Σ)
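A minimal sketch of building such an incompatibility graph follows. It assumes taxa are numbered 0..n-1 and each split is given as a `BitSet` holding one of its sides. Two splits A|A' and B|B' are treated as incompatible when all four intersections A∩B, A∩B', A'∩B and A'∩B' are non-empty. The class name, method names and the eight-taxon example are hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

class IncompatibilityGraphSketch {
    // Builds one side of a split from taxon indices.
    static BitSet side(int... taxa) {
        BitSet bits = new BitSet();
        for (int t : taxa) bits.set(t);
        return bits;
    }

    // The other side of a split on taxa 0..n-1.
    static BitSet complement(BitSet a, int n) {
        BitSet c = (BitSet) a.clone();
        c.flip(0, n);
        return c;
    }

    // Incompatible iff all four pairwise intersections of sides are non-empty.
    static boolean incompatible(BitSet a, BitSet b, int n) {
        BitSet na = complement(a, n);
        BitSet nb = complement(b, n);
        return a.intersects(b) && a.intersects(nb) && na.intersects(b) && na.intersects(nb);
    }

    // Adjacency matrix: nodes are splits, edges join incompatible pairs.
    static boolean[][] build(List<BitSet> splits, int n) {
        int m = splits.size();
        boolean[][] graph = new boolean[m][m];
        for (int i = 0; i < m; i++) {
            for (int j = i + 1; j < m; j++) {
                graph[i][j] = graph[j][i] = incompatible(splits.get(i), splits.get(j), n);
            }
        }
        return graph;
    }

    // True if three splits are pairwise incompatible (a 3-clique).
    static boolean hasThreeClique(boolean[][] graph) {
        int m = graph.length;
        for (int i = 0; i < m; i++)
            for (int j = i + 1; j < m; j++)
                for (int k = j + 1; k < m; k++)
                    if (graph[i][j] && graph[j][k] && graph[i][k]) return true;
        return false;
    }

    public static void main(String[] args) {
        // Eight taxa; the first three splits are pairwise incompatible, the last is trivial.
        List<BitSet> splits = Arrays.asList(
                side(1, 3, 5, 7), side(2, 3, 6, 7), side(4, 5, 6, 7), side(0));
        System.out.println(hasThreeClique(build(splits, 8)));
    }
}
```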
All splits of 50 Gene Trees on Archaea<br />
[Figure panels: D=2, D=3, D=4, D=5, D=10]
Hybridization Networks<br />
Currently developing algorithms for<br />
computing hybridization networks and<br />
ancestor recombination graphs
Summary<br />
<strong>SplitsTree</strong> provides a framework for<br />
phylogenetic analysis<br />
Extensibility based on plug-in design<br />
Built on splits, incorporates both tree and<br />
network methods<br />
Provides all popular distance-based tree<br />
building algorithms<br />
Provides network methods such as split<br />
decomposition, Neighbor-Net, consensus<br />
networks and super networks.
Credits<br />
Authors: David Bryant and D.H.<br />
Additional programming:<br />
Tobias Dezulian, Markus Franz,<br />
Miguel Jette, Tobias Kloepper, and<br />
Michael Schröder