01.04.2015 Views

Comparative Genomics-Basic and Applied Research.pdf


quickly, but producing a good one is time consuming, as most optimization criteria are nondeterministic polynomial-time hard (NP-hard). Reliability, the ability of a reconstruction method to return accurate answers on entirely new data sets rather than just those on which it has been tested (and often developed), remains largely unexplored; while systematists are accustomed to getting so-called bootstrap scores for their tree edges or estimates of distributions of trees from their Markov chain Monte Carlo (MCMC) methods, the predictive value of the reconstruction methods and the significance of these quality measures on any given sample data set remain mostly unknown.

Surprises have been encountered time after time as the scale of reconstruction increased; thus, current methods, even if reliably accurate within their current ranges (something we do not know), are not likely to remain so as we move to larger scales.

3.1.5 RECONSTRUCTING THE TREE OF LIFE

Many biologists have been calling for some time for a community effort to attempt the reconstruction of the tree of life, the phylogeny of all organisms on this planet. Such an endeavor naturally has no end, since evolution is an ongoing process, and is not particularly well defined, since thousands of organisms become extinct every year, if not every day. The scale is truly daunting: while we have methods that can reconstruct phylogenies for up to a thousand leaves (and scale poorly beyond that), there are well over a million described species of organisms, and estimates of the existing number vary from ten million to several hundred million. Finally, it is not clear that we need a single giant phylogeny; many of the branches of this phylogeny are well identified and broadly accepted and so could be investigated mostly independently of all others. Yet, the tree of life should hold a special place in the heart of every human: it describes the wonderful diversity of life on this planet, helps us understand where we humans come from and what our place is within the larger scheme of life, and most importantly, gives us a basis to understand where we are all heading. The project to reconstruct this phylogeny also motivates the community to revisit many aspects of phylogenetic analysis, particularly those that have to do with scaling and reliability. After all, there is only one tree of life for this planet, so there will not soon be a chance to compare our reconstruction with one done for another tree of life elsewhere.

In the United States, the National Science Foundation initiated the Assembling the Tree of Life program, which has funded, to date, well over 30 groups collecting, filtering, and analyzing data on all branches of the tree. Through another program, it has also enabled the Cyberinfrastructure for Phylogenetic Research (CIPRES) project (www.phylo.org), which aims to develop the informatics infrastructure (software framework, databases, analysis modules, workflow, and hardware platform) necessary to attack the computational problems that the community will face in attempting a reconstruction of the tree of life. Many other research groups throughout the world are working on the tree of life in some form. The resulting surge of interest in large-scale phylogenetic reconstruction from combinatorialists, statisticians, algorithm designers, high-performance computing specialists, and of course, biologists and biomedical researchers has begun to yield spectacular results.
