13.01.2013 Views

Calcium-Binding Protein Protocols

240 Weljie and Heringa

lineages. The neighbor-joining (NJ) method of Saitou and Nei (37) is favored by many workers in phylogenetic analysis. The method relies on progressive pairwise joining of the nearest sequences (each sequence being represented by a node), such that each time two nodes are joined, they are replaced by a single internal node. The two nodes selected for joining at each step are those that keep the overall tree length at a minimum. The NJ method has the advantage over the UPGMA technique that it does not use the evolutionary (dis)similarity among groups, but is merely a strategy to join sequences and calculate branch lengths without assumptions regarding the rate of evolution. A general advantage of distance methods over parsimony and maximum likelihood methods is that they are usually much less CPU intensive, because they employ a fixed strategy to arrive at a final tree without the need to sample the complete and vast tree space.
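A single joining step can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the commonly used Q-criterion formulation of NJ, in which the pair minimizing Q(i, j) = (n - 2) d(i, j) - r(i) - r(j) is joined (r being the row sums of the distance matrix). The function name `nj_join_step` and the example matrix are hypothetical.

```python
def nj_join_step(dist):
    """Pick the pair of nodes to join in one neighbor-joining step.

    dist is a symmetric n x n distance matrix (list of lists, n >= 3).
    Returns (i, j, length_i, length_j): the indices of the joined pair
    and the branch lengths from each node to the new internal node.
    Assumption: the standard Q-criterion is used for pair selection.
    """
    n = len(dist)
    if n < 3:
        raise ValueError("need at least three nodes")
    row_sums = [sum(row) for row in dist]

    # Find the pair (i, j) with the smallest Q value.
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    _, i, j = best

    # Branch lengths from i and j to the new internal node.
    length_i = 0.5 * dist[i][j] + (row_sums[i] - row_sums[j]) / (2 * (n - 2))
    length_j = dist[i][j] - length_i
    return i, j, length_i, length_j


# Hypothetical distances among five sequences a..e (illustrative only).
example = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(nj_join_step(example))
```

A full tree is obtained by repeating this step: the joined pair is replaced by the new internal node, the distances from that node to the remaining nodes are recomputed, and the procedure continues until only two nodes remain. Note that the two branch lengths need not be equal, which reflects the absence of a uniform-rate assumption.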

The simplest way to calculate the distance between sequences is the percent divergence, which for two aligned sequences is the number of nonidentical matched positions (ignoring positions with gaps) divided by the number of positions considered, expressed as a percentage. When a multiple alignment is used, all positions containing a gap in any of the sequences are commonly ignored. The real evolutionary time since the divergence of two sequences depends on the speed of the evolutionary clock, a matter of ongoing controversy. Even under a uniform clock, sequence identity as a measure of distance underestimates the real number of mutations, because in diverged sequences there is an increasing chance that multiple substitutions have occurred at a site. The greater the divergence, the more the evolutionary times are underestimated. Kimura (59) corrected for this effect by curve fitting, such that the corrected evolutionary time is obtained from the distance $K$ (percent divergence divided by 100) as $K_{\text{corrected}} = -\ln(1.0 - K - K^{2}/5.0)$. The formula applies to cases with a reasonably uniform evolutionary clock and sequence identities of 15% and higher, and it fits the data well for identities higher than 35%. It is good practice to start tree-building routines from such corrected sequence distances, which can be computed with the PHYLIP package (60–62).
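The two calculations described above can be illustrated with a short sketch. The function names `percent_divergence` and `kimura_corrected` are hypothetical, the gap character is assumed to be `-`, and the code is not taken from PHYLIP; it simply applies the definitions given in the text to a pair of aligned sequences.

```python
import math


def percent_divergence(seq_a, seq_b, gap="-"):
    """Percent of nonidentical positions between two aligned sequences.

    Positions where either sequence has a gap are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped positions
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * mismatches / compared


def kimura_corrected(k):
    """Kimura's empirical correction: -ln(1 - K - K^2 / 5).

    k is the fractional divergence (percent divergence divided by 100).
    The logarithm is undefined once 1 - K - K^2/5 <= 0 (K around 0.85).
    """
    arg = 1.0 - k - k * k / 5.0
    if arg <= 0.0:
        raise ValueError("divergence too high for the correction")
    return -math.log(arg)


# Toy aligned sequences (illustrative only).
pd = percent_divergence("ACGT-A", "ACCTTA")
print(pd, kimura_corrected(pd / 100.0))
```

The corrected value is always at least as large as the uncorrected one, and the gap between them widens with increasing divergence, in line with the underestimation described in the text.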

Because a computed evolutionary tree may be the result of a local trap in the search space, it is important to estimate the significance of a particular tree topology and its associated branch lengths. Felsenstein (63) introduced the concept of bootstrapping, which involves resampling the data: alignment positions are randomly selected (with replacement) and placed in some order, and a tree is then generated from the resampled alignment by the original method. This process is repeated a statistically significant number of times. Comparison of the frequencies at which the N - 3 internal branches of the original tree (N is the number of sequences) occur in the bootstrapped trees allows probability estimates of significance. It is common practice to consider frequencies higher than or equal to 95% as supportive for an original tree branch. In this way, bootstrapping tests the stability of groupings given the data set and the method, thus lessening the chance of incorrect tree structures caused by conservative and/or back mutations.
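The resampling and counting steps of the bootstrap can be sketched as below. This is an illustration under stated assumptions rather than a reference implementation: the function names are hypothetical, each internal branch is represented by the set of sequence names on one side of it (a "split"), and the tree-building step that turns a resampled alignment into a set of splits is deliberately left out.

```python
import random


def bootstrap_columns(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    alignment is a list of equal-length aligned sequences. Columns are
    drawn at random with replacement until the replicate has the same
    length as the original alignment.
    """
    length = len(alignment[0])
    cols = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[c] for c in cols) for seq in alignment]


def split_support(original_splits, replicate_splits):
    """Fraction of replicate trees that contain each original branch.

    original_splits: iterable of frozensets (branches of the original tree).
    replicate_splits: list of sets of frozensets, one set per replicate tree.
    """
    n = len(replicate_splits)
    return {
        split: sum(split in rep for rep in replicate_splits) / n
        for split in original_splits
    }


# Toy alignment (illustrative only); a seeded generator makes runs repeatable.
aln = ["ACGTAC", "AGGTAC", "ACGAAC"]
replicate = bootstrap_columns(aln, random.Random(0))
print(replicate)
```

In a complete analysis, each replicate would be passed through the same distance and tree-building procedure as the original alignment, and a branch would be regarded as supported when its value from `split_support` reaches the chosen threshold (0.95 under the convention mentioned above).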

Bootstrapping can be performed for any method that generates a tree from a mul-
