13.01.2013 Views

Calcium-Binding Protein Protocols


246 Weljie and Heringa

much larger length of sequences, which provides the opportunity for a greater number of consensus residues in the C2 family. This leads to many more gapless columns than found with the EF-hand alignments, which enhances the sampling during the bootstrapping calculations. Finally, both alignments resulted in phylogenetic trees that appropriately classify the sequences into subfamilies (see website).
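The link between gapless columns and bootstrap sampling can be illustrated with a minimal sketch. It assumes that only gapless columns are resampled (with replacement) to build each pseudo-alignment, which may differ from the exact procedure of any particular program; the toy sequences and the function names `gapless_columns` and `bootstrap_replicate` are hypothetical and serve only as an illustration.

```python
import random

def gapless_columns(alignment, gap="-"):
    """Return indices of columns in which no sequence has a gap."""
    length = len(alignment[0])
    return [i for i in range(length)
            if all(seq[i] != gap for seq in alignment)]

def bootstrap_replicate(alignment, rng, gap="-"):
    """Resample gapless columns with replacement to build one pseudo-alignment."""
    cols = gapless_columns(alignment, gap)
    sampled = [rng.choice(cols) for _ in cols]
    return ["".join(seq[i] for i in sampled) for seq in alignment]

# Hypothetical toy alignment (not taken from the chapter's data).
toy = ["DKDGDGTIT", "DKD-NGYIS", "DTDGSGKID"]
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicate = bootstrap_replicate(toy, rng)
```

With more gapless columns available, each replicate draws from a larger pool of positions, so the distances computed from the replicates are based on more data.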

3. Annexin proteins: The annexins constitute a heavily studied protein family from a phylogenetic point of view (e.g., refs. 11 and 12), partially because of their important biological function. Another reason is that the annexin repeat is a reasonably common eukaryotic motif; however, the number of orthologs and paralogs found provides a considerable challenge in classification because of the lack of defined identity between proteins from the same species. For example, 10 human annexins have a mean amino acid identity of 49.8 ± 4.1% (12), and estimates of mutational rates suggest that between very different species (e.g., plants and animals) the identity should be much lower. Paradoxically, the annexin repeat itself is particularly interesting because the sequences are all highly homologous and similar in length, with essentially no gaps or deletions, and so lend themselves well to alignment. The alignments of the annexin proteins by PRALINE and CLUSTALX are given in Fig. 3. The degree of similarity in terms of sequence length is immediately striking, and on further analysis the conservation of key residues is also dramatic. The NJ trees created with these alignments showed reasonable stability on bootstrapping; however, neither alignment lends itself well to classification of the annexins into appropriate subfamilies. Presumably such analysis is better suited to alignments based on the entire annexin protein, and not simply on the repeat itself. It should be noted that the most complete classification of subfamilies to date has been accomplished through a combination of DNA and protein analysis, in conjunction with maximum-likelihood methods for phylogenetic analysis (11).
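A summary such as a mean pairwise identity with its spread can be computed from an existing alignment along the following lines. This is a sketch under stated assumptions: identity is counted only over columns in which neither sequence has a gap, which is one of several conventions and not necessarily the one used in ref. 12, and the function names `percent_identity` and `identity_summary` as well as the toy sequences are hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev

def percent_identity(a, b, gap="-"):
    """Percent identical residues over columns where neither sequence is gapped."""
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

def identity_summary(aligned):
    """Mean and sample standard deviation of all pairwise identities."""
    values = [percent_identity(a, b) for a, b in combinations(aligned, 2)]
    return mean(values), stdev(values)

# Hypothetical aligned toy sequences (not annexin data).
toy = ["ACDE", "ACDF", "ACGF"]
avg, sd = identity_summary(toy)
```

Note that the choice of denominator (ungapped columns, shorter sequence length, or full alignment length) changes the reported percentage, so values from different sources are comparable only when the convention is known.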

4.3. Multiple Alignment Method<br />

The problem of finding an optimal or highest scoring alignment of two sequences was solved three decades ago with the DP technique (39), which guarantees the finding of the highest scoring alignment determined from summing amino acid substitution scores minus any insertion/deletion penalties. The amino acid substitution weights are normally given as a 20 × 20 matrix, containing the weights for all possible amino acid exchanges. The insertion/deletion penalties are used to decrease the alignment score when gaps need to be made to optimally match the two sequences. Normally, a pair of gap penalties is used, consisting of an opening penalty applied once for each gap and an extension penalty applied to each gap position incurred. However, when applied to more than two sequences, the calculation of the optimal alignment by multidimensional implementations of the basic dynamic programming algorithm for sequence pairs (64) becomes computationally infeasible. Even with localized
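The pairwise scoring scheme described in this section (substitution scores minus an opening penalty per gap and an extension penalty per gap position) can be sketched as a three-state dynamic programming recursion. This is an illustrative sketch and not the implementation of any cited program: the function name `align_score`, the default penalty values, and the simple match/mismatch function standing in for a 20 × 20 substitution matrix are all assumptions. Only the optimal global score is computed, not the alignment itself.

```python
NEG = float("-inf")

def align_score(a, b, subst, gap_open=10.0, gap_extend=1.0):
    """Best global alignment score with an opening and an extension penalty.

    A gap of length k costs gap_open + k * gap_extend (assumed convention).
    """
    n, m = len(a), len(b)
    # M: a[i-1] aligned with b[j-1]; X: a[i-1] against a gap; Y: b[j-1] against a gap.
    M = [[NEG] * (m + 1) for _ in range(n + 1)]
    X = [[NEG] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        X[i][0] = -gap_open - i * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = -gap_open - j * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = subst(a[i - 1], b[j - 1])
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Open a new gap (from M or the other gap state) or extend the current one.
            X[i][j] = max(M[i - 1][j] - gap_open - gap_extend,
                          Y[i - 1][j] - gap_open - gap_extend,
                          X[i - 1][j] - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open - gap_extend,
                          X[i][j - 1] - gap_open - gap_extend,
                          Y[i][j - 1] - gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])

def simple_subst(x, y):
    """Hypothetical stand-in for a 20 x 20 substitution matrix."""
    return 2.0 if x == y else -1.0

score_same = align_score("ACD", "ACD", simple_subst)
score_gap = align_score("ACD", "AD", simple_subst, gap_open=2.0, gap_extend=1.0)
```

The tables have (n + 1)(m + 1) cells for two sequences of lengths n and m; extending the same recursion to N sequences requires a table with one dimension per sequence, which is why the cost grows so quickly with the number of sequences.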
