13.01.2013 Views

Calcium-Binding Protein Protocols



Multiple Sequence Alignment

the direction and origin of evolution. Some background information on phylogenetic analysis is given in Note 6.

4. Notes

4.1. Methodology Notes

1. In general, two basic classes of alignment programs have been developed: global and local methods. Global alignment programs attempt to align the sequences over their whole length, whereas local programs search only for the most conserved regions and leave the other parts of the sequences unaligned. The most effective alignment algorithm depends on the nature of the sequences to be aligned. Global algorithms produce the most accurate and reliable alignments when all the sequences in the data set are of similar length. However, when the sequences differ greatly in length, local alignment programs are often more successful at identifying the conserved regions.
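
The difference between the two classes can be illustrated with a minimal pairwise sketch. It is not taken from any of the programs discussed here: the function name `alignment_score`, the toy scoring values (match +2, mismatch -1, gap -2), and the example sequences are all assumptions chosen for illustration. The same dynamic programming recurrence is used in both modes; only the treatment of end gaps and negative scores changes.

```python
def alignment_score(a, b, local=False, match=2, mismatch=-1, gap=-2):
    """Best pairwise alignment score by dynamic programming (toy scoring).

    local=False: global mode, both sequences aligned over their whole length.
    local=True:  local mode, only the best-scoring sub-region is scored.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global mode: leading gaps are penalized in the first row and column.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(score[i - 1][j - 1] + sub,   # align two residues
                       score[i - 1][j] + gap,       # gap in b
                       score[i][j - 1] + gap)       # gap in a
            if local:
                cell = max(cell, 0)                 # a local alignment may restart
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


# Hypothetical example: a short motif embedded in a much longer sequence.
short = "DKDGDGTIT"
long_seq = "MAAAQLLEEFKEAFSLF" + short + "TKELGTVMRSLGQNPTEAELQDMINEV"

global_score = alignment_score(short, long_seq)
local_score = alignment_score(short, long_seq, local=True)
print(global_score, local_score)
```

With these toy values the local score reflects only the nine matching motif residues, whereas the global score is dominated by the gap penalties needed to span the longer sequence, which is the situation described above for sequences of very different length.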

The two most-explored computational techniques for multiple sequence alignment are dynamic programming (DP) (39,40) and, more recently, hidden Markov modeling (HMM) (41,42). The DP technique guarantees finding the highest-scoring alignment, where the score is the sum of the amino acid substitution scores minus any insertion/deletion penalties. HMM is a statistical approach that is powerful when applied to sequence database searches. However, HMM approaches to multiple sequence alignment generally perform poorly compared with other methods (43,44), mainly because of the inherently complex parameterization of the technique. As a consequence, the state-of-the-art multiple alignment methods are all based on the DP technique.
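
The kind of score that DP optimizes, substitution scores summed over aligned residues minus penalties for insertions/deletions, can be sketched for a given multiple alignment as follows. This is an illustrative assumption, not the scoring scheme of any particular program: the function name `sum_of_pairs_score` is hypothetical, a flat match/mismatch value stands in for a real substitution matrix, and each residue-gap pair is charged a fixed linear penalty, whereas alignment programs commonly use matrices such as PAM or BLOSUM and separate gap-opening and gap-extension penalties.

```python
from itertools import combinations


def sum_of_pairs_score(alignment, match=2, mismatch=-1, gap=-2):
    """Score a multiple alignment given as equal-length gapped strings.

    Every pair of rows is compared column by column; substitution scores
    are added and each residue-gap pair incurs the gap penalty.
    """
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all rows must have the same length")
    total = 0
    for col in range(length):
        column = [row[col] for row in alignment]
        for x, y in combinations(column, 2):
            if x == "-" and y == "-":
                continue            # gap-gap pairs contribute nothing
            elif x == "-" or y == "-":
                total += gap        # insertion/deletion penalty
            elif x == y:
                total += match      # toy substitution score: identity
            else:
                total += mismatch   # toy substitution score: replacement
    return total


# Two hypothetical alignments of the same three toy sequences.
well_placed_gap = ["DKDGDG", "DKDGNG", "DK-GDG"]
badly_placed_gap = ["DKDGDG", "DKDGNG", "-DKGDG"]

print(sum_of_pairs_score(well_placed_gap))    # 22
print(sum_of_pairs_score(badly_placed_gap))   # 10
```

Placing the gap where it keeps the conserved residues in register yields the higher score, which is the alignment a DP search under this scoring scheme would prefer.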

Some recent evaluations of available multiple alignment techniques have been carried out (45) using a versatile database of benchmark alignments called BAliBASE (45). These showed the method PRRP (46) to be marginally the most accurate, closely followed by CLUSTALX, which is a much faster program. Virtually the same accuracy as CLUSTALX was attained by the PRALINE method when run with default parameters, that is, without strategies such as profile preprocessing or alignment guided by predicted secondary structure. Other methods included in the assessment generally fell behind, such as the local alignment method DIALIGN (47), the HMM-based method HMMT (48), and the Gibbs-sampling method GIBBS (49). It should be stressed, however, that DIALIGN was relatively successful in aligning sequences with very large insertions or deletions. A further discussion of alignment strategies and associated methods can be found in Appendix I.
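
One simple way to quantify accuracy against a benchmark alignment is to count how many of the residue pairs aligned in the reference are also aligned in the test alignment. The sketch below is a simplified pair-based measure in that spirit; it is an assumption for illustration and is not claimed to reproduce the exact scoring procedure used in the evaluations cited above. The function names `aligned_pairs` and `pair_recovery` and the toy alignments are hypothetical.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    Each residue is identified as (sequence index, residue index), so the
    result does not depend on where the gaps happen to be placed.
    """
    counters = [0] * len(alignment)
    pairs = set()
    for col in range(len(alignment[0])):
        present = []
        for s, row in enumerate(alignment):
            if row[col] != "-":
                present.append((s, counters[s]))
                counters[s] += 1
        pairs.update(combinations(present, 2))
    return pairs


def pair_recovery(test, reference):
    """Fraction of reference residue pairs reproduced by the test alignment."""
    ref = aligned_pairs(reference)
    if not ref:
        raise ValueError("reference alignment contains no aligned pairs")
    return len(aligned_pairs(test) & ref) / len(ref)


# Hypothetical reference and test alignments of the same two sequences.
reference = ["ACDE", "A-DE"]
test = ["ACDE", "-ADE"]

print(pair_recovery(test, reference))       # 2 of 3 reference pairs recovered
print(pair_recovery(reference, reference))  # 1.0
```

Both alignments must contain the same sequences in the same order for the comparison to be meaningful.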

2. SRS (Sequence Retrieval System) (23) is a powerful front-end program for accessing a large number of popular sequence databases. In addition to sequences, one can search based on motifs such as those within the PROSITE or PFAM databases. Care must be taken, however, as the algorithms used to create these databases may include false positives and miss genuine family members (false negatives). The documentation must be read carefully to establish which sequences are included/excluded (see Subheading 4.2. for additional practical considerations when using
