17.12.2012 Views

crc press - E-Lib FK UWKS


14.5.1.1 Weight Matrix Methods

Various methods have been developed for the computational detection of molecular signals. When the signal to be detected is highly conserved, searching for its consensus sequence, i.e., a pattern represented by a regular expression such as NX(S/T), would be sufficient. However, because molecular signals usually tolerate functionally conservative mutations, and because the degree of tolerance can vary from position to position, the weight matrix (or position-specific scoring matrix; PSSM) method is usually favored. Namely, based on a compilation of known signals, the contribution of each residue at each position is derived from its frequency and stored as a matrix, which is then scanned over the target sequences to find places whose summed scores exceed a predefined threshold [212].
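The construction and scanning steps can be sketched as follows. This is a minimal illustration, not the procedure of any cited work: the log-odds form, the uniform background frequency, the pseudocount, and the function names `build_pssm` and `scan_pssm` are all assumptions made for the example.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(sites, pseudocount=1.0):
    """Build a log-odds weight matrix from aligned sites of equal length.

    Assumptions: uniform background frequency (1/20) and an additive
    pseudocount so that unseen residues do not give log(0).
    """
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for pos in range(len(sites[0])):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pssm.append(
            {aa: math.log(counts[aa] / total / background) for aa in AMINO_ACIDS}
        )
    return pssm


def scan_pssm(pssm, sequence, threshold):
    """Slide the matrix along the sequence; return (start, score) above threshold."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Residues outside the alphabet contribute 0 (an assumption).
        score = sum(pssm[i].get(aa, 0.0) for i, aa in enumerate(window))
        if score > threshold:
            hits.append((start, score))
    return hits


# Toy data, invented for illustration: four sites matching the NX(S/T) pattern.
toy_sites = ["NAS", "NGT", "NKS", "NLT"]
toy_hits = scan_pssm(build_pssm(toy_sites), "MKNASLLNGTA", threshold=2.0)
```

In practice the threshold is a tuning parameter: raising it reduces false positives at the cost of missing weaker sites.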

The weight matrix method is thus the simplest form of the so-called window-based methods, in which a window of fixed length is examined at each position of the target sequence. This method was successfully applied to signal peptide prediction by von Heijne [213]. The matrix corresponds to positions –13 to +2 relative to the cleavage site. Thus, the sequence features of the h-region and c-region, especially the (–3, –1) rule, should contribute to the score calculation. Two kinds of matrices (for prokaryotes and eukaryotes) were constructed, but the two were rather similar. Although von Heijne's method was originally developed for the detection of cleavage sites, it is also useful for detecting the presence of (cleavable) signal peptides.
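The index arithmetic behind a –13 to +2 window can be made explicit with a short sketch. It assumes the usual convention that cleavage occurs between positions –1 and +1, so there is no position 0 and the matrix has 15 columns; the names `POSITIONS`, `cleavage_index`, and `best_cleavage_site` are hypothetical.

```python
# Column labels of the matrix: -13 .. -1, then +1, +2 (no position 0 is assumed).
POSITIONS = list(range(-13, 0)) + [1, 2]


def cleavage_index(window_start):
    """0-based index of the first mature residue (+1) for a given window start."""
    return window_start + POSITIONS.index(1)


def best_cleavage_site(window_scores):
    """Pick the highest-scoring window and report where cleavage would fall.

    window_scores[i] is the summed matrix score of the window starting at i.
    """
    best_start = max(range(len(window_scores)), key=window_scores.__getitem__)
    return cleavage_index(best_start)
```

Under these assumptions, a window starting at residue index 0 places the predicted cleavage between indices 12 and 13, that is, after the first 13 residues.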

Recently, a prediction method very similar to the weight matrix method was proposed by Chou [214]. The method was then expanded to include the interresidue correlation around the –3, –1, and +1 positions (the subsite coupling model) [215]. That is, conditional probabilities between these residues were introduced. This method showed 92% accuracy in detecting signal peptides in an objective test.
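To illustrate how conditional probabilities differ from independent per-position weights, the sketch below scores a residue triple at (–3, –1, +1) as a simple chain, P(a) · P(b | a) · P(c | b). This chain factorization, the pseudocount, and the function name `train_coupled_scorer` are assumptions chosen for illustration; the exact formulation of the subsite coupling model is given in the cited reference and may differ.

```python
import math
from collections import Counter


def train_coupled_scorer(triples, alphabet_size=20, pseudo=1.0):
    """Return a function giving the log-probability of a (-3, -1, +1) triple.

    triples: iterable of 3-tuples of residues observed at known cleavage sites.
    The chain P(a) * P(b|a) * P(c|b) is an illustrative assumption.
    """
    triples = list(triples)
    n = len(triples)
    first = Counter(t[0] for t in triples)
    middle = Counter(t[1] for t in triples)
    pair_first_mid = Counter((t[0], t[1]) for t in triples)
    pair_mid_last = Counter((t[1], t[2]) for t in triples)

    def log_prob(triple):
        a, b, c = triple
        p_a = (first[a] + pseudo) / (n + pseudo * alphabet_size)
        p_b_given_a = (pair_first_mid[(a, b)] + pseudo) / (
            first[a] + pseudo * alphabet_size
        )
        p_c_given_b = (pair_mid_last[(b, c)] + pseudo) / (
            middle[b] + pseudo * alphabet_size
        )
        return math.log(p_a) + math.log(p_b_given_a) + math.log(p_c_given_b)

    return log_prob
```

A plain weight matrix would multiply three independent terms P(a) · P(b) · P(c); the conditional terms let the score depend on which residues occur together.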

14.5.1.2 Artificial Neural Network-Based Methods

A direct way to improve the simple weight matrix method is to let the pattern representation incorporate the effect of internal gaps or the correlation between different positions. Artificial neural networks (ANNs) have been used in various problems of bioinformatics to meet the latter need [211,216]. The ANN method is a kind of machine-learning method in which a number of numeric parameters are iteratively adjusted (or learned) to better distinguish positive and negative data. The weight matrix method of signal peptide prediction was elaborated using this approach by several authors [217,218]. Among them, the method of Nielsen et al. has been the most successful [173]. In their algorithm, two kinds of networks were constructed for each of three systems (Gram-positive bacteria, Gram-negative bacteria, and eukaryotes). One network outputs the S-score, which indicates the likelihood that an input sequence segment of fixed length (19 and 27 for bacteria and eukaryotes, respectively) is part of a signal peptide; the other network outputs the C-score, which indicates whether the input segment includes a cleavage site. The input sequence is scanned with the two networks, so that both scores are calculated at each position. By combining these scores with a sort of averaging, the presence of signal peptide
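The scanning step with two networks can be sketched as below. This is only a structural illustration: the networks here are tiny, untrained, and randomly initialized, and the one-hot encoding, the hidden-layer size, the centering of each window on the scored residue, the edge padding, and the names `make_network` and `scan_with_network` are all assumptions rather than details of the published method. The same window length is used for both networks purely for simplicity.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(segment):
    """Encode a segment as concatenated 20-dim indicator vectors.

    The padding symbol '-' (and any unknown residue) becomes all zeros.
    """
    vec = []
    for aa in segment:
        vec.extend(1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS)
    return vec


def make_network(window, hidden=2, seed=0):
    """Return an UNTRAINED one-hidden-layer network mapping a segment to (0, 1)."""
    rng = random.Random(seed)
    n_in = window * len(AMINO_ACIDS)
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(hidden)]
    w2 = [rng.uniform(-0.1, 0.1) for _ in range(hidden)]

    def predict(segment):
        x = one_hot(segment)
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
        z = sum(w * hi for w, hi in zip(w2, h))
        return 1.0 / (1.0 + math.exp(-z))  # sigmoid output

    return predict


def scan_with_network(sequence, network, window):
    """Score every position, centering an odd-length window on it (assumption)."""
    half = window // 2
    padded = "-" * half + sequence + "-" * half
    return [network(padded[i:i + window]) for i in range(len(sequence))]


# Hypothetical usage with the bacterial window length of 19.
sequence = "MKKTAIAIAVALAGFATVAQAAPKDNTWYTGAKLGWSQ"
s_scores = scan_with_network(sequence, make_network(19, seed=1), 19)
c_scores = scan_with_network(sequence, make_network(19, seed=2), 19)
```

With trained weights in place of the random ones, the two resulting lists would play the roles of the per-position S-score and C-score.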
