From Protein Structure to Function with Bioinformatics.pdf

44 L.A. Kelley

As an example, the Naive Bayes classifier selects the most likely classification F_nb given the attribute values s_1, s_2, …, s_n. This results in:

F_{nb} = \operatorname{argmax}_{f_j \in F} P(f_j) \prod_i P(s_i \mid f_j)

In general, P(s_i | f_j) is estimated as:

P(s_i \mid f_j) = \frac{n_c + m\,p}{n + m}

where:
n = the number of training examples for which f = f_j
n_c = the number of examples for which f = f_j and s = s_i
p = the a priori estimate for P(s_i | f_j)
m = the equivalent sample size (a weighting term for the prior)

There are clear similarities between this sort of approach and the methods described earlier for the generation of empirical energy functions.

In contrast to generative classifiers, where probabilities are determined from the training examples, discriminative classifiers attempt to directly maximise predictive accuracy on a training set. Neural networks and support vector machines (SVMs) are discriminative classifiers which have been used extensively in computational biology (e.g. Busuttil et al. 2004; Garg et al. 2005; Nguyen and Rajapakse 2003; Bradford and Westhead 2005).

SVMs determine a decision boundary, or hyperplane, that can separate the input data into one of two classes (e.g. fold A or not fold A) based on the value of the feature vector s. In most difficult problems, the data is not separable using a linear function of the input features. SVMs cope with non-linearity by using a kernel function k(s_i, s_j) which measures the similarity of pairs of input examples s_i, s_j. During training, every example, positive and negative, is compared to every other using the kernel function, producing an n × n matrix of similarity values given n training examples. The trick is that the kernel function, which can often be quite simple and fast to compute, takes the data into a higher-dimensional feature space where it can now be linearly separated.
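The n × n similarity matrix described above can be computed directly. Below is a minimal NumPy sketch using a Gaussian (RBF) kernel, which is one common choice of k(s_i, s_j); the feature vectors and the `gamma` value are hypothetical, chosen only for illustration:

```python
import numpy as np

def rbf_kernel_matrix(X, gamma=1.0):
    """Compute the n x n matrix of pairwise similarities k(s_i, s_j)
    with a Gaussian (RBF) kernel: k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq = np.sum(X ** 2, axis=1)
    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y, evaluated for all pairs at once
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

# Four hypothetical feature vectors (e.g. numeric encodings of sequence windows)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
K = rbf_kernel_matrix(X, gamma=0.5)
print(K.shape)               # (4, 4): one similarity value per pair of examples
print(np.allclose(K, K.T))   # True: kernel matrices are symmetric
```

Each entry K[i, j] plays the role of an inner product in the implicit higher-dimensional feature space, which is why training can proceed on K alone without ever constructing that space explicitly.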
The decision boundary determined in this way is usually composed of only a handful of the training examples that lie on the decision boundary itself; these are known as the support vectors, as they 'support' the boundary much as struts can support a building.

SVMs have been used for remote homology detection, including SVM-Fisher (Jaakkola et al. 2000), SVM-k-spectrum (Leslie et al. 2002), SVM-pairwise (Liao and Noble 2003), SVM I-sites (Hou et al. 2003) and SVM-mismatch (Leslie et al. 2004).

All of these techniques are in a sense 'pure' recognition techniques, where no final alignment is produced to permit modelling. Instead, they merely assign a sequence to a class with some probability. This can be useful in some cases, but often one desires a three-dimensional model of the query sequence, and thus some additional system is required for the (non-trivial) alignment stage.
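For comparison with the discriminative methods just described, the generative Naive Bayes rule and the m-estimate given earlier can be turned into a short sketch. The attribute values and class labels below are hypothetical, standing in for the kinds of discrete features (secondary structure state, burial) a fold-recognition classifier might use:

```python
def m_estimate(n_c, n, p, m):
    """P(s_i | f_j) = (n_c + m*p) / (n + m): m 'virtual' examples
    distributed according to the prior p smooth the raw count n_c / n."""
    return (n_c + m * p) / (n + m)

def naive_bayes(example, training, classes, m=2.0):
    """Return argmax over classes f_j of P(f_j) * prod_i P(s_i | f_j)."""
    best_class, best_score = None, -1.0
    for f in classes:
        members = [s for s, label in training if label == f]
        n = len(members)
        score = n / len(training)        # prior P(f_j) from class frequencies
        for i, s_i in enumerate(example):
            n_c = sum(1 for s in members if s[i] == s_i)
            p = 0.5                      # uniform prior over 2 attribute values
            score *= m_estimate(n_c, n, p, m)
        if score > best_score:
            best_class, best_score = f, score
    return best_class

# Toy training set: (attribute tuple, class label)
training = [(("H", "buried"), "foldA"), (("H", "exposed"), "foldA"),
            (("E", "buried"), "notA"),  (("E", "exposed"), "notA")]
print(naive_bayes(("H", "buried"), training, ["foldA", "notA"]))  # foldA
```

Note how the m-estimate keeps every conditional probability strictly positive, so a single unseen attribute/class combination cannot zero out the whole product.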
