Myeloid Leukemia


234 Kohlmann et al.

Fig. 5. Concept of support vector machine (SVM)-based classification. The SVM operates by mapping the given training set into a possibly high-dimensional feature space and attempting to locate in that space a plane that separates positive from negative samples. The hyperplane, i.e., a plane in a space with more than three dimensions, corresponds to a nonlinear decision boundary in the input space.

3.4.5. Classification of Samples Based on Gene Expression Patterns: SVM-Based Classification

For classification of microarray data, the support vector machine (SVM) algorithm can be used. SVMs are learning machines that perform binary classification tasks (16–18). A classification task involves training and testing on gene expression profiles, each of which is a data instance. Each instance in the training set contains a “target value” (class label, i.e., leukemia class) and several “attributes” (features, i.e., genes). The goal of this approach is to produce a model that predicts the target values of instances in the testing set given only their attributes. Applied to gene expression data, an SVM would begin with a set of genes that have a common function, e.g., genes that demonstrate differential expression between distinct leukemia subtypes. After nonlinearly mapping the n-dimensional input space into a high-dimensional feature space, a linear classifier is constructed in this high-dimensional feature space (Fig. 5).
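
The idea of mapping inputs nonlinearly and then fitting a linear classifier in the resulting feature space can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-attribute toy samples, the quadratic `feature_map`, and the use of a simple perceptron in place of the SVM's margin-maximizing optimizer (the perceptron finds a separating plane, not necessarily the maximum-margin one).

```python
def feature_map(x):
    """Hypothetical nonlinear map from a 2-D input to a 3-D feature space."""
    x1, x2 = x
    return (x1 * x1, 2 ** 0.5 * x1 * x2, x2 * x2)


def train_perceptron(samples, labels, epochs=1000):
    """Fit a linear classifier w.x + b; labels must be +1 or -1.

    Returns (w, b, converged), where converged is True if a full pass
    over the training set produced no misclassification.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # wrong side of (or on) the plane: update
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            return w, b, True
    return w, b, False


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the plane x falls on."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else -1


# Toy data (assumed): positive samples lie close to the origin,
# negative samples surround them, so no straight line separates them.
inner = [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.5), (0.0, -0.5)]
outer = [(2.0, 0.0), (-2.0, 0.0), (0.0, 2.0), (0.0, -2.0)]
samples = inner + outer
labels = [1] * len(inner) + [-1] * len(outer)

# A linear classifier in the raw input space cannot separate the classes.
_, _, raw_converged = train_perceptron(samples, labels)

# After the nonlinear mapping, a plane in feature space separates them.
mapped = [feature_map(x) for x in samples]
w, b, converged = train_perceptron(mapped, labels)
```

In this toy setting `raw_converged` is `False` and `converged` is `True`: the plane found in the mapped space corresponds to a curved (here circular) decision boundary in the original input space, which is the situation Fig. 5 describes.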

Using this training set, an SVM would learn to discriminate between the types and subtypes of leukemias based on expression data. Having found a separating plane, the SVM can then predict the classification of an unlabeled new sample by mapping it into the feature space and asking on which side of the separating plane the sample lies (Fig. 6). A label is then assigned according to the sample's position relative to the decision boundary (10,19). Multi-class SVM classifiers can be built with linear kernels using the library LIBSVM version 2.36 (www.csie.ntu.edu.tw/~cjlin/libsvm/) (20).
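
How a label follows from the side of a separating plane, and how several binary classifiers can be combined into a multi-class decision, can be sketched as below. This is not LIBSVM's code or API. It assumes a one-against-one voting scheme, which is the multi-class strategy LIBSVM's documentation describes, and every name and number in it (the class names, the two-gene samples, the hyperplane weights) is hypothetical and chosen by hand rather than learned from data.

```python
def decision_value(w, b, x):
    """Signed score of sample x relative to the plane w.x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict_one_vs_one(classifiers, classes, x):
    """Predict a class by majority vote over pairwise linear classifiers.

    classifiers maps a pair (pos, neg) to a plane (w, b); a positive
    decision value is a vote for pos, otherwise for neg. Ties are
    resolved in favor of the class listed first in classes.
    """
    votes = {c: 0 for c in classes}
    for (pos, neg), (w, b) in classifiers.items():
        winner = pos if decision_value(w, b, x) > 0 else neg
        votes[winner] += 1
    return max(classes, key=lambda c: votes[c])


# Hypothetical class names and hand-picked planes over two attributes.
classes = ["class_A", "class_B", "class_C"]
classifiers = {
    ("class_A", "class_B"): ((1.0, -1.0), 0.0),
    ("class_A", "class_C"): ((4.0, 2.0), 2.0),
    ("class_B", "class_C"): ((2.0, 4.0), 2.0),
}

# Unlabeled samples (assumed values), one near each class region.
new_samples = [(1.8, 0.1), (0.2, 1.9), (-1.5, -2.5)]
predictions = [predict_one_vs_one(classifiers, classes, x) for x in new_samples]
print(predictions)  # ['class_A', 'class_B', 'class_C']
```

With three classes, three pairwise planes are needed; in general k classes require k(k-1)/2 binary classifiers under this scheme. In practice the planes would be obtained by training on labeled expression profiles rather than set by hand.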
