28.04.2015 Views

Identification of Pharmacogenomic Biomarker Classifiers in Cancer ...

Identification of Pharmacogenomic Biomarker Classifiers in Cancer ...

Identification of Pharmacogenomic Biomarker Classifiers in Cancer ...

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

Feature selection is usually based on identif<strong>in</strong>g the genes that are differentially expressed<br />

among the classes when considered <strong>in</strong>dividually. For example, if there are two classes,<br />

one can compute a t-test or a modified t-test <strong>in</strong> which a hierarchical variance model is<br />

used for <strong>in</strong>creas<strong>in</strong>g the degrees <strong>of</strong> freedom for estimation <strong>of</strong> the gene-specific with<strong>in</strong>class<br />

variances {Wright, 2003 #284}. The logarithm <strong>of</strong> the expression measurements are<br />

used as the basis <strong>of</strong> the statistical significance tests. The genes that are significantly<br />

differentially expressed at a specified significance level are selected for <strong>in</strong>clusion <strong>in</strong> the<br />

class predictor. The str<strong>in</strong>gency <strong>of</strong> the significance level used controls the number <strong>of</strong><br />

genes that are <strong>in</strong>cluded <strong>in</strong> the model. Although many computationally complex methods<br />

have been published to identify optimal sets <strong>of</strong> genes which together provide good<br />

discrim<strong>in</strong>ation, little compell<strong>in</strong>g evidence currently exists that the computational effort <strong>of</strong><br />

these methods is warranted.<br />

Many algorithms have been used effectively with DNA microarray data for predict<strong>in</strong>g <strong>of</strong><br />

a b<strong>in</strong>ary outcome, e.g. response versus non-response. Dudoit et al. {Dudoit, 2002 #42}<br />

compared several algorithms us<strong>in</strong>g several publicly available data sets. A l<strong>in</strong>ear<br />

discrim<strong>in</strong>ant is a function<br />

l( x) = ∑ wx<br />

i i<br />

(1)<br />

i∈F<br />

where x i denotes the logarithm <strong>of</strong> the expression measurement for the i’th gene, w i is the<br />

weight given to that gene, and the summation is over the set F <strong>of</strong> features (genes) selected<br />

6

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!