
hypothesis h' in the power set that is identical to h except for its classification of x. And of course if h is in the version space, then h' will be as well, because it agrees with h on all the observed training examples.
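The even split implied by this pairing argument can be checked directly. The following Python sketch (a toy illustration over a made-up four-instance space, not from the text) enumerates the power-set hypothesis space and confirms that the version space's vote is always tied on any unobserved instance:

```python
from itertools import combinations

# X is a tiny instance space; a hypothesis in the unbiased (power-set)
# hypothesis space is just the subset of X that it labels positive.
X = ["x1", "x2", "x3", "x4"]

def power_set(xs):
    for r in range(len(xs) + 1):
        for subset in combinations(xs, r):
            yield frozenset(subset)

# Training data D: observed instances with their labels.
D = {"x1": True, "x2": False}

# The version space: every hypothesis consistent with D.
version_space = [h for h in power_set(X)
                 if all((x in h) == label for x, label in D.items())]

# For each unseen instance, exactly half the version space votes
# positive: pairing each h with the h' that flips only that instance's
# label maps positive voters one-to-one onto negative voters.
for x in ["x3", "x4"]:
    votes_pos = sum(x in h for h in version_space)
    print(x, votes_pos, "of", len(version_space))  # always an even split
```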

2.7.3 The Futility of Bias-Free Learning

The above discussion illustrates a fundamental property of inductive inference: a learner that makes no a priori assumptions regarding the identity of the target concept has no rational basis for classifying any unseen instances. In fact, the only reason that the CANDIDATE-ELIMINATION algorithm was able to generalize beyond the observed training examples in our original formulation of the EnjoySport task is that it was biased by the implicit assumption that the target concept could be represented by a conjunction of attribute values. In cases where this assumption is correct (and the training examples are error-free), its classification of new instances will also be correct. If this assumption is incorrect, however, it is certain that the CANDIDATE-ELIMINATION algorithm will misclassify at least some instances from X.
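The role of this representational bias is easy to see in code. The sketch below uses a simpler conjunctive learner in the spirit of FIND-S rather than the full CANDIDATE-ELIMINATION algorithm, and the attribute tuples are modeled loosely on the EnjoySport task rather than copied from Table 2.1:

```python
def fit_conjunction(positives):
    """Most specific conjunction of attribute values covering all
    positive examples ('?' means the attribute is unconstrained)."""
    h = list(positives[0])
    for x in positives[1:]:
        h = [hi if hi == xi else "?" for hi, xi in zip(h, x)]
    return h

def classify(h, x):
    return all(c == "?" or c == xi for c, xi in zip(h, x))

positives = [
    ("Sunny", "Warm", "Normal", "Strong"),
    ("Sunny", "Warm", "High", "Strong"),
]
h = fit_conjunction(positives)
print(h)  # ['Sunny', 'Warm', '?', 'Strong']

# An instance never observed during training is still classified,
# purely because of the conjunctive representability assumption:
print(classify(h, ("Sunny", "Warm", "Low", "Strong")))  # True
```

If the true concept is not conjunctive, say "Sunny or Warm", no hypothesis in this space can represent it, so the learner is guaranteed to misclassify some instances, which is exactly the failure mode described above.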

Because inductive learning requires some form of prior assumptions, or inductive bias, we will find it useful to characterize different learning approaches by the inductive bias† they employ. Let us define this notion of inductive bias more precisely. The key idea we wish to capture here is the policy by which the learner generalizes beyond the observed training data, to infer the classification of new instances. Therefore, consider the general setting in which an arbitrary learning algorithm L is provided an arbitrary set of training data D_c = {⟨x, c(x)⟩} of some arbitrary target concept c. After training, L is asked to classify a new instance x_i. Let L(x_i, D_c) denote the classification (e.g., positive or negative) that L assigns to x_i after learning from the training data D_c. We can describe this inductive inference step performed by L as follows:

(D_c ∧ x_i) ≻ L(x_i, D_c)

where the notation y ≻ z indicates that z is inductively inferred from y. For example, if we take L to be the CANDIDATE-ELIMINATION algorithm, D_c to be the training data from Table 2.1, and x_i to be the first instance from Table 2.6, then the inductive inference performed in this case concludes that L(x_i, D_c) = (EnjoySport = yes).

Because L is an inductive learning algorithm, the result L(x_i, D_c) that it infers will not in general be provably correct; that is, the classification L(x_i, D_c) need not follow deductively from the training data D_c and the description of the new instance x_i. However, it is interesting to ask what additional assumptions could be added to D_c ∧ x_i so that L(x_i, D_c) would follow deductively. We define the inductive bias of L as this set of additional assumptions. More precisely, we define the inductive bias of L to be a minimal set of assertions B such that, for every new instance x_i, the classification L(x_i, D_c) follows deductively from B ∧ D_c ∧ x_i.

†The term inductive bias here is not to be confused with the term estimation bias commonly used in statistics. Estimation bias will be discussed in Chapter 5.
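One way to read this definition operationally (a brute-force sketch over a made-up three-attribute space, not the book's own construction) is to make the bias B, conjunctive representability, explicit by enumerating every conjunctive hypothesis: whenever all hypotheses consistent with D_c agree on x_i, the classification follows deductively from B ∧ D_c ∧ x_i, and whenever they disagree, it does not:

```python
from itertools import product

# Toy attribute space; values are illustrative, not Table 2.1.
VALUES = [("Sunny", "Rainy"), ("Warm", "Cold"), ("Normal", "High", "Low")]

def all_conjunctions():
    # Each attribute constraint is a specific value or the wildcard '?'.
    return list(product(*[vals + ("?",) for vals in VALUES]))

def classify(h, x):
    return all(c in ("?", xi) for c, xi in zip(h, x))

# Training data D_c for some target concept c.
D_c = [(("Sunny", "Warm", "Normal"), True),
       (("Sunny", "Warm", "High"), True),
       (("Rainy", "Cold", "High"), False)]

# Bias B made explicit: only conjunctive hypotheses are considered.
consistent = [h for h in all_conjunctions()
              if all(classify(h, x) == y for x, y in D_c)]

for x_i in [("Sunny", "Warm", "Low"), ("Rainy", "Warm", "Low")]:
    votes = {classify(h, x_i) for h in consistent}
    # Unanimous vote => entailed by B ∧ D_c ∧ x_i; split => not deducible.
    verdict = f"entailed: {votes.pop()}" if len(votes) == 1 else "not deducible"
    print(x_i, "->", verdict)
```

On this data the first instance is classified positive by every consistent conjunction, so its label is deduced, while the second splits the hypotheses and remains a genuinely inductive leap.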
