
additional detail that must be considered: There are many alternative sets of Horn clauses entailed by the domain theory. The remaining component of the inductive bias is therefore the basis by which PROLOG-EBG chooses among these alternative sets of Horn clauses. As we saw above, PROLOG-EBG employs a sequential covering algorithm that continues to formulate additional Horn clauses until all positive training examples have been covered. Furthermore, each individual Horn clause is the most general clause (weakest preimage) licensed by the explanation of the current training example. Therefore, among the sets of Horn clauses entailed by the domain theory, we can characterize the bias of PROLOG-EBG as a preference for small sets of maximally general Horn clauses. In fact, the greedy algorithm of PROLOG-EBG is only a heuristic approximation to the exhaustive search algorithm that would be required to find the truly shortest set of maximally general Horn clauses. Nevertheless, the inductive bias of PROLOG-EBG can be approximately characterized in this fashion.

Approximate inductive bias of PROLOG-EBG: The domain theory B, plus a preference for small sets of maximally general Horn clauses.
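
To make the covering loop described above concrete, here is a minimal Python sketch. It is not the book's implementation: the example representation, the toy data, and the combined explain-and-regress step are all illustrative assumptions; a real system would prove each example from the Horn-clause domain theory and regress the target concept through that proof.

```python
# Minimal sketch of PROLOG-EBG's sequential covering loop: keep
# formulating Horn clauses until every positive example is covered.
# Representation and helpers are illustrative, not the book's code.

def prolog_ebg(positives, explain_and_regress):
    """Return clause bodies; each is the weakest preimage (the most
    general clause) licensed by one example's explanation."""
    learned = []
    uncovered = list(positives)
    while uncovered:
        example = uncovered[0]
        # Explain the example from the domain theory, then regress the
        # target through the proof to obtain the weakest preimage.
        body = explain_and_regress(example)
        learned.append(body)
        # Drop every positive example the new clause now covers.
        uncovered = [e for e in uncovered
                     if not all(literal(e) for literal in body)]
    return learned

# Toy stand-in for the explain-and-regress step: the proof that x is
# safe to stack on y generalizes to a lighter-than test on weights.
def toy_explain_and_regress(example):
    return [lambda e: e["weight_x"] < e["weight_y"]]

positives = [{"weight_x": 2, "weight_y": 9},
             {"weight_x": 1, "weight_y": 5}]
print(len(prolog_ebg(positives, toy_explain_and_regress)))  # 1 clause
```

Because the single generalized clause covers both positive examples, the loop stops after one iteration, which is exactly the preference for small sets of maximally general clauses at work.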

The most important point here is that the inductive bias of PROLOG-EBG, the policy by which it generalizes beyond the training data, is largely determined by the input domain theory. This lies in stark contrast to most of the other learning algorithms we have discussed (e.g., neural networks, decision tree learning), in which the inductive bias is a fixed property of the learning algorithm, typically determined by the syntax of its hypothesis representation. Why is it important that the inductive bias be an input parameter rather than a fixed property of the learner? Because, as we have discussed in Chapter 2 and elsewhere, there is no universally effective inductive bias and because bias-free learning is futile. Therefore, any attempt to develop a general-purpose learning method must at minimum allow the inductive bias to vary with the learning problem at hand.

On a more practical level, in many tasks it is quite natural to input domain-specific knowledge (e.g., the knowledge about Weight in the SafeToStack example) to influence how the learner will generalize beyond the training data. In contrast, it is less natural to "implement" an appropriate bias by restricting the syntactic form of the hypotheses (e.g., prefer short decision trees). Finally, if we consider the larger issue of how an autonomous agent may improve its learning capabilities over time, then it is attractive to have a learning algorithm whose generalization capabilities improve as it acquires more knowledge of its domain.
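
For illustration, such input knowledge takes the form of Horn clauses. Paraphrasing the SafeToStack domain theory discussed earlier in the chapter (the exact clause forms may differ slightly), it contains rules along the lines of

    SafeToStack(x, y) ← Lighter(x, y)
    Lighter(x, y) ← Weight(x, wx) ∧ Weight(y, wy) ∧ LessThan(wx, wy)
    Weight(x, 5) ← Type(x, Endtable)

It is these clauses, rather than any syntactic restriction on the hypothesis space, that determine how the learner generalizes.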

11.3.4 Knowledge Level Learning

As pointed out in Equation (11.2), the hypothesis h output by PROLOG-EBG follows deductively from the domain theory B and training data D. In fact, by examining the PROLOG-EBG algorithm it is easy to see that h follows directly from B alone, independent of D. One way to see this is to imagine an algorithm that we might call the LEMMA-ENUMERATOR algorithm.
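
The following toy sketch makes this point concrete. It is an illustrative assumption, not the book's code: clauses are simplified to propositional form, and the names are invented for the example. A new Horn clause is derived by unfolding clauses of a stand-in domain theory B, and no training example is consulted anywhere in the derivation.

```python
# Illustrative sketch (assumed names, propositional simplification):
# deriving a new Horn clause from the domain theory B alone.

# Each clause is (head, [body literals]); variables are omitted.
B = {
    "SafeToStack": ("SafeToStack", ["Lighter"]),
    "Lighter":     ("Lighter", ["LessThanWeights"]),
}

def unfold(clause, theory):
    """Replace each body literal that the theory defines with that
    defining clause's body; the result is entailed by the theory."""
    head, body = clause
    new_body = []
    for literal in body:
        if literal in theory:
            new_body.extend(theory[literal][1])
        else:
            new_body.append(literal)
    return (head, new_body)

lemma = unfold(B["SafeToStack"], B)
print(lemma)  # ('SafeToStack', ['LessThanWeights'])
# No training example was consulted: the derived clause, like every
# hypothesis PROLOG-EBG outputs, follows deductively from B alone.
```

The training data D only determines which of the many clauses entailed by B the learner bothers to derive; it adds nothing to what follows from B.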
