Machine Learning - DISCo

(e.g., neural network) fits the data based solely on the description of the hypothesis and data, independent of the task domain under study. In contrast, this formulation allows the domain-specific background information B to become part of the definition of "fit." In particular, h fits the training example (xi, f(xi)) as long as f(xi) follows deductively from B ∧ h ∧ xi.
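This deductive notion of "fit" can be sketched concretely. Assuming we encode B, h, and the instance description xi as propositional Horn clauses (the clause names below are invented for illustration, not taken from the text), h fits a training example exactly when forward chaining over B ∧ h ∧ xi derives f(xi):

```python
# Minimal sketch of the deductive "fit" criterion: h fits (x_i, f(x_i))
# when f(x_i) follows deductively from B ∧ h ∧ x_i.
# Clauses are propositional Horn clauses: (premises, conclusion).

def entails(clauses, goal):
    """Forward-chain over Horn clauses until no new facts appear."""
    derived = set()
    changed = True
    while changed:
        changed = False
        for premises, conclusion in clauses:
            if conclusion not in derived and all(p in derived for p in premises):
                derived.add(conclusion)
                changed = True
    return goal in derived

# Hypothetical background knowledge B, candidate hypothesis h, instance x_i.
B = [((), "parent_tom_bob")]                    # a fact: no premises
h = [(("parent_tom_bob",), "child_bob_tom")]    # the hypothesis clause
x_i = []                                        # instance adds no clauses here
target = "child_bob_tom"                        # the target value f(x_i)

print(entails(B + h + x_i, target))  # True: h fits this training example
```

Note how the background fact in B participates in the derivation: without it, the hypothesis clause alone would not entail the target, so h would not fit.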

• By incorporating background information B, this formulation invites learning methods that use this background information to guide the search for h, rather than merely searching the space of syntactically legal hypotheses. The inverse resolution procedure described in the following section uses background knowledge in this fashion.

At the same time, research on inductive logic programming following this formulation has encountered several practical difficulties.

• The requirement (∀(xi, f(xi)) ∈ D) (B ∧ h ∧ xi) ⊢ f(xi) does not naturally accommodate noisy training data. The problem is that this expression does not allow for the possibility that there may be errors in the observed description of the instance xi or its target value f(xi). Such errors can produce an inconsistent set of constraints on h. Unfortunately, most formal logic frameworks completely lose their ability to distinguish between truth and falsehood once they are given inconsistent sets of assertions.

• The language of first-order logic is so expressive, and the number of hypotheses that satisfy (∀(xi, f(xi)) ∈ D) (B ∧ h ∧ xi) ⊢ f(xi) is so large, that the search through the space of hypotheses is intractable in the general case. Much recent work has sought restricted forms of first-order expressions, or additional second-order knowledge, to improve the tractability of the hypothesis space search.

• Despite our intuition that background knowledge B should help constrain the search for a hypothesis, in most ILP systems (including all discussed in this chapter) the complexity of the hypothesis space search increases as background knowledge B is increased. (However, see Chapters 11 and 12 for algorithms that use background knowledge to decrease rather than increase sample complexity.)
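The first difficulty above can be made concrete. Suppose noisy data records the same instance twice with opposite target values; the entailment requirement then demands that h (together with B and xi) entail both a literal and its negation, an inconsistent constraint set. A trivial check exposes the conflict (all names below are invented for the demonstration):

```python
# Hypothetical demo: noisy data forces inconsistent entailment constraints.
# Literals are strings; "~" marks negation.

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def inconsistent(required):
    """True if the constraints demand both a literal and its negation."""
    return any(negate(c) in required for c in required)

# Noisy training data: instance x1 observed with both target values.
D = [("x1", "f_x1"), ("x1", "~f_x1")]
required = {target for _, target in D}

print(inconsistent(required))  # True: no consistent hypothesis h exists
```

Once such a contradiction enters the assertion set, a classical proof system derives every formula, which is exactly the loss of discriminating power described above.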

In the following section, we examine one quite general inverse entailment operator that constructs hypotheses by inverting a deductive inference rule.

10.7 INVERTING RESOLUTION

A general method for automated deduction is the resolution rule introduced by Robinson (1965). The resolution rule is a sound and complete rule for deductive inference in first-order logic. Therefore, it is sensible to ask whether we can invert the resolution rule to form an inverse entailment operator. The answer is yes, and it is just this operator that forms the basis of the CIGOL program introduced by Muggleton and Buntine (1988).
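Before inverting it, it helps to see the forward rule in its simplest, propositional form: given clauses C1 and C2 containing complementary literals L and ¬L, resolution infers the resolvent (C1 − {L}) ∪ (C2 − {¬L}). A minimal sketch (the clause contents are illustrative):

```python
# Propositional resolution rule: from clauses C1 and C2 containing a
# complementary literal pair L and ¬L, infer (C1 - {L}) ∪ (C2 - {¬L}).
# Clauses are frozensets of string literals; "~" marks negation.

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Return all resolvents obtainable from the two clauses."""
    resolvents = []
    for lit in c1:
        if negate(lit) in c2:
            resolvents.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return resolvents

# C1 = PassExam ∨ ¬KnowMaterial,  C2 = KnowMaterial ∨ ¬Study
C1 = frozenset({"PassExam", "~KnowMaterial"})
C2 = frozenset({"KnowMaterial", "~Study"})
print(resolve(C1, C2))  # one resolvent: PassExam ∨ ¬Study
```

Inverting this operator means running it backwards: given the resolvent and one parent clause, reconstruct a possible second parent, which is how the learner proposes hypothesis clauses.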
