anytime algorithms for learning anytime classifiers saher ... - Technion


Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008

Table 3.1: Possible training set for learning the 2-XOR concept a1 ⊕ a2, where a3 and a4 are irrelevant

a1  a2  a3  a4  label
1   0   0   1   +
0   1   0   0   +
0   0   0   0   -
1   1   0   0   -
0   1   1   1   +
0   0   1   1   -
1   0   1   1   +

Figure 3.2: The tree that results when ID3 is invoked on the training set from Table 3.1. (Figure not reproduced; the node labels recoverable from the extracted figure are a4, a1, and a2, with leaves labeled + and -.)

that while random splitting leads to inferior trees, the information gain and Gini index measures are statistically indistinguishable.

The top-down methodology has the advantage of evaluating a potential attribute for a split in the context of the attributes associated with the nodes above it. The local greedy measures, however, consider each of the remaining attributes independently, ignoring potential interaction between different attributes (Mingers, 1989; Kononenko et al., 1997; Kim & Loh, 2001). We refer to the family of learning tasks where the utility of a set of attributes cannot be recognized by examining only subsets of it as tasks with a strong interdependency. Jakulin and Bratko (2003) studied the detection of attribute interactions in machine learning and proposed using the interaction gain to measure attribute interdependencies.
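To make the notion of interdependency concrete, the following minimal sketch estimates an interaction gain on the complete truth table of a1 ⊕ a2. It assumes the common formulation I(A,B;C) - I(A;C) - I(B;C), which may differ in detail from the definition used by Jakulin and Bratko (2003); the function names `entropy`, `mutual_information`, and `interaction_gain` are illustrative and not taken from the thesis.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a sequence of hashable values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """Empirical I(X;Y) = H(X) + H(Y) - H(X,Y) from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def interaction_gain(a, b, c):
    """Assumed formulation: I(A,B;C) - I(A;C) - I(B;C)."""
    joint = mutual_information(list(zip(a, b)), c)
    return joint - mutual_information(a, c) - mutual_information(b, c)


# Complete truth table of the 2-XOR concept (not the sample in Table 3.1).
a1 = [0, 0, 1, 1]
a2 = [0, 1, 0, 1]
label = [x ^ y for x, y in zip(a1, a2)]

gain_a1 = mutual_information(a1, label)   # each attribute alone
gain_a2 = mutual_information(a2, label)
joint_gain = interaction_gain(a1, a2, label)

print(f"I(a1;label) = {gain_a1:.3f}")
print(f"I(a2;label) = {gain_a2:.3f}")
print(f"interaction gain = {joint_gain:.3f}")
```

On this truth table each attribute alone carries no information about the label, while the pair determines it completely, so the whole utility of the pair appears in the interaction term.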

When learning a problem with a strong interdependency, greedy measures can lead to a choice of non-optimal splits. To illustrate the above, let us consider the 2-XOR problem a1 ⊕ a2 with two additional irrelevant attributes, a3 and a4.
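The effect can be checked directly on the training set of Table 3.1. The sketch below computes the information gain of each attribute at the root; it is an illustration under the standard entropy-based definition of information gain, and the names `ROWS`, `ATTRIBUTES`, and `information_gain` are hypothetical helpers, not code from the thesis.

```python
from collections import Counter
from math import log2

# Training set from Table 3.1: (a1, a2, a3, a4, label).
ROWS = [
    (1, 0, 0, 1, "+"),
    (0, 1, 0, 0, "+"),
    (0, 0, 0, 0, "-"),
    (1, 1, 0, 0, "-"),
    (0, 1, 1, 1, "+"),
    (0, 0, 1, 1, "-"),
    (1, 0, 1, 1, "+"),
]
ATTRIBUTES = ["a1", "a2", "a3", "a4"]


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, index):
    """Entropy of the labels minus the weighted entropy after splitting on one attribute."""
    labels = [r[-1] for r in rows]
    remainder = 0.0
    for value in {r[index] for r in rows}:
        subset = [r[-1] for r in rows if r[index] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


gains = {name: information_gain(ROWS, i) for i, name in enumerate(ATTRIBUTES)}
best = max(gains, key=gains.get)

for name in ATTRIBUTES:
    print(f"{name}: {gains[name]:.3f}")
print("greedy root choice:", best)
```

On this sample the irrelevant attribute a4 obtains the highest gain (roughly 0.128 bits, against roughly 0.020 bits for each of a1, a2, and a3), so a greedy learner splits on a4 first even though the concept depends only on a1 and a2.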

