
Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008

wrong greedy decisions. When there are sufficient examples, wrong splits do not significantly diminish accuracy, but they result in less comprehensible trees and costlier classification. When there are fewer examples, the problem is intensified because accuracy is adversely affected as well. Our proposed anytime approach tackles this problem by allotting additional time for reaching trees unexplored by greedy algorithms with pruning. Therefore, we view pruning as orthogonal to lookahead and suggest considering it as an additional phase in order to avoid overfitting.

Pruning can be applied in two different ways. First, it can be applied as a second phase after each final LSID3 tree has been built. We denote this extension by LSID3-p (and BLSID3-p for binary splits). Second, it is worthwhile to examine pruning of the lookahead trees. For this purpose, a post-pruning phase is applied to the trees generated by SID3 to form the lookahead sample, and the size of the pruned trees is considered. The expected contribution of pruning in this case is questionable, since it has a similar effect on the entire lookahead sample.
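The two variants can be summarized in pseudocode. The procedure names (LSID3, SID3, EBP-PRUNE, SIZE) follow the text, but the exact interfaces are our own sketch, not definitions from the thesis:

```
LSID3-p(E, A, r):                      # variant 1: post-prune the final tree
    T ← LSID3(E, A, r)                 # anytime lookahead induction as usual
    return EBP-PRUNE(T, E)             # separate second pruning phase

EVALUATE-CANDIDATE-PRUNED(E, A, r):    # variant 2: prune the lookahead trees
    for i ← 1 to r:
        T_i ← EBP-PRUNE(SID3(E, A), E) # prune each sampled lookahead tree
    return min over i of SIZE(T_i)     # judge candidates by pruned size
```

In variant 2, because every candidate's lookahead sample is pruned in the same way, the relative ranking of candidates may change little, which is why the text calls its expected contribution questionable.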

Previous comparative studies did not find a single pruning method that is generally the best, and concluded that different pruning techniques behave similarly (Esposito, Malerba, & Semeraro, 1997; Oates & Jensen, 1997). Therefore, in our experiments we adopt the same pruning method as in C4.5, namely error-based pruning (EBP), and examine post-pruning of the final trees as well as pruning of the lookahead trees.
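Error-based pruning estimates a node's error pessimistically, using an upper confidence bound on the binomial error rate (C4.5's default confidence factor is CF = 0.25), and replaces a subtree with a leaf when the leaf's pessimistic error does not exceed that of the subtree. A minimal sketch of that estimate follows; the normal-approximation formula is the standard C4.5-style bound, but the function names and the hard-coded z-value are our own choices, not code from the thesis:

```python
from math import sqrt

# One-sided z-score corresponding to C4.5's default confidence
# factor CF = 0.25 (upper quartile of the standard normal).
Z_CF25 = 0.6745

def pessimistic_errors(errors, n, z=Z_CF25):
    """Pessimistic error count at a node covering n training examples,
    `errors` of which are misclassified: n times the upper confidence
    bound of the binomial error rate (normal approximation)."""
    if n == 0:
        return 0.0
    f = errors / n
    upper = (f + z * z / (2 * n)
             + z * sqrt(f / n - f * f / n + z * z / (4 * n * n))) / (1 + z * z / n)
    return upper * n

def should_prune(leaf_errors, n, subtree_leaves):
    """Replace a subtree with a single leaf if the leaf's pessimistic
    error does not exceed the summed pessimistic errors of the
    subtree's leaves (each given as an (errors, n) pair)."""
    subtree_err = sum(pessimistic_errors(e, m) for e, m in subtree_leaves)
    return pessimistic_errors(leaf_errors, n) <= subtree_err
```

Because even a pure leaf gets a nonzero pessimistic error, a subtree fragmented into many small leaves is penalized heavily, so it can lose to a single slightly impure leaf covering the same examples.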

Figure 3.10 describes the spaces of decision trees obtainable by LSID3. Let A be a set of attributes and E be a set of examples. The space of decision trees over A can be partitioned into two: the space of trees consistent with E and the space of trees inconsistent with E. The space of consistent trees includes as a subspace the trees inducible by TDIDT.7 The trees that ID3 can produce are a subspace of the TDIDT space. Pruning extends the space of ID3 trees to the inconsistent space. LSID3 can reach trees that are obtainable by TDIDT but not by the greedy methods and their extensions. LSID3-p expands the search space to trees that do not fit the training data perfectly, so that noisy training sets can be handled.

3.5.4 Mapping Contract Time to Sample Size<br />

LSID3 is parameterized by r, the number of times SID3 is invoked for each candidate. In real-life applications, however, the user supplies the system with

7 TDIDT is a strict subspace because it cannot induce trees that: (1) contain a subtree with all leaves marked with the same class, and (2) include an internal node that has no associated training examples.

