wrong greedy decisions. When there are sufficient examples, wrong splits do not significantly diminish accuracy, but result in less comprehensible trees and costlier classification. When there are fewer examples, the problem is intensified because accuracy is adversely affected. In our proposed anytime approach, this problem is tackled by allotting additional time for reaching trees unexplored by greedy algorithms with pruning. Therefore, we view pruning as orthogonal to lookahead and suggest considering it as an additional phase in order to avoid overfitting.
Pruning can be applied in two different ways. First, it can be applied as a second phase, after each final LSID3 tree has been built. We denote this extension by LSID3-p (and BLSID3-p for binary splits). Second, it is worthwhile to examine pruning of the lookahead trees. For this purpose, a post-pruning phase is applied to the trees generated by SID3 to form the lookahead sample, and the size of the pruned trees is considered instead. The expected contribution of pruning in this case is questionable, since it has a similar effect on the entire lookahead sample.
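To make the two variants concrete, the following minimal Python sketch shows where each pruning phase fits. The helpers lsid3, sid3, ebp_prune, and tree_size are hypothetical (an EBP sketch follows the next paragraph); none of them are code from the thesis.

```python
# Where the two pruning variants fit; lsid3(), sid3(), ebp_prune(),
# and tree_size() are hypothetical helpers, not the thesis's code.

def lsid3_p(examples, attributes, r):
    """LSID3-p: run LSID3 as usual, then post-prune the final tree."""
    tree = lsid3(examples, attributes, r)   # phase 1: anytime induction
    ebp_prune(tree)                         # phase 2: error-based pruning
    return tree

def lookahead_estimate(examples, attributes, r, prune_lookahead=False):
    """Estimate a candidate split's quality by the size of the smallest
    of r sampled SID3 trees, optionally pruning each tree first."""
    sizes = []
    for _ in range(r):
        t = sid3(examples, attributes)      # stochastic greedy lookahead tree
        if prune_lookahead:
            ebp_prune(t)                    # variant 2: prune before measuring
        sizes.append(tree_size(t))
    return min(sizes)                       # smaller trees are preferred
```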
Previous comparative studies did not find a single pruning method that is generally best, and concluded that different pruning techniques behave similarly (Esposito, Malerba, & Semeraro, 1997; Oates & Jensen, 1997). Therefore, in our experiments we adopt the same pruning method as C4.5, namely error-based pruning (EBP), and examine post-pruning of the final trees as well as pruning of the lookahead trees.
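For reference, here is a minimal sketch of C4.5-style error-based pruning. The node structure (fields children, n for the examples reaching the node, and errors_as_leaf for the misclassifications if it were collapsed to a leaf) is assumed for illustration; z = 0.6745 corresponds to C4.5's default 25% confidence factor. Subtree raising, which C4.5 also performs, is omitted.

```python
import math

def pessimistic_errors(n, e, z=0.6745):
    """Upper confidence bound on the error count among n examples with
    e observed errors (C4.5's binomial estimate); z = 0.6745 matches
    the default 25% confidence factor."""
    if n == 0:
        return 0.0
    f = e / n
    bound = (f + z * z / (2 * n)
             + z * math.sqrt(f * (1 - f) / n + z * z / (4 * n * n))) \
            / (1 + z * z / n)
    return n * bound

def ebp_prune(node):
    """Bottom-up error-based pruning on a hypothetical node with fields
    .children, .n, and .errors_as_leaf. Returns the pruned subtree's
    estimated error count; prunes in place."""
    if not node.children:
        return pessimistic_errors(node.n, node.errors_as_leaf)
    subtree_err = sum(ebp_prune(child) for child in node.children)
    leaf_err = pessimistic_errors(node.n, node.errors_as_leaf)
    if leaf_err <= subtree_err:   # the leaf's pessimistic error is no worse
        node.children = []        # collapse the subtree into a leaf
        return leaf_err
    return subtree_err
```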
Figure 3.10 describes the spaces of decision trees obtainable by LSID3. Let A be a set of attributes and E be a set of examples. The space of decision trees over A can be partitioned into two: the space of trees consistent with E and the space of trees inconsistent with E. The space of consistent trees includes as a subspace the trees inducible by TDIDT.⁷ The trees that ID3 can produce are a subspace of the TDIDT space. Pruning extends the space of ID3 trees into the inconsistent space. LSID3 can reach trees that are obtainable by TDIDT but not by the greedy methods and their extensions. LSID3-p expands the search space to trees that do not fit the training data perfectly, so that noisy training sets can be handled.
3.5.4 Mapping Contract Time to Sample Size<br />
LSID3 is parameterized by r, the number of times SID3 is invoked for each candidate split. In real-life applications, however, the user supplies the system with an allocation of time rather than a value for r, and this contract time must be mapped to an appropriate sample size.
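One plausible way to realize such a mapping, sketched below purely for illustration, is to time a probe run of SID3 and divide the budget by the estimated cost of one full sampling round. Here sid3 is a hypothetical helper, and this estimation procedure is our own assumption, not necessarily the mapping used by the thesis.

```python
import time

def choose_r(contract_seconds, examples, attributes, min_r=1):
    """Map a contract time to a sample size r: time one probe SID3
    invocation, then divide the budget by the estimated cost of one
    full sampling round (one SID3 tree per candidate attribute).
    sid3() is a hypothetical helper; this cost model is an assumption,
    not the thesis's actual mapping."""
    start = time.perf_counter()
    sid3(examples, attributes)                   # probe invocation
    cost_one = time.perf_counter() - start
    cost_per_round = cost_one * len(attributes)  # r grows in whole rounds
    return max(min_r, int(contract_seconds // cost_per_round))
```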
⁷ TDIDT is a strict subspace because it cannot induce trees that (1) contain a subtree with all leaves marked with the same class, and (2) include an internal node that has no associated training examples.