anytime algorithms for learning anytime classifiers saher ... - Technion
Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008
[Figure: panel (a) plots LSID3-p Tree Size / ID3 Tree Size against C4.5 Tree Size / ID3 Tree Size; panel (b) plots LSID3-p Accuracy against C4.5 Accuracy.]
Figure 3.22: Performance differences for LSID3-p(5) and C4.5. The upper figure
compares the size of trees induced by each algorithm, measured relative to ID3 (in
percent). The lower figure plots the absolute differences in terms of accuracy.
Binary Splits<br />
By default, LSID3 uses multiway splits, i.e., it builds a subtree for each possible
value of a nominal attribute. Following the discussion in Section 3.5.2, we also
tested how LSID3 performs if binary splits are forced. The tests in this case are
found using exhaustive search.
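As a rough illustration of what an exhaustive search for binary tests on a nominal attribute can look like, the sketch below enumerates every way of dividing the attribute's values into two non-empty groups and keeps the best-scoring division. It is not the thesis implementation: scoring candidates by information gain is an assumption made here for brevity (the criterion used by LSID3 is not restated in this passage), and the names entropy, binary_partitions and best_binary_split are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def binary_partitions(values):
    """Yield every split of `values` into two non-empty subsets, once each."""
    values = sorted(values)
    if len(values) < 2:
        return  # a single value cannot be split
    first, rest = values[0], values[1:]
    # Pinning `first` to the left side avoids emitting mirrored duplicates.
    for size in range(len(rest) + 1):
        for extra in combinations(rest, size):
            left = frozenset((first,) + extra)
            right = frozenset(values) - left
            if right:
                yield left, right


def best_binary_split(column, labels):
    """Return (gain, left_values, right_values), or None if no split exists.

    Assumption: candidates are ranked by information gain, which stands in
    for whatever evaluation the learner actually applies.
    """
    base = entropy(labels)
    best = None
    for left, right in binary_partitions(set(column)):
        in_left = [y for x, y in zip(column, labels) if x in left]
        in_right = [y for x, y in zip(column, labels) if x not in left]
        remainder = (len(in_left) * entropy(in_left)
                     + len(in_right) * entropy(in_right)) / len(labels)
        gain = base - remainder
        if best is None or gain > best[0]:
            best = (gain, left, right)
    return best
```

An attribute with k distinct values has 2^(k-1) - 1 such binary tests, so this enumeration is only practical for attributes with few values, such as the three-valued board squares of Tic-tac-toe.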
To demonstrate the fragmentation problem, we used two datasets. The first
dataset is Tic-tac-toe. When binary splits were forced, the performance of both
C4.5 and LSID3-p improved from 85.8 and 87.2 to 94.1 and 94.8, respectively. As
in the case of multiway splits, the advantage of LSID3-p over C4.5 is statistically