anytime algorithms for learning anytime classifiers saher ... - Technion
anytime algorithms for learning anytime classifiers saher ... - Technion
anytime algorithms for learning anytime classifiers saher ... - Technion
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
<strong>Technion</strong> - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008<br />
Table 3.2: Characteristics of the datasets used<br />
Attributes Max<br />
Dataset Instances Nom. (bin.) Num. domain Classes<br />
Automobile - Make 160 10 (4) 15 8 22<br />
Automobile - Sym. 160 10 (4) 15 22 7<br />
Balance Scale 625 4 (0) 0 5 3<br />
Breast Cancer 277 9 (3) 0 13 2<br />
Connect-4 68557 42 (0) 0 3 3<br />
Corral 32 6 (6) 0 2 2<br />
Glass 214 0 (0) 9 - 7<br />
Iris 150 0 (0) 4 - 3<br />
Monks-1 124+432 6 (2) 0 4 2<br />
Monks-2 169+432 6 (2) 0 4 2<br />
Monks-3 122+432 6 (2) 0 4 2<br />
Mushroom 8124 22 (4) 0 12 2<br />
Solar Flare 323 10 (5) 0 7 4<br />
Tic-Tac-Toe 958 9 (0) 0 3 2<br />
Voting 232 16 (16) 0 2 2<br />
Wine 178 0 (0) 13 - 3<br />
Zoo 101 16 (15) 0 6 7<br />
Numeric XOR 3D 200 0 (0) 6 - 2<br />
Numeric XOR 4D 500 0 (0) 8 - 2<br />
Multiplexer-20 615 20 (20) 0 2 2<br />
Multiplex-XOR 200 11 (11) 0 2 2<br />
XOR-5 200 10 (10) 0 2 2<br />
XOR-5 10% noise 200 10 (10) 0 2 2<br />
XOR-10 10000 20 (20) 0 2 2<br />
ones. 10 Several commonly used UCI datasets were included (among the 17) to<br />
allow comparison with results reported in literature. Table 3.2 summarizes the<br />
characteristics of these datasets while Appendix A gives more detailed descriptions.<br />
To compare the per<strong>for</strong>mance of the different <strong>algorithms</strong>, we will consider<br />
two evaluation criteria over decision trees: (1) generalization accuracy, measured<br />
by the ratio of the correctly classified examples in the testing set, and (2) tree<br />
size, measured by the number of non-empty leaves in the tree.<br />
3.7.2 Fixed Time Comparison<br />
Our first set of experiments compares ID3, C4.5 with its default parameters, ID3k(k<br />
= 2), LSID3(r = 5) and LSID3-p(r = 5). We used our own implementation<br />
<strong>for</strong> all <strong>algorithms</strong>, where the results of C4.5 were validated with WEKA’s implementation<br />
(Witten & Frank, 2005). We first discuss the results <strong>for</strong> the consistent<br />
trees and continue by analyzing the findings when pruning is applied.<br />
10 The artificial datasets are available at http://www.cs.technion.ac.il/∼e<strong>saher</strong>/publications/datasets.<br />
46