anytime algorithms for learning anytime classifiers saher ... - Technion
Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008
Table 5.1: Characteristics of the datasets used to evaluate TATA

                             Attributes
Dataset         Instances  Nominal (binary)  Numeric  Max domain  Classes
Breast Cancer   277        9 (3)             0        13          2
Bupa            345        0 (0)             5        -           2
Car             1728       6 (0)             0        4           4
Flare           323        10 (5)            0        7           4
Glass           214        0 (0)             9        -           7
Heart           296        8 (4)             5        4           2
Hepatitis       154        13 (13)           6        2           2
Iris            150        0 (0)             4        -           3
KRK             28056      6 (0)             0        8           17
Monks-1         124+432    6 (2)             0        4           2
Monks-2         169+432    6 (2)             0        4           2
Monks-3         122+432    6 (2)             0        4           2
Multiplexer-20  615        20 (20)           0        2           2
Multi-XOR       200        11 (11)           0        2           2
Multi-AND-OR    200        11 (11)           0        2           2
Nursery         8703       8 (8)             0        5           5
Pima            768        0 (0)             8        -           2
TAE             151        4 (1)             1        26          3
Tic-Tac-Toe     958        9 (0)             0        3           2
Titanic         2201       3 (2)             0        4           2
Thyroid         3772       15 (15)           5        2           3
Voting          232        16 (16)           0        2           2
Wine            178        0 (0)             13       -           3
XOR 3D          200        0 (0)             6        -           2
XOR-5           200        10 (10)           0        2           2

administer any test and thus their performance is identical. At the other end,
when ρc ≥ ρc_max, the attribute costs are actually not a constraint. In this case
TATA(r = 5) performed best, confirming the results reported in Chapter 4 when
misclassification costs were dominant. The more interesting ρc values are those
in between. Table 5.2 lists the normalized area under the misclassification cost
curve over the range [33%−99%] of ρc_max. Confirming the curves, the results indicate
that TATA(r = 5) has the best overall performance. The Wilcoxon test (Demsar,
2006), which compares classifiers over multiple datasets, finds TATA(r = 5) to
be significantly better than all the other algorithms.
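
The two summary statistics mentioned above can be illustrated with a minimal sketch. It assumes that the normalized area is the trapezoidal area under a sampled (ρc, misclassification cost) curve, restricted to [33%, 99%] of ρc_max and divided by the width of that range and by a reference cost; the exact normalization used for Table 5.2 is not stated in this section, so `cost_ref` and all function names here are hypothetical. The second function computes the two rank sums of the Wilcoxon signed-rank test for paired per-dataset scores; it does not compute a p-value.

```python
def interpolate(xs, ys, x):
    """Linearly interpolate y at x; xs must be sorted in ascending order."""
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            if x1 == x0:
                return y0
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the sampled range")


def normalized_area(rho, cost, rho_max, cost_ref, lo_frac=0.33, hi_frac=0.99):
    """Mean cost over [lo_frac, hi_frac] * rho_max, divided by cost_ref.

    Assumption: this normalization is illustrative, not necessarily the
    one used in the thesis.
    """
    lo, hi = lo_frac * rho_max, hi_frac * rho_max
    # Keep the interior sample points and add interpolated endpoints.
    xs = [lo] + [x for x in rho if lo < x < hi] + [hi]
    ys = [interpolate(rho, cost, x) for x in xs]
    # Trapezoidal rule over consecutive points.
    area = sum((x1 - x0) * (y0 + y1) / 2
               for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]))
    return area / ((hi - lo) * cost_ref)


def wilcoxon_signed_rank(a, b):
    """Return (W_plus, W_minus) for paired samples a and b.

    Zero differences are dropped; tied absolute differences share
    the average of the ranks they occupy.
    """
    diffs = sorted((x - y for x, y in zip(a, b) if x != y), key=abs)
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(diffs):
        j = i
        while j < len(diffs) and abs(diffs[j]) == abs(diffs[i]):
            j += 1
        avg = (i + 1 + j) / 2  # average of the ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg
        i = j
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus
```

In this sketch, one normalized area would be computed per algorithm and dataset, and the resulting paired lists for two algorithms would be passed to `wilcoxon_signed_rank`. Because lower cost is better, a small `W_plus` for `a` against `b` suggests that `a` tends to incur lower cost; significance would still have to be read from a Wilcoxon table or a statistics library.
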
As expected, all five algorithms improve as ρc increases because they
can use more features. For ρc values slightly larger than ρc_min we can see that EG2,
which is cost-sensitive, performs better than C4.5. The reason is that EG2 takes
attribute costs into account and hence prefers lower-cost attributes. As ρc
increases and the cost constraints relax, C4.5 becomes better than
EG2.
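
EG2's preference for cheaper attributes can be made concrete with a small sketch. It assumes the Information Cost Function usually attributed to EG2 (Núñez, 1991), ICF = (2^ΔI − 1) / (C + 1)^w, where ΔI is the information gain of an attribute, C is its cost, and w in [0, 1] controls how strongly cost is penalized. The gain and cost values below are made-up toy numbers, and the function names are hypothetical.

```python
def eg2_icf(info_gain, cost, w=1.0):
    """EG2-style Information Cost Function: (2^gain - 1) / (cost + 1)^w."""
    return (2 ** info_gain - 1) / (cost + 1) ** w


def pick_attribute(candidates, w=1.0):
    """Return the attribute name with the highest ICF.

    candidates maps an attribute name to a (info_gain, cost) pair.
    """
    return max(candidates, key=lambda name: eg2_icf(*candidates[name], w=w))


# Toy candidates (illustrative values only, not taken from the thesis).
candidates = {
    "cheap": (0.30, 1.0),       # modest gain, low test cost
    "expensive": (0.45, 20.0),  # higher gain, high test cost
}

cost_sensitive_choice = pick_attribute(candidates, w=1.0)
cost_blind_choice = pick_attribute(candidates, w=0.0)
```

With w = 1 the cheaper attribute wins despite its lower gain, whereas with w = 0 costs are ignored and the ranking follows information gain alone. The latter is similar in spirit to, though not the same as, C4.5's gain-ratio criterion, which is consistent with the crossover described above: a cost-blind choice pays off only once the cost bound is loose enough to afford the informative but expensive tests.
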