anytime algorithms for learning anytime classifiers saher ... - Technion
Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008
[Figure 5.14: four panels, each plotting misclassification cost (y-axis) against maximal classification cost (x-axis) for C4.5, Uni(r=0,k=16), Uni(r=3,k=16) and Hill(r=3,k=16).]
Figure 5.14: Results for contract classification: the misclassification cost as a function
of the preallocated testing costs contract for Glass (upper-left), AND-OR (upper-right),
MULTI-XOR (lower-left) and KRK (lower-right).
cost-insensitive C4.5.
It is easy to see that across all 4 domains Uni- and Hill-TATA(r = 3) are
dominant. Uniform-TATA(r = 0) is better than C4.5 when the provided contracts
are low. When the contracts can afford using all the attributes, both algorithms
perform similarly. In comparison to Uniform-TATA(r = 0), the anycost behavior
of Uniform-TATA(r = 3) is better: it is monotonic and utilizes testing resources
better.
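To make the contract setting concrete, the following minimal Python sketch shows one way a repertoire of k cost thresholds with uniform gaps could be laid out, and how a threshold might be looked up for a given contract. It is an illustration only, not the thesis implementation: the function names, the cost range of 0 to 300 and the lookup rule (largest threshold not exceeding the contract) are all assumptions made for this example.

```python
from bisect import bisect_right


def uniform_thresholds(min_cost, max_cost, k):
    """Return k cost thresholds spaced at equal gaps from min_cost to max_cost.

    Hypothetical helper: illustrates "uniform gaps" only.
    """
    if k < 2:
        return [max_cost]
    step = (max_cost - min_cost) / (k - 1)
    return [min_cost + i * step for i in range(k)]


def pick_threshold(thresholds, contract):
    """Return the largest threshold that does not exceed the contract.

    Returns None when even the cheapest threshold is above the contract.
    Assumes thresholds are sorted in increasing order.
    """
    i = bisect_right(thresholds, contract)
    return thresholds[i - 1] if i > 0 else None


# Assumed cost range (0 to 300) and repertoire size k = 16, for illustration.
thresholds = uniform_thresholds(0.0, 300.0, 16)
chosen = pick_threshold(thresholds, 130.0)
```

With these assumed values the thresholds fall 20 cost units apart, so a contract of 130 maps to the threshold at 120 and 10 units of the contract go unused. A larger k narrows the gaps and so reduces this kind of waste.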
The differences in performance between Uniform- and Hill-TATA(r = 3) are
interesting. While both algorithms exhibit similar trends, Hill-TATA reaches
better results slightly earlier than Uniform-TATA on 3 out of the 4 domains
(with the exception of KRK). The reason is that Hill-TATA selects the series
of ρ_c values heuristically, rather than by means of blind uniform gaps. As a result,
it can focus on cost ranges where it is worthwhile to build more trees. These
differences are expected to diminish when the repertoires are larger, which enables
Uniform-TATA to cover more contracts. To verify this hypothesis, we repeated
the experiments with k = 32 and indeed the performance differences between the