anytime algorithms for learning anytime classifiers saher ... - Technion
Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008
[Figure 4.4, reconstructed from the extracted layout]
Test costs: a1 = a2 = a3 = a4 = a5 = a6 = 10$. MC costs: FP = FN = 100$. r = 1.
T1 (root a1, subtrees a3 and a5): four leaves with class counts (5+, 95-), (95+, 5-), (5+, 95-), (95+, 5-); EE = 7.3 at each leaf.
  mcost(T1) = 7.3 * 4 * 100$ / 400 = 7.3$
  tcost(T1) = 20$
  total(T1) = 27.3$
T2 (root a2, subtrees a4 and a6): four leaves with class counts (0+, 50-), (100+, 50-), (0+, 50-), (100+, 50-); EE = 1.4, 54.1, 1.4, 54.1.
  mcost(T2) = (1.4 * 2 + 54.1 * 2) * 100$ / 400 = 27.7$
  tcost(T2) = 20$
  total(T2) = 47.7$
Figure 4.4: Evaluation of tree samples in ACT. The leftmost column defines the
costs: 6 attributes with identical cost and uniform error penalties. T1 was sampled
for a1 and T2 for a2. EE stands for the expected error. Because the total cost of
T1 is lower, ACT would prefer to split on a1.
[Figure 4.5, reconstructed from the extracted layout]
Test costs: a1 = 40$; a2 = a3 = a4 = a5 = a6 = 10$. MC costs: FP = FN = 100$. r = 1.
T1 (root a1, subtrees a3 and a5): four leaves with class counts (5+, 95-), (95+, 5-), (5+, 95-), (95+, 5-); EE = 7.3 at each leaf.
  mcost(T1) = 7.3 * 4 * 100$ / 400 = 7.3$
  tcost(T1) = 50$
  total(T1) = 57.3$
T2 (root a2, subtrees a4 and a6): four leaves with class counts (0+, 50-), (100+, 50-), (0+, 50-), (100+, 50-); EE = 1.4, 54.1, 1.4, 54.1.
  mcost(T2) = (1.4 * 2 + 54.1 * 2) * 100$ / 400 = 27.7$
  tcost(T2) = 20$
  total(T2) = 47.7$
Figure 4.5: Evaluation of tree samples in ACT. The leftmost column defines the
costs: 6 attributes with identical cost (except for the expensive a1) and uniform error
penalties. T1 was sampled for a1 and T2 for a2. Because the total cost of T2 is
lower, ACT would prefer to split on a2.
respectively. The expected error costs of T1 and T2 are:¹

\[
\mathrm{mcost}(T_1) = \frac{1}{400}\,\bigl(4 \cdot EE(100, 5, 0.25)\bigr) \cdot 100\$ = \frac{4 \cdot 7.3}{400} \cdot 100\$ = 7.3\$
\]

\[
\mathrm{mcost}(T_2) = \frac{1}{400}\,\bigl(2 \cdot EE(50, 0, 0.25) \cdot 100\$ + 2 \cdot EE(150, 50, 0.25) \cdot 100\$\bigr) = \frac{2 \cdot 1.4 + 2 \cdot 54.1}{400} \cdot 100\$ = 27.7\$
\]
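The value EE(50, 0, 0.25) = 1.4 can be checked by hand under the assumption that EE is C4.5's pessimistic error estimate, that is, the leaf size multiplied by the upper confidence limit on the leaf's error rate. For a leaf with no training errors that limit has the closed form 1 - cf^(1/N). The sketch below covers only this zero-error case; the function name is hypothetical, and the estimate for leaves that do contain errors relies on an approximation that is not reproduced here.

```python
def expected_errors_pure_leaf(n_examples: int, cf: float = 0.25) -> float:
    """Pessimistic number of errors for a leaf with zero training errors.

    Assumes the C4.5-style upper confidence limit 1 - cf**(1/N) on the
    error rate, scaled by the N examples that reach the leaf.
    """
    if n_examples <= 0:
        raise ValueError("n_examples must be positive")
    if not 0.0 < cf < 1.0:
        raise ValueError("cf must lie strictly between 0 and 1")
    upper_error_rate = 1.0 - cf ** (1.0 / n_examples)
    return n_examples * upper_error_rate


# The pure leaves of T2 hold 50 examples each.
print(round(expected_errors_pure_leaf(50), 1))  # 1.4
```

A smaller cf gives a more pessimistic estimate: with the same 50 examples, lowering cf raises the returned value.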
When both test and error costs are involved, ACT considers their sum. Since<br />
the test cost of both trees is identical (20$), ACT would prefer to split on a1.<br />
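The comparison in Figures 4.4 and 4.5 can be replayed with a short sketch. It is not ACT itself: the EE values are copied from the figures rather than computed, the `Leaf` class and all function names are hypothetical, and the test cost of a tree is assumed to be the average, over the training examples, of the costs of the tests on each example's path (which reproduces the 20$ and 50$ shown in the figures).

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Leaf:
    path: Tuple[str, ...]    # attributes tested on the way to this leaf
    n_examples: int          # training examples reaching the leaf
    expected_errors: float   # EE value as read from the figure


def mcost(leaves: List[Leaf], error_penalty: float) -> float:
    """Expected misclassification cost per example (uniform FP/FN penalty)."""
    n_total = sum(leaf.n_examples for leaf in leaves)
    return sum(leaf.expected_errors for leaf in leaves) * error_penalty / n_total


def tcost(leaves: List[Leaf], test_costs: Dict[str, float]) -> float:
    """Average test cost per example (assumed: sum of costs along the path)."""
    n_total = sum(leaf.n_examples for leaf in leaves)
    weighted = sum(
        leaf.n_examples * sum(test_costs[a] for a in leaf.path) for leaf in leaves
    )
    return weighted / n_total


def total_cost(leaves, test_costs, error_penalty=100.0) -> float:
    return tcost(leaves, test_costs) + mcost(leaves, error_penalty)


# Sampled trees from the figures: T1 is rooted at a1, T2 at a2.
T1 = [Leaf(("a1", "a3"), 100, 7.3), Leaf(("a1", "a3"), 100, 7.3),
      Leaf(("a1", "a5"), 100, 7.3), Leaf(("a1", "a5"), 100, 7.3)]
T2 = [Leaf(("a2", "a4"), 50, 1.4), Leaf(("a2", "a4"), 150, 54.1),
      Leaf(("a2", "a6"), 50, 1.4), Leaf(("a2", "a6"), 150, 54.1)]

uniform = {a: 10.0 for a in ("a1", "a2", "a3", "a4", "a5", "a6")}
expensive_a1 = dict(uniform, a1=40.0)


def preferred_split(test_costs: Dict[str, float]) -> str:
    """Return the root attribute whose sampled tree has the lower total cost."""
    candidates = {"a1": T1, "a2": T2}
    return min(candidates, key=lambda a: total_cost(candidates[a], test_costs))


print(preferred_split(uniform))       # a1
print(preferred_split(expensive_a1))  # a2
```

With the unrounded arithmetic the total for T2 comes out as 47.75$, which the figures report as 47.7$. Raising the cost of a1 to 40$ lifts the total for T1 from 27.3$ to 57.3$, so the preference flips to a2, as in Figure 4.5.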
¹ In this example we set cf to 0.25, as in C4.5. In Section 4.5 we discuss how to tune cf.