18.11.2012 Views

anytime algorithms for learning anytime classifiers saher ... - Technion


Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008

[Figure 4.4, diagram not reproduced. Recoverable content:

Costs: test costs a1 = a2 = a3 = a4 = a5 = a6 = 10; MC costs FP = FN = 100; r = 1.

T1: a1 at the root, with a3 and a5 tested below it. Four leaves with class counts (5+, 95-), (95+, 5-), (5+, 95-), (95+, 5-), each with EE = 7.3.
mcost(T1) = 7.3*4*100$ / 400 = 7.3$
tcost(T1) = 20$
total(T1) = 27.3$

T2: a2 at the root, with a4 and a6 tested below it. Four leaves with class counts (0+, 50-), (100+, 50-), (0+, 50-), (100+, 50-), with EE = 1.4, 54.1, 1.4, 54.1.
mcost(T2) = (1.4*2 + 54.1*2) * 100$ / 400 = 27.7$
tcost(T2) = 20$
total(T2) = 47.7$]

Figure 4.4: Evaluation of tree samples in ACT. The leftmost column defines the costs: 6 attributes with identical cost and uniform error penalties. T1 was sampled for a1 and T2 for a2. EE stands for the expected error. Because the total cost of T1 is lower, ACT would prefer to split on a1.

[Figure 4.5, diagram not reproduced. Recoverable content:

Costs: test costs a1 = 40, a2 = a3 = a4 = a5 = a6 = 10; MC costs FP = FN = 100; r = 1.

T1: a1 at the root, with a3 and a5 tested below it. Four leaves with class counts (5+, 95-), (95+, 5-), (5+, 95-), (95+, 5-), each with EE = 7.3.
mcost(T1) = 7.3*4*100$ / 400 = 7.3$
tcost(T1) = 50$
total(T1) = 57.3$

T2: a2 at the root, with a4 and a6 tested below it. Four leaves with class counts (0+, 50-), (100+, 50-), (0+, 50-), (100+, 50-), with EE = 1.4, 54.1, 1.4, 54.1.
mcost(T2) = (1.4*2 + 54.1*2) * 100$ / 400 = 27.7$
tcost(T2) = 20$
total(T2) = 47.7$]

Figure 4.5: Evaluation of tree samples in ACT. The leftmost column defines the costs: 6 attributes with identical cost (except for the expensive a1) and uniform error penalties. T1 was sampled for a1 and T2 for a2. Because the total cost of T2 is lower, ACT would prefer to split on a2.
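The comparison made in Figures 4.4 and 4.5 can be replayed with a few lines of Python. This is only a sketch: it assumes that tcost is the per-example average of the test costs paid along the path to a leaf (which reproduces the 20$ and 50$ values in the figures), it takes the misclassification costs as reported there, and every function and variable name is hypothetical.

```python
# Values are read off Figures 4.4 and 4.5. All names are hypothetical and
# serve only to illustrate the comparison.

def average_test_cost(leaves, test_costs):
    """Average cost of the tests an example pays on its path to a leaf.

    leaves: list of (attributes tested on the path, examples in the leaf).
    Assumption: tcost is averaged over all training examples.
    """
    n_examples = sum(count for _, count in leaves)
    paid = sum(count * sum(test_costs[a] for a in path)
               for path, count in leaves)
    return paid / n_examples

# T1 was sampled for a1, T2 for a2; leaf sizes follow the class counts.
T1_LEAVES = [(("a1", "a3"), 100)] * 2 + [(("a1", "a5"), 100)] * 2
T2_LEAVES = [(("a2", "a4"), 50), (("a2", "a4"), 150),
             (("a2", "a6"), 50), (("a2", "a6"), 150)]
SAMPLED = {"a1": T1_LEAVES, "a2": T2_LEAVES}

# Misclassification costs in $, as reported in the figures.
MCOST = {"a1": 7.3, "a2": 27.7}

def total_costs(test_costs):
    """Total cost (test cost plus misclassification cost) per candidate."""
    return {attr: average_test_cost(leaves, test_costs) + MCOST[attr]
            for attr, leaves in SAMPLED.items()}

def preferred_split(test_costs):
    """Candidate attribute whose sampled tree has the lowest total cost."""
    totals = total_costs(test_costs)
    return min(totals, key=totals.get)

uniform = {f"a{i}": 10 for i in range(1, 7)}   # Figure 4.4
expensive_a1 = {**uniform, "a1": 40}           # Figure 4.5

print(preferred_split(uniform))
print(preferred_split(expensive_a1))
```

With uniform test costs the sampled tree for a1 is cheaper in total; once a1 costs 40, every path in T1 pays for it and the preference moves to a2.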

respectively. The expected error costs of T1 and T2 are:¹

\[
\mathrm{mcost}(T1) = \frac{1}{400} \left( 4 \cdot EE(100, 5, 0.25) \right) \cdot 100\$ = \frac{4 \cdot 7.3}{400} \cdot 100\$ = 7.3\$
\]

\[
\mathrm{mcost}(T2) = \frac{1}{400} \left( 2 \cdot EE(50, 0, 0.25) \cdot 100\$ + 2 \cdot EE(150, 50, 0.25) \cdot 100\$ \right) = \frac{2 \cdot 1.4 + 2 \cdot 54.1}{400} \cdot 100\$ = 27.7\$
\]

When both test and error costs are involved, ACT considers their sum. Since the test cost of both trees is identical (20$), the comparison is decided by the misclassification cost, which is lower for T1 (7.3$ versus 27.7$). ACT would therefore prefer to split on a1.
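The arithmetic behind the two mcost values can be checked with a small Python sketch. It takes the per-leaf expected errors (EE) as given in the text rather than recomputing them, and it relies on the uniform error penalty (FP = FN = 100$) so that a single penalty value suffices. The function name is hypothetical.

```python
def misclassification_cost(leaf_expected_errors, n_examples, error_penalty):
    """Expected misclassification cost per example, in $.

    leaf_expected_errors: EE value of each leaf, taken from the text.
    n_examples: total number of training examples reaching the leaves.
    error_penalty: cost of one error (assumes FP and FN cost the same).
    """
    return sum(leaf_expected_errors) * error_penalty / n_examples

# T1: four leaves, each with EE(100, 5, 0.25) = 7.3.
mcost_t1 = misclassification_cost([7.3] * 4, 400, 100)

# T2: two leaves with EE(50, 0, 0.25) = 1.4, two with EE(150, 50, 0.25) = 54.1.
mcost_t2 = misclassification_cost([1.4, 54.1, 1.4, 54.1], 400, 100)

print(round(mcost_t1, 2), round(mcost_t2, 2))
```

From the rounded EE values the second result comes out as 27.75; the text reports 27.7$, presumably because it was derived from unrounded expected errors.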

¹ In this example we set cf to 0.25, as in C4.5. In Section 4.5 we discuss how to tune cf.
