Anytime Algorithms for Learning Anytime Classifiers - Saher ... - Technion


Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008

In a problem with nonuniform misclassification costs, mc should be replaced by the cost of the actual errors the leaf is expected to make. These errors are obviously unknown to the learner. One solution is to estimate each error type separately, using confidence intervals for the multinomial distribution, and multiply each estimate by the associated cost:

$$\widehat{mcost}(l) = \sum_{i \neq c} U^{mul}_{cf}\left(m_l, m_l^i, |C|\right) \cdot M_{c,i}.$$
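As a hedged sketch (not the thesis implementation), the pessimistic estimate above could be computed as follows. A simple normal-approximation upper confidence bound stands in for the thesis's multinomial bound $U^{mul}_{cf}$; the function names, the cost matrix `M` (with `M[c][i]` the penalty for labeling a class-`i` instance as `c`), and the `z` parameter are all illustrative assumptions:

```python
from math import sqrt

def upper_bound(m_l, m_i, z=1.96):
    """Normal-approximation upper confidence bound on the proportion of
    class-i examples in a leaf of m_l examples (a stand-in for the
    multinomial bound U^mul_cf used in the text)."""
    p = m_i / m_l
    return min(1.0, p + z * sqrt(p * (1 - p) / m_l))

def pessimistic_mcost(counts, c, M, z=1.96):
    """Sum, over every class i != c, of the upper-bounded error
    proportion times the associated misclassification cost M[c][i]."""
    m_l = sum(counts)
    return sum(upper_bound(m_l, counts[i], z) * M[c][i]
               for i in range(len(counts)) if i != c)
```

Because every class receives its own upper bound, the sum of the bounds can far exceed the leaf's actual error mass, which is exactly the over-pessimism the text notes for problems with many classes.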

Such an approach, however, would result in an overly pessimistic approximation, mainly when there are many classes. Alternatively, we compute the expected error as in the uniform case and propose replacing mc by the weighted average of the penalty for classifying an instance as c while it belongs to another class. The weights are derived from the proportions $m_l^i / (m_l - m_l^c)$ using a generalization of Laplace's law of succession (Good, 1965, chap. 4):

$$\widehat{mcost}(l) = EE\left(m_l,\, m_l - m_l^c,\, cf\right) \cdot \sum_{i \neq c} \frac{m_l^i + 1}{m_l - m_l^c + |C| - 1} \cdot M_{c,i}.$$
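The Laplace-weighted average above can be sketched as follows. This is an illustrative rendering, not the thesis code: the expected-error estimate $EE$ from the uniform-cost case is taken as a parameter since its definition appears earlier in the chapter, and the names `counts`, `M`, and `laplace_mcost` are assumptions:

```python
def laplace_mcost(counts, c, M, expected_error):
    """Expected misclassification cost of a leaf labeled c:
    the expected number of errors (computed as in the uniform-cost case)
    times the Laplace-smoothed average penalty over classes i != c."""
    m_l = sum(counts)            # m_l: examples in the leaf
    errors = m_l - counts[c]     # m_l - m_l^c: examples of other classes
    k = len(counts)              # |C|: number of classes
    avg_penalty = sum((counts[i] + 1) / (errors + k - 1) * M[c][i]
                      for i in range(k) if i != c)
    return expected_error * avg_penalty
```

Note that the $|C| - 1$ smoothed weights $(m_l^i + 1)/(m_l - m_l^c + |C| - 1)$ sum to exactly 1, so the second factor is a proper weighted average of the penalties $M_{c,i}$.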

Note that in a problem with $|C|$ classes, the average is over $|C| - 1$ possible penalties because $M_{c,c} = 0$. Hence, in a problem with two classes $c_1, c_2$, if a leaf is marked as $c_1$, mc would be replaced by $M_{1,2}$. When classifying a new instance, the expected misclassification cost of a tree T built from m examples is the sum of the expected misclassification costs in the leaves divided by m:

$$\widehat{mcost}(T) = \frac{1}{m} \sum_{l \in L} \widehat{mcost}(l),$$

where L is the set of leaves in T. Hence, the expected total cost of T when classifying a single instance is:

$$\widehat{total}(T, E) = \widehat{tcost}(T, E) + \widehat{mcost}(T).$$
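The two tree-level formulas above can be sketched directly; in this hypothetical helper, `leaf_costs` holds the per-leaf estimates $\widehat{mcost}(l)$ and `tcost` stands for the tree's expected test cost $\widehat{tcost}(T, E)$:

```python
def tree_mcost(leaf_costs, m):
    """Expected misclassification cost of a tree T: the sum of the
    leaves' expected costs divided by m, the number of training examples."""
    return sum(leaf_costs) / m

def tree_total(leaf_costs, m, tcost):
    """Expected total cost of classifying a single instance:
    expected test cost plus expected misclassification cost."""
    return tcost + tree_mcost(leaf_costs, m)
```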

An alternative approach that we intend to explore in future work is to estimate the cost of the sampled trees using the cost for a set-aside validation set. This approach is attractive mainly when the training set is large and one can afford setting aside a significant part of it.

4.3 Choosing a Split

Having decided on the sampler and the tree utility function, we are ready to formalize the tree-growing phase in ACT. A tree is built top-down. The procedure for selecting a splitting test at each node is listed in Figure 4.2 and illustrated in Figure 4.3. We give a detailed example of how ACT chooses splits and explain how the split selection procedure is modified for numeric attributes.

