anytime algorithms for learning anytime classifiers saher ... - Technion

More documents

Recommendations

Info

Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008 on cost-sensitive induction, we adopt the model described by Turney (1995). In a problem with |C| different classes, a misclassification cost matrix M is a |C|×|C| matrix whose Mi,j entry defines the penalty of assigning the class ci to an instance that actually belongs to the class cj. Let Θ(h, e) be the set of tests required by the hypothesis h. We denote by cost(θ) the cost of administering the test θ. The testing cost of e in h is therefore tcost(h, e) = � θ∈Θ cost(θ). Note that we use sets notation because tests that appear several times are charged for only once. In addition, the model described by Turney (1995) handles two special test types, namely grouped and delayed tests. Grouped Tests. Some tests share a common cost, for which we would like to charge only once. Typically, tests also has an extra (possibly different) cost. For example, consider a tree path with tests like cholesterol level and glucose level. For both values to be measured, a blood test is needed. Taking blood samples to measure the cholesterol level clearly lowers the cost of measuring the glucose level. Formally, each test possibly belongs to a group. If it’s the first test from the group to be administered, we charge for the full cost. If another test from the same group has already been administered earlier in the decision path, we charge only for the marginal cost. Delayed Tests. Sometimes the outcome of a test cannot be obtained immediately, e.g., lab test results. Such tests, called delayed tests, force us to wait until the outcome is available. Alternatively, Turney (1995) suggests taking into account all possible outcomes: when a delayed test is encountered, all the tests in the subtree under it are administered and charged for. Once the result of the delayed test is available, the prediction is at hand. One problem with this setup is that it follows all paths in the subtree, regardless of the outcome of non-delayed costs. Moreover, it is not possible to distinguish between the delays different tests impose: for example, one result might be ready after several minutes while another only after a few days. In this work we do not handle delayed tests, but we do explain how our learner can be modified to take them into account. The total cost of classification is the sum of the miscalculation cost and the testing cost. Cost-sensitive classifiers, which aim to reduce the total cost of classification, require Mi,j and cost(θ) to be given in the same currency. A more general model would require a utility function that combines both types, for example, by eliciting user preferences (Lenert & Soetikno, 1997). An important property of the aforementioned cost-sensitive setup is that maximizing generalization accuracy, which is the goal of most existing learners, can be viewed as a special case: when accuracy is the only objective, test costs are ignored and misclassification cost is uniform. 14
Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008 Learning Classification $$ $$$$ $$ $$$$ Training Data - + - + - - - + + - - - - + + - + + - contract learning pre-contract classification contract classification New case ? Anytime Learner Anycost Classifier classification $$$$ label interruptible learning interruptible Figure 2.1: Overview of our proposed framework. The training examples are provided to an anytime learner that can trade computation speed for better classifiers. The learning time, ρ l , can be preallocated (contract learning), or not (interruptible learning). Similarly, the maximal classification budget, ρ c , may be provided to the learner (pre-contract classification), become available right before classification (contract classification), or be unknown, in which case the classifier should utilize additional resources until interrupted (interruptible classification). While minimizing the total cost of classification is appropriate for many tasks, many environments involve strict constraints on the testing costs that a learner can use. To operate in such environments, the classifier needs to do its best without exceeding the bounds. Therefore, classifiers that focus on reducing the total cost may be inappropriate. Like the bound on the learning process, the bounds on classification costs can be preallocated and given to the learner (precontract classification), known to the classifier but not to the learner (contract classification), or determined during classification (interruptible classification). Note that unlike cost-sesitive learners that attempt to minimize the total cost of classification, bounded classification does not require MC and ρ c to be provided in the same currency. This allows us to operate in environments where it is not possible to convert from one scale to another, for example time to money or money to risk. Figure 2.1 gives an overview of the different possible scenarios for learning and classification. More formally, let E be the set of training examples, A be the set of attributes, AC the attribute costs, and M be the misclassification cost matrix, and h the learned hypothesis. Further, let ρ l be the allocation for learning resources and ρ c denote the limit on classification resources for classifying a case 15 $$
Page 1 and 2: Technion - Computer Science Departm
Page 29: Technion - Computer Science Departm
Page 81 and 82:
Technion - Computer Science Departm
Page 83 and 84:
Page 85 and 86:
Page 87 and 88:
Page 89 and 90:
Page 91 and 92:
Page 93 and 94:
Page 95 and 96:
Page 97 and 98:
Page 99 and 100:
Page 101 and 102:
Page 103 and 104:
Page 105 and 106:
Page 107 and 108:
Page 109 and 110:
Page 111 and 112:
Page 113 and 114:
Page 115 and 116:
Page 117 and 118:
Page 119 and 120:
Page 121 and 122:
Page 123 and 124:
Page 125 and 126:
Page 127 and 128:
Page 129 and 130:
Page 131 and 132:
Page 133 and 134:
Page 135 and 136:
Page 137 and 138:
Page 139 and 140:
Page 141 and 142:
Page 143 and 144:
Page 145 and 146:
Page 147 and 148:
Page 149 and 150:
Page 151 and 152:
Page 153 and 154:
Page 155 and 156:
Page 157 and 158:
Page 159 and 160:
Page 161 and 162:
Page 163 and 164:
Page 165 and 166:
Page 167 and 168:
Page 169 and 170:
Page 171 and 172:
Page 173 and 174:
Page 175 and 176:
Page 177 and 178:
Page 179 and 180:
Page 181 and 182:
Page 183 and 184:
Page 185 and 186:
Page 187 and 188:
Page 189 and 190:
Page 191 and 192:
show all

anytime algorithms for learning anytime classifiers saher ... - Technion

Create successful ePaper yourself

Delete template?

Save as template?