Anytime Algorithms for Learning Anytime Classifiers - Saher ... - Technion
Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008
well and no lookahead is needed. However, for more difficult concepts such as XOR, the greedy approach is likely to fail. A third problem is that these methods are limited to a single objective: they cannot be adapted to different learning setups or to other objectives, such as minimizing testing and misclassification costs.
We therefore propose an alternative approach for looking ahead. For each candidate split, we sample the space of subtrees under it and estimate the utility of the sampled trees. Because we evaluate entire trees, different utility functions can be used, depending on the actual cost scheme. The split with the best tree in its sample is then selected.
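The sampling idea can be sketched as follows. This is a minimal illustration, not the thesis's actual algorithm: trees over binary attributes are encoded as `(attribute, {value: child})` tuples, subtrees under each candidate split are grown at random, and the split whose sample contains the smallest consistent tree wins. On an XOR concept with an irrelevant third attribute, greedy gain cannot distinguish the attributes, but the sampled tree sizes can:

```python
import random

def is_pure(examples):
    """True if all (features, label) pairs share one label."""
    return len({y for _, y in examples}) <= 1

def majority(examples):
    """Most frequent label among the examples."""
    labels = [y for _, y in examples]
    return max(set(labels), key=labels.count)

def random_tree(examples, attrs, rng):
    """Grow one random consistent tree over binary attributes.
    Returns (tree, number_of_internal_nodes)."""
    if is_pure(examples) or not attrs:
        return majority(examples), 0
    a = rng.choice(attrs)
    rest = [b for b in attrs if b != a]
    children, size = {}, 1
    for v in (0, 1):
        subset = [(x, y) for x, y in examples if x[a] == v]
        if subset:
            children[v], s = random_tree(subset, rest, rng)
            size += s
        else:
            children[v] = majority(examples)  # empty branch: parent majority
    return (a, children), size

def choose_split(examples, attrs, sample_size, rng):
    """Evaluate each candidate split by sampling whole subtrees beneath
    it; select the split whose sample contains the smallest tree."""
    best_attr, best_size = None, float("inf")
    for a in attrs:
        rest = [b for b in attrs if b != a]
        for _ in range(sample_size):
            size = 1
            for v in (0, 1):
                subset = [(x, y) for x, y in examples if x[a] == v]
                if subset:
                    size += random_tree(subset, rest, rng)[1]
            if size < best_size:
                best_attr, best_size = a, size
    return best_attr
```

On the 3-bit dataset with label `x0 XOR x1` and an irrelevant `x2`, any tree rooted at `x2` still needs the full XOR in both branches, so `choose_split` always returns one of the two relevant attributes.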
In the cost-insensitive setup, our goal is to induce small and accurate trees. Following Occam's razor, we bias the sample towards small consistent trees and evaluate each sampled tree by its size. To avoid overfitting the training examples, we apply a post-pruning phase, as in C4.5.
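One way to realize such a bias (a hedged illustration, not necessarily the thesis's sampler) is to choose each split attribute at random with probability proportional to its information gain, so sampled trees lean towards small consistent ones without being deterministically greedy:

```python
import math
import random

def entropy(labels):
    """Shannon entropy of a multiset of labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n)
                for c in (labels.count(y) for y in set(labels)))

def info_gain(examples, attr):
    """Information gain of a binary attribute on (features, label) pairs."""
    labels = [y for _, y in examples]
    remainder = 0.0
    for v in (0, 1):
        sub = [y for x, y in examples if x[attr] == v]
        if sub:
            remainder += len(sub) / len(examples) * entropy(sub)
    return entropy(labels) - remainder

def gain_biased_attribute(examples, attrs, rng):
    """Pick a split attribute with probability proportional to its gain
    (plus a tiny epsilon so zero-gain attributes remain reachable)."""
    weights = [info_gain(examples, a) + 1e-9 for a in attrs]
    return rng.choices(attrs, weights=weights, k=1)[0]
```

A high-gain attribute is then chosen almost always, while low-gain attributes are still sampled occasionally, which is what keeps the sample diverse.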
When our objective is to minimize the total cost, we bias the sample towards low-cost trees and evaluate the sampled trees by their expected total cost. The total cost of a tree is estimated from the average cost of classifying the training examples with the tree and from the tree's expected error. In cost-insensitive environments, the main goal of pruning is to simplify the tree in order to avoid overfitting the training data: a subtree is pruned if the resulting tree is expected to yield a lower error. When test costs are taken into account, pruning has another important role: reducing the test costs of the tree. Keeping a subtree is worthwhile only if its expected reduction in misclassification costs is larger than the cost of the tests in that subtree. Therefore, we designed a novel pruning approach based on the expected total cost of a tree.
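As a concrete, simplified reading of this estimate: the sketch below encodes a tree as `(attribute, {value: child})`, charges a per-attribute test cost along each classification path, and uses the raw training error as a stand-in for the expected error (the thesis uses a proper expected-error estimate). The tree encoding and cost names are assumptions for illustration. A subtree is replaced by a majority leaf whenever doing so lowers the expected total cost:

```python
def classify(tree, x):
    """Descend to a leaf; internal nodes are (attr, {value: child})."""
    while isinstance(tree, tuple):
        attr, children = tree
        tree = children[x[attr]]
    return tree

def path_test_cost(tree, x, test_costs):
    """Sum the costs of the tests applied when classifying x."""
    cost = 0.0
    while isinstance(tree, tuple):
        attr, children = tree
        cost += test_costs[attr]
        tree = children[x[attr]]
    return cost

def expected_total_cost(tree, examples, test_costs, mc_cost):
    """Average test cost plus error rate times misclassification cost,
    both estimated on the training examples (a plug-in estimate)."""
    test = sum(path_test_cost(tree, x, test_costs) for x, _ in examples)
    err = sum(classify(tree, x) != y for x, y in examples)
    n = len(examples)
    return test / n + (err / n) * mc_cost

def prune_to_leaf_if_cheaper(tree, examples, test_costs, mc_cost):
    """Cost-based pruning at the root: keep the subtree only if its tests
    buy a larger reduction in misclassification cost than they cost."""
    labels = [y for _, y in examples]
    leaf = max(set(labels), key=labels.count)
    keep = expected_total_cost(tree, examples, test_costs, mc_cost)
    cut = expected_total_cost(leaf, examples, test_costs, mc_cost)
    return tree if keep < cut else leaf
```

With an expensive misclassification the test pays for itself and the subtree survives; with a cheap one the same subtree is pruned, which is exactly the trade-off described above.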
For scenarios that constrain the testing costs, we developed a novel top-down approach to exploiting the available testing resources. When the bounds are known to the learner, a tree that fits the budget is built. In other cases, a repertoire of trees is formed. If the quota is known before classification, the single tree that best fits the budget is picked. Otherwise, the trees are traversed until the resources are exhausted.
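The two repertoire cases can be sketched as follows; the entry layout `(expected_cost, expected_accuracy, predict)` and its precomputed statistics are hypothetical placeholders, not the thesis's data structures:

```python
def pick_tree(repertoire, budget):
    """Known quota: return the entry with the highest expected accuracy
    among those whose expected cost fits the budget (None if none fit)."""
    feasible = [entry for entry in repertoire if entry[0] <= budget]
    return max(feasible, key=lambda entry: entry[1], default=None)

def classify_until_exhausted(repertoire, x, budget):
    """Unknown quota: run the trees cheapest-first, spending the budget
    as we go, and keep the prediction of the last tree that still fit."""
    answer = None
    for cost, _accuracy, predict in sorted(repertoire, key=lambda e: e[0]):
        if cost > budget:
            break
        budget -= cost
        answer = predict(x)
    return answer
```

The cheapest-first traversal guarantees some answer is available as soon as the least expensive tree has run, and the answer only improves while resources remain.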
Our anytime approach can benefit from extra learning time by creating larger samples: the larger the samples, the more accurate the attribute evaluation. There are two main classes of anytime algorithms, namely contract and interruptible (Russell & Zilberstein, 1996). A contract algorithm is one that receives its resource allocation as a parameter. An interruptible algorithm is one whose resource allocation is not given in advance, and which must therefore be prepared to be interrupted at any moment. While the assumption of preallocated resources holds for many induction tasks, in many other real-life applications it is not possible to allocate the resources a priori. Therefore, in our work we are interested in both contract and interruptible decision tree learners. In the contract setup, the sample size is predetermined according to the available resources. In the interruptible