
Last, Maimon, and Eberbach (2001) introduced an interruptible anytime algorithm for feature selection. Their proposed method performs a hill-climbing forward search in the space of feature subsets, where attributes are added to the selected subset one at a time and each candidate is evaluated using a model constructed from the previously selected attributes. The algorithm can be stopped at any moment, returning the set of currently selected features.
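To make the interruptible behavior concrete, the following is a minimal sketch of such a hill-climbing forward selection loop. The merit function and the stopping callback are illustrative placeholders, not the actual components of Last et al.'s method.

```python
import time

def anytime_forward_selection(features, merit, should_stop):
    """Greedy forward search over feature subsets, interruptible at any time.

    features    -- iterable of candidate feature names
    merit       -- merit(subset) -> float; stand-in for a model-based
                   evaluation of a feature subset (assumed, not Last et al.'s)
    should_stop -- zero-argument callable polled between iterations
    """
    selected = []
    remaining = list(features)
    while remaining and not should_stop():
        # Score each candidate when added to the currently selected subset.
        best = max(remaining, key=lambda f: merit(selected + [f]))
        if merit(selected + [best]) <= merit(selected):
            break  # hill climbing: no candidate improves the subset
        selected.append(best)
        remaining.remove(best)
    # Whenever interrupted, the currently selected subset is a valid answer.
    return selected

if __name__ == "__main__":
    deadline = time.time() + 0.1  # allow 100 ms of search

    # Toy merit: reward subsets containing "a" and "b", penalize size.
    def toy_merit(subset):
        return len(set(subset) & {"a", "b"}) - 0.01 * len(subset)

    print(anytime_forward_selection(
        ["a", "b", "c", "d"], toy_merit,
        should_stop=lambda: time.time() > deadline))  # -> ['a', 'b']
```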

All these methods are orthogonal to our induction methods and can complement them. An interesting research direction is to determine resource distribution between the various learning stages.

Costs are also involved in the learning phase, during example acquisition and during model learning. The problem of budgeted learning has been studied by Lizotte et al. (2003): there is a cost associated with obtaining each attribute value of a training example, and the task is to determine which attribute values to purchase given a budget.
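As a rough illustration of this setting, the sketch below simulates such an acquisition loop with a hypothetical greedy benefit-per-cost policy; the policy, the scoring function, and the oracle are assumptions for illustration, not the algorithms of Lizotte et al.

```python
import random

def budgeted_acquisition(examples, costs, budget, score, oracle):
    """Buy attribute values one at a time until the budget is exhausted.

    examples -- list of dicts mapping attribute -> value, or None if unknown
    costs    -- dict mapping attribute -> acquisition cost
    budget   -- total amount available for purchases
    score    -- score(attr) -> float, expected benefit of one more value
                of that attribute (a hypothetical placeholder)
    oracle   -- oracle(example, attr) -> the purchased attribute value
    """
    spent = 0.0
    while True:
        # Attributes that are still missing somewhere and still affordable.
        affordable = [a for a in costs
                      if spent + costs[a] <= budget
                      and any(ex[a] is None for ex in examples)]
        if not affordable:
            break
        # Greedy policy: best expected benefit per unit cost.
        a = max(affordable, key=lambda x: score(x) / costs[x])
        ex = random.choice([e for e in examples if e[a] is None])
        ex[a] = oracle(ex, a)  # pay for and record the value
        spent += costs[a]
    return spent

if __name__ == "__main__":
    examples = [{"f1": None, "f2": None} for _ in range(5)]
    spent = budgeted_acquisition(
        examples, costs={"f1": 1.0, "f2": 3.0}, budget=6.0,
        score=lambda a: 1.0,                  # uniform benefit (toy)
        oracle=lambda ex, a: random.random()) # simulated attribute value
    print(spent, examples)
```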

A related problem is active feature-value acquisition. In this setup one tries to reduce the cost of improving accuracy by identifying highly informative instances. Melville et al. (2004) introduced an approach in which instances are selected for acquisition based on the accuracy of the current model and the model's confidence in the prediction.
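The sketch below shows one way such a criterion could look: incomplete instances are ranked so that misclassified ones are acquired first, followed by those the model is least confident about. The scoring rule is an assumption for illustration and not Melville et al.'s exact method.

```python
def rank_for_acquisition(instances, labels, predict_proba):
    """Order instance indices by acquisition priority.

    predict_proba(x) -> dict mapping label -> probability, produced by
    the current model (a hypothetical interface, assumed here).
    """
    def priority(i):
        probs = predict_proba(instances[i])
        predicted = max(probs, key=probs.get)
        if predicted != labels[i]:
            return 2.0                     # misclassified: acquire first
        return 1.0 - probs[predicted]      # correct but uncertain: next
    return sorted(range(len(instances)), key=priority, reverse=True)

if __name__ == "__main__":
    xs, ys = ["x0", "x1", "x2"], ["+", "-", "+"]
    # Toy model: fixed class probabilities per instance.
    table = {"x0": {"+": 0.90, "-": 0.10},
             "x1": {"+": 0.80, "-": 0.20},   # misclassifies x1
             "x2": {"+": 0.55, "-": 0.45}}   # correct but unsure on x2
    print(rank_for_acquisition(xs, ys, lambda x: table[x]))  # -> [1, 2, 0]
```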

The work on optimal fixed-depth trees (Greiner et al., 2002) has been extended to the case where tests have costs, both during learning and during classification. Kapoor and Greiner (2005) presented a budgeted learning framework that operates under a strict budget for feature-value acquisition during learning and produces a classifier with a limited classification budget. Provost, Melville, and Saar-Tsechansky (2007) discussed the challenges that electronic commerce environments bring to data acquisition during learning and classification. They described several settings and presented a unified framework that integrates acquisition of different types, with any cost structure and any predictive modeling objective.

Madani, Lizotte, and Greiner (2004) studied the setup of budgeted active model selection. In this setup the learner can use a fixed budget of model probes to identify which of a given set of possible models has the highest expected accuracy. In each probe a model is evaluated on a randomly drawn instance. The goal is a policy that sequentially determines which model to probe next, based on the information observed so far. Madani et al. formalized the task and showed that it is NP-hard in general. They also studied several algorithms for budgeted active model selection and compared them empirically.
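The sketch below illustrates the probing setup with a simple round-robin policy over simulated models; both the policy and the simulation are illustrative only, not the algorithms analyzed by Madani et al.

```python
import random

def select_model(probe, n_models, budget):
    """Spend `budget` probes, then return the index of the model with
    the highest observed accuracy. probe(i) -> True/False is one paid
    evaluation of model i on a random instance."""
    successes = [0] * n_models
    trials = [0] * n_models
    for t in range(budget):
        i = t % n_models  # round-robin: probe the models in turn
        successes[i] += probe(i)
        trials[i] += 1
    return max(range(n_models),
               key=lambda i: successes[i] / trials[i] if trials[i] else 0.0)

if __name__ == "__main__":
    hidden_acc = [0.6, 0.8, 0.7]  # hidden accuracies (simulation only)
    best = select_model(lambda i: random.random() < hidden_acc[i],
                        n_models=3, budget=60)
    print(best)  # usually prints 1, the model with the highest accuracy
```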
