
Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008

The SVM algorithm usually depends on several parameters (kernel parameters, for example). Several works, such as (Chapelle, Vapnik, Bousquet, & Mukherjee, 2002), proposed iterative methods for the automatic tuning of SVM parameters. These iterative methods can exploit additional time resources for better tuning.
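To make the time/quality trade-off concrete, here is a minimal sketch of an anytime, coarse-to-fine search over two SVM-style hyperparameters (C, gamma). The objective `validation_error` is a synthetic stand-in for cross-validation error, and the halving refinement schedule is illustrative — this is not the gradient-based method of Chapelle et al. (2002).

```python
import itertools

def validation_error(C, gamma):
    # Synthetic error surface with a minimum near (C=10, gamma=0.1);
    # in practice this would be a cross-validation estimate.
    return (C - 10.0) ** 2 / 100.0 + (gamma - 0.1) ** 2 * 50.0

def anytime_tune(budget_evals):
    """Refine a local grid around the best point found so far.
    A larger evaluation budget yields a finer (never worse) search."""
    best = (1.0, 1.0)
    best_err = validation_error(*best)
    span_C, span_g = 16.0, 1.0
    evals = 0
    while evals < budget_evals:
        C0, g0 = best
        for dc, dg in itertools.product((-1, 0, 1), repeat=2):
            C = max(C0 + dc * span_C, 1e-3)
            g = max(g0 + dg * span_g, 1e-4)
            err = validation_error(C, g)
            evals += 1
            if err < best_err:
                best, best_err = (C, g), err
        span_C /= 2.0   # halve the search radius: finer tuning each pass
        span_g /= 2.0
    return best, best_err
```

Because the best error is monotonically non-increasing in the number of evaluations, interrupting the search early still yields a usable parameter setting — the defining property of an anytime tuner.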

A well-studied alternative to inductive learning is the theory refinement paradigm. In a theory refinement system, we first acquire a domain theory, for instance by querying experts, and then revise the obtained set of rules in an attempt to make it consistent with the training data. Opitz (1995) introduced an anytime approach for theory refinement. This approach starts by generating a neural network from a set of rules that describe what is currently known about the domain. The network then uses the training data and the additional time resources to try to improve the resulting hypothesis.

6.3 Cost-sensitive Classification

Cost-sensitive trees have been the subject of many research efforts. Several works proposed learning algorithms that consider different misclassification costs (Breiman et al., 1984; Pazzani, Merz, Murphy, Ali, Hume, & Brunk, 1994; Provost & Buchanan, 1995; Bradford, Kunz, Kohavi, Brunk, & Brodley, 1998; Domingos, 1999b; Drummond & Holte, 2000; Elkan, 2001; Zadrozny, Langford, & Abe, 2003; Lachiche & Flach, 2003; Abe, Zadrozny, & Langford, 2004; Vadera, 2005; Margineantu, 2005; Zhu, Wu, Khoshgoftaar, & Yong, 2007; Sheng & Ling, 2007b). These methods, however, do not consider test costs and hence are appropriate mainly for domains where test costs are not a constraint. Other authors designed tree learners that take test costs into account, such as IDX (Norton, 1989), CSID3 (Tan & Schlimmer, 1989), and EG2 (Nunez, 1991). These methods, however, do not consider misclassification costs.
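As a concrete illustration of a test-cost-sensitive split criterion, the sketch below implements an Information Cost Function in the style commonly attributed to EG2: information gain rewarded, test cost penalized, with an exponent w in [0, 1] controlling the penalty. This is a hedged reconstruction from the secondary literature, not verified against Nunez's original formulation; the example data and attribute names are illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(examples, attr):
    """Information gain of splitting the examples on attribute attr."""
    labels = [e['label'] for e in examples]
    groups = {}
    for e in examples:
        groups.setdefault(e[attr], []).append(e['label'])
    remainder = sum(len(g) / len(examples) * entropy(g)
                    for g in groups.values())
    return entropy(labels) - remainder

def icf(examples, attr, cost, w=1.0):
    """EG2-style Information Cost Function: higher gain and lower test
    cost score better; w=0 ignores cost entirely."""
    gain = info_gain(examples, attr)
    return (2 ** gain - 1) / ((cost + 1) ** w)
```

With two equally informative attributes, the cheaper one receives the higher score — exactly the bias toward low test costs that, as the text notes, comes without any regard for misclassification costs.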

Decision Trees with Minimal Cost (DTMC), a greedy method that attempts to minimize both types of costs simultaneously, was recently introduced (Ling et al., 2004; Sheng et al., 2006). A tree is built top-down, using a greedy split criterion that takes into account both testing and misclassification costs. The basic idea is to estimate the immediate reduction in total cost after each split, and to prefer the split with the maximal reduction. If no split reduces the cost on the training data, the induction process is stopped.
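The split criterion described above can be sketched as follows. This is a minimal illustration of the greedy cost-reduction idea, not the exact DTMC formulation of Ling et al. (2004): the dictionary-based example format and cost structures are assumptions for the sketch.

```python
def leaf_misclassification_cost(examples, mc_cost):
    """Cost of labeling all examples with the cheapest single prediction.
    mc_cost[(true_label, predicted)] is the misclassification cost."""
    labels = {e['label'] for e in examples}
    return min(sum(mc_cost[(e['label'], pred)] for e in examples)
               for pred in labels)

def split_cost(examples, attr, test_cost, mc_cost):
    """Total cost after splitting on attr: every example pays the test
    cost, then each child is priced as a majority-decision leaf."""
    partitions = {}
    for e in examples:
        partitions.setdefault(e[attr], []).append(e)
    return (len(examples) * test_cost[attr] +
            sum(leaf_misclassification_cost(p, mc_cost)
                for p in partitions.values()))

def dtmc_choose_split(examples, attrs, test_cost, mc_cost):
    """Greedy choice: the attribute with the maximal immediate cost
    reduction, or None if no split beats leaving the node as a leaf."""
    best_attr = None
    best_cost = leaf_misclassification_cost(examples, mc_cost)
    for a in attrs:
        c = split_cost(examples, a, test_cost, mc_cost)
        if c < best_cost:   # split only if total cost strictly decreases
            best_attr, best_cost = a, c
    return best_attr
```

Note that the same perfectly predictive attribute is chosen when its test is cheap but rejected when its test cost exceeds the misclassification cost it saves — the greedy stopping behavior that underlies the local-minimum problem discussed next.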

Although efficient, the DTMC approach can be trapped in a local minimum and produce trees that are not globally optimal. For example, consider the concept and costs described in Figure 6.2 (left). There are 10 attributes, of which only a9 and a10 are relevant. The cost of a9 and a10, however, is significantly higher than that of the others. Such high costs may hide the usefulness of a9 and a10, and mislead the learner into repeatedly splitting on a1-8, which would result in
