18.11.2012 Views

anytime algorithms for learning anytime classifiers saher ... - Technion


Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008

Chapter 3

Contract Anytime Learning of Accurate Trees

Assume that a medical center has decided to use medical records of previous patients in order to build an automatic diagnostic system for a particular disease. The center applies the C4.5 algorithm to thousands of records, and after a few seconds receives a decision tree. During the coming months, or even years, the same induced decision tree will be used to predict whether patients have or do not have the disease. Obviously, the medical center is willing to wait much longer to obtain a better tree, either more accurate or more comprehensible.

Consider also a planning agent that has to learn a decision tree from a given set of examples, while the time at which the model will be needed by the agent is not known in advance. In this case, the agent would like the learning procedure to learn the best tree it can until it is interrupted and queried for a solution.

In both of the above scenarios, the learning algorithm is expected to exploit additional time allocation to produce a better tree. In the first case, the additional time is allocated in advance. In the second, it is not. Similar resource-bounded reasoning situations may occur in many real-life applications such as game playing, planning, stock trading, and e-mail filtering. In this chapter, we introduce a framework for exploiting extra time, preallocated or not, in order to learn smaller and more accurate trees (Esmeir & Markovitch, 2007b).

3.1 Top-down Induction of Decision Trees<br />

TDIDT (top-down induction of decision trees) methods start from the entire set of training examples, partition it into subsets by testing the value of an attribute, and then recursively call the induction algorithm for each subset. Figure 3.1 formalizes the basic algorithm for TDIDT. We focus first on consistent trees for
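
The recursive scheme described above (split on an attribute, then recurse on each subset) can be illustrated with a minimal Python sketch. This is not a reproduction of Figure 3.1: the function names `tdidt`, `information_gain`, and `classify`, the dictionary-based tree representation, and the use of information gain as the default splitting criterion are all assumptions made for illustration, since the passage does not say how the attribute is chosen.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(examples, attribute):
    """Entropy reduction obtained by splitting on `attribute` (assumed criterion)."""
    labels = [label for _, label in examples]
    remainder = 0.0
    for value in {features[attribute] for features, _ in examples}:
        subset = [label for features, label in examples
                  if features[attribute] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return entropy(labels) - remainder


def tdidt(examples, attributes, choose=information_gain):
    """Build a tree top-down from (features_dict, label) pairs.

    Leaves are plain labels; internal nodes are dicts (hypothetical layout).
    """
    labels = [label for _, label in examples]
    majority = Counter(labels).most_common(1)[0][0]
    # Stop when the subset is pure or no attribute is left to test.
    if len(set(labels)) == 1 or not attributes:
        return majority
    # Pick the attribute to test at this node.
    best = max(attributes, key=lambda a: choose(examples, a))
    remaining = [a for a in attributes if a != best]
    children = {}
    # Partition the examples by the value of `best` and recurse on each subset.
    for value in {features[best] for features, _ in examples}:
        subset = [(f, l) for f, l in examples if f[best] == value]
        children[value] = tdidt(subset, remaining, choose)
    return {"attribute": best, "children": children, "default": majority}


def classify(tree, features):
    """Follow tests from the root; fall back to the node's majority label
    when an attribute value was not seen during training."""
    while isinstance(tree, dict):
        tree = tree["children"].get(features[tree["attribute"]], tree["default"])
    return tree
```

Passing a different `choose` function swaps the splitting criterion without touching the recursion, which is the part the text above actually describes.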
