anytime algorithms for learning anytime classifiers saher ... - Technion
Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008
Chapter 3
Contract Anytime Learning of Accurate Trees
Assume that a medical center has decided to use medical records of previous patients in order to build an automatic diagnostic system for a particular disease. The center applies the C4.5 algorithm to thousands of records and, after a few seconds, receives a decision tree. During the coming months, or even years, the same induced tree will be used to predict whether patients have the disease. Obviously, the medical center is willing to wait much longer to obtain a better tree, either more accurate or more comprehensible.
Consider also a planning agent that has to learn a decision tree from a given set of examples, where the time at which the model will be needed by the agent is not known in advance. In this case, the agent would like the learning procedure to learn the best tree it can until it is interrupted and queried for a solution.
In both of the above scenarios, the learning algorithm is expected to exploit additional time allocation to produce a better tree. In the first case, the additional time is allocated in advance; in the second, it is not. Similar resource-bounded reasoning situations occur in many real-life applications such as game playing, planning, stock trading, and e-mail filtering. In this chapter, we introduce a framework for exploiting extra time, preallocated or not, in order to learn smaller and more accurate trees (Esmeir & Markovitch, 2007b).
3.1 Top-down Induction of Decision Trees
TDIDT (top-down induction of decision trees) methods start from the entire set of training examples, partition it into subsets by testing the value of an attribute, and then recursively call the induction algorithm for each subset. Figure 3.1 formalizes the basic TDIDT algorithm. We focus first on consistent trees for
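The recursive partitioning described above can be sketched in Python. This is a minimal illustration, assuming a greedy information-gain split criterion (as used by C4.5) and discrete attribute values; the function names and the dictionary-based tree representation are illustrative assumptions, not the notation of Figure 3.1.

```python
from collections import Counter
import math


def entropy(labels):
    """Shannon entropy of a label multiset, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def info_gain(examples, labels, attr):
    """Reduction in entropy obtained by splitting on attr."""
    n = len(examples)
    by_value = {}
    for ex, y in zip(examples, labels):
        by_value.setdefault(ex[attr], []).append(y)
    remainder = sum(len(ys) / n * entropy(ys) for ys in by_value.values())
    return entropy(labels) - remainder


def tdidt(examples, labels, attributes):
    """Top-down induction: pick the best attribute greedily, then recurse.

    examples   -- list of dicts mapping attribute name -> discrete value
    labels     -- class label per example
    attributes -- attribute names still available for splitting
    """
    # Stopping criterion: pure node, or no attributes left -> majority leaf.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: info_gain(examples, labels, a))
    node = {"attr": best, "children": {}}
    # Partition the examples by the chosen attribute's value and recurse.
    for value in {ex[best] for ex in examples}:
        sub = [(ex, y) for ex, y in zip(examples, labels) if ex[best] == value]
        sub_ex, sub_y = zip(*sub)
        node["children"][value] = tdidt(
            list(sub_ex), list(sub_y), [a for a in attributes if a != best]
        )
    return node
```

On a toy dataset where the label equals attribute `a`, the sketch splits once on `a` and attaches two pure leaves, mirroring the greedy, recursive structure of the generic TDIDT procedure.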