18.11.2012 Views

anytime algorithms for learning anytime classifiers saher ... - Technion


Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008

Figure 5.3 (diagram labels): candidate splits with E-MC = $110, E-MC = $250, and E-MC = $170; ρ c = $60.

Figure 5.3: Attribute evaluation in Pre-Contract-TATA. E-MC stands for the expected misclassification cost. For each candidate split, we sample the space of trees under it that fit the remaining budget ($60 in the example) and evaluate the split by the minimal expected misclassification cost in the sample ($110 in the example).

linearly with r, just as it does in ACT (Esmeir & Markovitch, 2007a). When we cannot afford sampling (r = 0), TATA builds the tree using C4.5$.
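The split evaluation described in Figure 5.3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the `Sampler` and `Greedy` interfaces, the attribute names `a1` to `a3`, and the individual sample values in `TOY_SAMPLES` are hypothetical (only the per-split minima of $110, $250, and $170 and the $60 budget come from the figure). Falling back to the greedy choice when no affordable tree is sampled is likewise an assumption.

```python
import random
from typing import Callable, Dict, List, Optional

# Hypothetical interface: draw one tree under `attr` whose test cost fits
# `budget` and return its expected misclassification cost (E-MC), or None
# if no affordable tree was drawn.
Sampler = Callable[[str, float, random.Random], Optional[float]]
# Hypothetical stand-in for the greedy learner used when r = 0.
Greedy = Callable[[List[str], float], str]


def evaluate_split(attr: str, budget: float, r: int,
                   sampler: Sampler, rng: random.Random) -> Optional[float]:
    """Return the minimal E-MC among r sampled trees under `attr`."""
    costs = [c for c in (sampler(attr, budget, rng) for _ in range(r))
             if c is not None]
    return min(costs) if costs else None


def choose_split(candidates: List[str], budget: float, r: int,
                 sampler: Sampler, greedy: Greedy,
                 rng: random.Random) -> str:
    """Pick the candidate whose sample contains the cheapest tree."""
    if r == 0:
        # No sampling can be afforded: defer to the greedy choice.
        return greedy(candidates, budget)
    scores: Dict[str, float] = {}
    for attr in candidates:
        score = evaluate_split(attr, budget, r, sampler, rng)
        if score is not None:
            scores[attr] = score
    if not scores:
        # Assumption: with no affordable sample, fall back to greedy.
        return greedy(candidates, budget)
    return min(scores, key=lambda a: scores[a])


def make_table_sampler(table: Dict[str, List[float]]) -> Sampler:
    """Deterministic toy sampler that replays fixed E-MC values."""
    iters = {a: iter(v) for a, v in table.items()}

    def sampler(attr: str, budget: float,
                rng: random.Random) -> Optional[float]:
        return next(iters[attr], None)

    return sampler


# Illustrative values only; the minima match the E-MC labels in Figure 5.3.
TOY_SAMPLES = {
    "a1": [180.0, 110.0, 140.0],
    "a2": [250.0, 300.0, 260.0],
    "a3": [170.0, 200.0, 190.0],
}
```

With `r = 3` and a budget of 60, `choose_split` selects `a1` in this toy setup, since its sample contains the tree with the lowest E-MC ($110). The number of sampler calls is `r` times the number of candidates, so the work done per split decision grows linearly with `r` in this sketch.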

5.1.4 Interruptible Learning of Pre-contract Classifiers

The algorithm presented in Section 5.1.3 requires r, the sample size, as a parameter. When the learning resources are not predetermined, we would like the learner to utilize extra time until interrupted. In Chapter 3 we presented IIDT, a general framework for Interruptible Induction of Decision Trees, that need not be allocated resources ahead of time. IIDT starts with building a greedy tree. Then, it repeatedly selects a subtree whose reconstruction is expected to yield the highest marginal utility, and rebuilds the subtree with a doubled allocation of resources.
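The control loop of such an interruptible learner might look like the sketch below. It is a simplification under stated assumptions: candidate subtrees are kept in a flat list instead of a nested tree, and `Subtree`, `iidt_loop`, `marginal_utility`, `rebuild`, and `interrupted` are hypothetical names, not taken from the thesis. A subtree still in its greedy form is given allocation 0, and its first rebuild is assumed to use an allocation of 1, which is then doubled on every later rebuild.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Subtree:
    name: str
    cost: float               # toy quality score; lower is better
    last_allocation: int = 0  # 0 marks a subtree still in its greedy form


def iidt_loop(subtrees: List[Subtree],
              marginal_utility: Callable[[Subtree], float],
              rebuild: Callable[[Subtree, int], None],
              interrupted: Callable[[], bool]) -> List[Tuple[str, int]]:
    """Rebuild the most promising subtree until interrupted.

    Returns a log of (subtree name, allocation) pairs, one per rebuild.
    """
    log: List[Tuple[str, int]] = []
    while not interrupted():
        # Select the subtree with the highest expected marginal utility.
        target = max(subtrees, key=marginal_utility)
        # Double its previous allocation (assumed to start at 1).
        allocation = max(1, 2 * target.last_allocation)
        rebuild(target, allocation)
        target.last_allocation = allocation
        log.append((target.name, allocation))
    return log
```

Because `interrupted` is polled before every rebuild, the caller can stop the loop at any iteration boundary and keep the best tree found so far, which is the property that makes the scheme interruptible.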

The same iterative improvement approach can be applied to convert pre-contract-TATA into an interruptible algorithm. The initial greedy tree would be built with C4.5$, and subtree reconstructions would be made using pre-contract-TATA. The marginal utility of constructing a tree would take into account both the expected misclassification cost of the tree and the expected resources required


