anytime algorithms for learning anytime classifiers saher ... - Technion


Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008

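Procedure Contract-TATA-Classify($\Pi$, $e$, $\rho^c$)
  $\rho^{c*} \leftarrow \max\{\rho^{c\prime} \mid \rho^{c\prime} \le \rho^c \wedge \exists T,\ \langle \rho^{c\prime}, T \rangle \in \Pi\}$
  $T^* \leftarrow T \mid \langle \rho^{c*}, T \rangle \in \Pi$
  Return Classify($T^*$, $e$)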

Figure 5.6: Classifying a case using a repertoire in contract-TATA

a case under the contract setup.
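The procedure in Figure 5.6 can be illustrated with a minimal Python sketch. It assumes the repertoire is a list of (budget, tree) pairs and that a `classify` function is supplied by the caller; both names, the representation, and the behavior when no tree fits the budget (raising an error) are assumptions made for illustration and are not specified in the text.

```python
from typing import Any, Callable, List, Tuple


def contract_tata_classify(
    repertoire: List[Tuple[float, Any]],
    example: Any,
    budget: float,
    classify: Callable[[Any, Any], Any],
) -> Any:
    """Classify `example` with the tree built for the largest budget
    that does not exceed the given classification budget."""
    # Keep only the trees whose build-time budget fits the contract.
    affordable = [(b, tree) for b, tree in repertoire if b <= budget]
    if not affordable:
        # Assumption: the figure does not say what happens in this case.
        raise ValueError("no tree in the repertoire fits the budget")
    # Pick the pair with the maximal affordable budget.
    _, best_tree = max(affordable, key=lambda pair: pair[0])
    return classify(best_tree, example)
```

For instance, with trees built for budgets 1, 6, and 16, a contract of 11 would select the tree built for budget 6.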

5.2.2 Learning a Repertoire with Nonuniform Cost Gaps

In repertoires with a uniform cost distribution, every two successive trees are built with maximal cost allocations that differ by $\delta$. In many domains, however, increasing the contract by $\delta$ results in the same tree or in a tree that performs similarly. In such a case, the effort spent building the second tree is wasted; it could have been invested in more interesting cost regions, where increasing the contract is useful. This problem is intensified when $k$, the repertoire size, is small. Consider, for example, a problem with 4 attributes, 3 of which cost \$1 each and 1 of which costs \$13. In this case, $\rho^c_{\min} = 1$ and $\rho^c_{\max} = 16$. Assume that we would like to build a repertoire of size $k = 4$. Because $\delta = (16 - 1)/(4 - 1) = 5$, the four trees will be built with $\rho^c = 1, 6, 11, 16$, respectively. As a result, the 2nd and 3rd trees would be identical because they can test the same set of attributes (all but the expensive one).
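The arithmetic of this example can be checked with a short Python sketch. It assumes that a budget constrains the total cost of the attributes tested together, so two budgets are indistinguishable when they admit exactly the same attribute subsets; the function names and the attribute labels `a1` to `a4` are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def uniform_budgets(rho_min: float, rho_max: float, k: int) -> List[float]:
    """Return k budgets spaced by delta = (rho_max - rho_min) / (k - 1)."""
    delta = (rho_max - rho_min) / (k - 1)
    return [rho_min + i * delta for i in range(k)]


def affordable_subsets(costs: Dict[str, float], budget: float) -> Set[FrozenSet[str]]:
    """Return every attribute subset whose total cost fits the budget."""
    names = sorted(costs)
    return {
        frozenset(subset)
        for r in range(len(names) + 1)
        for subset in combinations(names, r)
        if sum(costs[a] for a in subset) <= budget
    }


# Three attributes cost 1 each and one costs 13, as in the example above.
costs = {"a1": 1, "a2": 1, "a3": 1, "a4": 13}
budgets = uniform_budgets(1, 16, 4)
for b in budgets:
    print(b, len(affordable_subsets(costs, b)))
```

Under this assumption, budgets 6 and 11 admit the same subsets (those built from the three cheap attributes), which is why the second and third trees coincide.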

Another problem with the uniform approach is the requirement that $k$, the repertoire size, be decided in advance. This is feasible when learning resources are predetermined (contract learning) but poses a problem if the learning process itself needs to be interruptible. For example, in many real-world problems learning resources are not preallocated; rather, the learning algorithm is expected to exploit any additional time until interrupted.

To overcome the aforementioned shortcomings, we propose a different approach for choosing the sequence of $\rho^c$ values in a repertoire. Rather than uniform gaps, we propose an iterative improvement solution. We start with a repertoire that consists of two trees: one built with $\rho^c_{\min}$ and the other with $\rho^c_{\max}$. We then repeatedly choose two successive trees, $T_1$ and $T_2$, built with $\rho^c_1$ and $\rho^c_2$ respectively, and add another tree with $\rho^c = 0.5(\rho^c_1 + \rho^c_2)$.
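The iterative scheme just described can be sketched in Python as a loop over budgets. The rule for choosing which pair of successive trees to split is left as a caller-supplied `score` function, because the text does not fix it at this point; the function name `refine_budgets`, its parameters, and the widest-gap scorer used in the usage note are placeholders for illustration, not the criterion proposed in the thesis. Tree construction is omitted; only the budget sequence is tracked.

```python
from typing import Callable, List


def refine_budgets(
    rho_min: float,
    rho_max: float,
    n_new: int,
    score: Callable[[float, float], float],
) -> List[float]:
    """Start from [rho_min, rho_max] and insert n_new midpoints,
    each time splitting the adjacent pair with the highest score."""
    budgets = [rho_min, rho_max]
    for _ in range(n_new):
        # Index of the best-scoring pair of successive budgets
        # (ties go to the earliest pair).
        i = max(
            range(len(budgets) - 1),
            key=lambda j: score(budgets[j], budgets[j + 1]),
        )
        # New budget halfway between the chosen neighbors.
        midpoint = 0.5 * (budgets[i] + budgets[i + 1])
        budgets.insert(i + 1, midpoint)
    return budgets
```

With the placeholder scorer `lambda lo, hi: hi - lo`, calling `refine_budgets(1, 16, 3, ...)` yields the budgets 1, 4.75, 8.5, 12.25, and 16. Because the loop can stop after any iteration and still leave a valid repertoire, the repertoire size need not be fixed in advance.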

Choosing $\rho^c_1$ and $\rho^c_2$ is not trivial. We are interested in maximizing the benefit from adding a new tree. Therefore, $\rho^c_2 - \rho^c_1$ should be large enough to cover a wide range of potential $\rho^c$ values. At the same time, the expected reduction in misclassification costs, if a tree is built in between, should be significant. To estimate the latter we use the expected errors of the built trees. Two successive
