anytime algorithms for learning anytime classifiers saher ... - Technion
Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008
attributes exceeds 35, but even with 60 attributes the algorithm achieves an accuracy of about 82%. The skewing algorithms, on the other hand, are much more sensitive to irrelevant attributes, and their performance degrades drastically as the number of irrelevant attributes grows. When the number of attributes exceeds 35, the skewing algorithms become no better than a random guesser. The consistent advantage of LSID3 is also evident in tree size: the trees produced by ID3 and skewing are significantly larger.

To be fair, it is important to note that LSID3 had a much longer runtime than skewing with its default parameters. However, our previous experiments with parity concepts showed that the performance of skewing does not improve with time, and hence the results are expected to be the same even if skewing were allocated the same amount of time. To verify this, we repeated the experiment for 35 and 60 attributes and allocated skewing the same time as LSID3(r = 5). The results were similar to those reported in Figure 6.1, and no improvement in the performance of skewing was observed.
6.1.2 Other Cost-insensitive Decision-Tree Inducers<br />
Papagelis and Kalles (2001) studied GATree, a learner that uses genetic algorithms for building decision trees. GATree does not follow the top-down scheme. Instead, it starts with a population of random trees and applies a mutation operator that randomly changes a splitting test and a crossover operator that exchanges subtrees. Unlike our approach, GATree is not designed to generate consistent decision trees; it searches the space of all possible trees over a given set of attributes and is therefore not appropriate for applications where a consistent tree is required. Like most genetic algorithms, GATree requires careful parameter tuning, and its performance depends greatly on the chosen setting. Comparing GATree to our algorithm (see Section 3.7.6) shows that, especially for hard concepts, it is much better to invest the resources in careful tuning of a single tree than to perform a genetic search over a large population of decision trees.
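The evolutionary loop described above can be sketched roughly as follows. This is a minimal illustration of the general scheme only: the tree representation, the fitness function, the population size, and the elitist selection rule are our illustrative assumptions, not GATree's actual implementation or parameters.

```python
import copy
import random

# Sketch of a GATree-style genetic search over decision trees
# for boolean attributes and binary class labels.

class Node:
    def __init__(self, attr=None, left=None, right=None, label=None):
        self.attr, self.left, self.right, self.label = attr, left, right, label

    def is_leaf(self):
        return self.label is not None

    def predict(self, x):
        if self.is_leaf():
            return self.label
        return (self.right if x[self.attr] else self.left).predict(x)

def random_tree(n_attrs, depth=2):
    # Random individual: either a leaf or a random split with random subtrees.
    if depth == 0 or random.random() < 0.3:
        return Node(label=random.randint(0, 1))
    return Node(attr=random.randrange(n_attrs),
                left=random_tree(n_attrs, depth - 1),
                right=random_tree(n_attrs, depth - 1))

def mutate(tree, n_attrs):
    # Mutation: randomly change the splitting test of some internal node.
    if tree.is_leaf():
        return
    if random.random() < 0.5:
        tree.attr = random.randrange(n_attrs)
    else:
        mutate(random.choice([tree.left, tree.right]), n_attrs)

def crossover(a, b):
    # Simplified crossover: exchange subtrees between two parents.
    if not a.is_leaf() and not b.is_leaf():
        a.left, b.left = b.left, a.left

def fitness(tree, data):
    # Training-set accuracy (GATree's real fitness also penalizes size).
    return sum(tree.predict(x) == y for x, y in data) / len(data)

def evolve(data, n_attrs, pop_size=20, generations=40):
    pop = [random_tree(n_attrs) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: fitness(t, data), reverse=True)
        next_gen = pop[:pop_size // 2]          # elitist survival
        while len(next_gen) < pop_size:
            a = copy.deepcopy(random.choice(next_gen))
            b = copy.deepcopy(random.choice(next_gen))
            crossover(a, b)                      # exchange subtrees
            mutate(a, n_attrs)                   # perturb a splitting test
            next_gen.append(a)
        pop = next_gen
    return max(pop, key=lambda t: fitness(t, data))
```

Note that nothing in this loop enforces consistency with the training data; the best individual may still misclassify examples, which mirrors the limitation discussed above.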
Utgoff et al. (1997) presented DMTI (Direct Metric Tree Induction), an induction algorithm that chooses an attribute by building a single decision tree under each candidate attribute and evaluating these trees using various measures. Several possible tree measures were examined, and the MDL (Minimum Description Length) measure performed best. DMTI is similar to LSID3(r = 1), but unlike LSID3 it can only use a fixed amount of additional resources and hence cannot serve as an anytime algorithm: when the user can afford more resources than DMTI requires, it provides no means to further improve the learned model. Furthermore, DMTI uses a single greedy lookahead tree for attribute evaluation, while we use a biased sample of the possible lookahead trees.
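The attribute-selection scheme described above can be sketched as follows. The ID3-style greedy subtree builder and the crude MDL surrogate (tree size plus a per-error penalty) are our illustrative assumptions; DMTI's actual metric and implementation differ.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_attr_greedy(data, attrs):
    # ID3-style local choice: the attribute with maximum information gain.
    def gain(a):
        split = {}
        for x, y in data:
            split.setdefault(x[a], []).append(y)
        rem = sum(len(ys) / len(data) * entropy(ys) for ys in split.values())
        return entropy([y for _, y in data]) - rem
    return max(attrs, key=gain)

def greedy_tree(data, attrs):
    labels = [y for _, y in data]
    if len(set(labels)) == 1 or not attrs:
        return ('leaf', Counter(labels).most_common(1)[0][0])
    a = best_attr_greedy(data, attrs)
    rest = [b for b in attrs if b != a]
    parts = {}
    for x, y in data:
        parts.setdefault(x[a], []).append((x, y))
    return ('split', a, {v: greedy_tree(p, rest) for v, p in parts.items()})

def predict(tree, x):
    while tree[0] == 'split':
        _, a, branches = tree
        tree = branches[x[a]]  # assumes the value was seen during training
    return tree[1]

def tree_size(tree):
    return 1 if tree[0] == 'leaf' else 1 + sum(tree_size(b)
                                               for b in tree[2].values())

def mdl_cost(tree, data):
    # Crude MDL surrogate: bits for the model plus bits for the errors.
    errors = sum(predict(tree, x) != y for x, y in data)
    return tree_size(tree) + 2 * errors

def dmti_style_choice(data, attrs):
    # Build one full greedy tree under each candidate root attribute
    # and keep the root whose tree minimizes the MDL-like cost.
    def cost_of_root(a):
        rest = [b for b in attrs if b != a]
        parts = {}
        for x, y in data:
            parts.setdefault(x[a], []).append((x, y))
        tree = ('split', a, {v: greedy_tree(p, rest) for v, p in parts.items()})
        return mdl_cost(tree, data)
    return min(attrs, key=cost_of_root)
```

Because each candidate is evaluated with exactly one greedy lookahead tree, the cost of this scheme is fixed once the data and attribute set are given; this is precisely why it cannot exploit additional time, in contrast to sampling more lookahead trees per candidate as LSID3 does.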
Our experiments with DMTI (as available online) show that while it can solve<br />