
produce consistent models because they output a pruned tree in an attempt to avoid overfitting the data. This is usually done in two phases: first a tree is grown top-down, and then it is pruned. In our second set of experiments, we examine whether taking simplicity as a bias in the first stage is beneficial even when the tree is later pruned. Therefore, we measure the correlation between the size of unpruned trees and their accuracy after pruning.

Table B.2 summarizes the results. The same statistics were measured with a single change: the accuracy was measured after applying error-based pruning (Quinlan, 1993). As in the case of consistent trees, examining the overall correlation between the size of a tree and its accuracy indicates that for most datasets the inverse correlation is statistically significant.
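To make the pruning criterion concrete, the following minimal Python sketch shows the pessimistic error estimate underlying error-based pruning: a subtree is replaced by a leaf whenever the leaf's estimated error is no higher than the combined estimate over its children. The formula is the usual normal-approximation upper bound described for C4.5, with z = 0.69 corresponding to the default 25% confidence level; it is an illustration of the technique, not the exact C4.5 code.

import math

def pessimistic_error(n_errors, n_total, z=0.69):
    # Upper confidence bound on the true error rate of a node that
    # misclassifies n_errors of the n_total training examples it covers.
    # z = 0.69 matches C4.5's default 25% confidence level (assumption:
    # illustrative reconstruction, not Quinlan's original source).
    f = n_errors / n_total
    bound = (f + z * z / (2 * n_total)
             + z * math.sqrt(f * (1 - f) / n_total
                             + z * z / (4 * n_total * n_total)))
    return bound / (1 + z * z / n_total)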

Table B.2 also gives the percentages of pruned leaves. While the trees produced by RTG were aggressively pruned, the percentage of pruned leaves in LSID3 was relatively low. This is due to the stronger support for the decisions made at the leaves of smaller consistent trees.

B.4 Conclusions

Occam’s razor states that given two consistent hypotheses, the simpler one should be preferred. This principle has been the subject of a heated debate, with theoretical and empirical arguments both for and against it. In this work we provide convincing empirical evidence for the validity of Occam’s principle with respect to decision trees. We state Occam’s empirical principle, which is well-defined for any learning problem consisting of a training set and a testing set, and show experimentally that the principle is valid for many known learning problems. Note that our study is purely empirical and does not attempt to reach an indisputable conclusion about Occam’s razor as an epistemological concept.
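For reference, a compact paraphrase of the principle (an assumed restatement, not the verbatim definition given earlier in this appendix): given a training set $E_{\mathrm{train}}$ and a test set $E_{\mathrm{test}}$, Occam's empirical principle holds if, over consistent hypotheses $h$ sampled from the version space,

$$\rho\bigl(|h|,\ \mathrm{acc}(h, E_{\mathrm{test}})\bigr) < 0,$$

where $|h|$ is the size of $h$, $\mathrm{acc}$ is test-set accuracy, and $\rho$ denotes Spearman's rank correlation.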

Our testing methodology uses various sampling techniques to sample the version space and applies Spearman’s correlation test to measure the monotonic association between the size of a tree and its accuracy. Our experiments confirm Occam’s empirical principle and show that the negative correlation between size and accuracy is strong. Although there were several exceptions, we conclude that, in general, simpler trees are likely to be more accurate. Observe that our results do not contradict those reported by Murphy and Pazzani (1994), but complement them: we do not claim that a smaller tree is always more accurate, but show that for many domains smaller trees are likely to be more accurate.
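For concreteness, the following minimal Python sketch shows how this correlation can be computed for a single dataset. The sampler sample_consistent_tree is a hypothetical placeholder for any version-space sampling technique (e.g., RTG or LSID3), and the tree objects are assumed to expose num_leaves() and accuracy(); only scipy.stats.spearmanr is an actual library call.

from scipy.stats import spearmanr

def occam_correlation(sample_consistent_tree, train, test, n_samples=1000):
    # Sample consistent trees from the version space, recording each
    # tree's size and its accuracy on the held-out test set.
    sizes, accuracies = [], []
    for _ in range(n_samples):
        tree = sample_consistent_tree(train)    # hypothetical sampler
        sizes.append(tree.num_leaves())         # assumed tree interface
        accuracies.append(tree.accuracy(test))  # assumed tree interface
    # Spearman's test measures monotonic association; the principle
    # predicts a statistically significant negative rho.
    rho, p_value = spearmanr(sizes, accuracies)
    return rho, p_value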

We view our results as strong empirical evidence for the utility of Occam’s razor in decision tree induction. It is important to note that the datasets used in our study do not necessarily represent all possible learning problems. However, these datasets are frequently used in machine learning research and considered
