anytime algorithms for learning anytime classifiers saher ... - Technion

More documents

Recommendations

Info

Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008 26. Multi-XOR / Multi-AND-OR: These concepts are defined over 11 binary attributes. In both cases the target concept is composed of several subconcepts, where the first two attributes determines which of them is considered. The other 10 attributes are used to form the subconcepts. In the Multi-XOR dataset, each subconcept is an XOR, and in the Multi-AND-OR dataset, each subconcept is either AND or OR. 142
Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008 Appendix B Occam’s Razor Just Got Sharper Occam’s razor, attributed to the 14th-century English logician William of Ockham, is the principle that, given two hypotheses consistent with the observed data, the simpler one should be preferred. This principle has become the basis for many induction algorithms that search for a small hypothesis within the version space (Mitchell, 1982). Several studies attempted to justify Occam’s razor with theoretical and empirical arguments (Blumer et al., 1987; Quinlan & Rivest, 1989; Fayyad & Irani, 1990). But a number of recent works have questioned the utility of Occam’s razor, and provided theoretical and experimental evidence against it. Schaffer (1994) proved that no learning bias can outperform another bias over the space of all possible learning tasks. This looks like theoretical evidence against Occam’s razor. Rao et al. (1995), however, argued against the applicability of this result to real-world problems by questioning the validity of its basic assumption about the uniform distribution of possible learning tasks. Domingos (1999a) argued that the disagreement about the utility of Occam’s razor stems from the two different interpretations given to it: the first is that simplicity is a goal in and of itself, and the second is that simplicity leads to better accuracy. While accepting the first interpretation, Domingos questioned the second one. Webb (1996) presented C4.5X, an extension to C4.5 that uses similarity considerations to further specialize consistent leaves. Webb reported an empirical evaluation which shows that C4.5X has a slight advantage in a few domains and argued that these results discredit Occam’s thesis. Murphy and Pazzani (1994) reported a set of experiments in which all possible consistent trees were produced and their accuracy was tested. Their findings were inconclusive. They found cases where larger trees had, on average, better accuracy. Still, they recommend using Occam’s principle when no additional information about the concept is available. The major limitation of their work 143
Page 1 and 2:
Technion - Computer Science Departm
Page 3 and 4:
Page 5 and 6:
Page 7 and 8:
Page 9 and 10:
Page 11 and 12:
Page 13 and 14:
Page 15 and 16:
Page 17 and 18:
Page 19 and 20:
Page 21 and 22:
Page 23 and 24:
Page 25 and 26:
Page 27 and 28:
Page 29 and 30:
Page 31 and 32:
Page 33 and 34:
Page 35 and 36:
Page 37 and 38:
Page 39 and 40:
Page 41 and 42:
Page 43 and 44:
Page 45 and 46:
Page 47 and 48:
Page 49 and 50:
Page 51 and 52:
Page 53 and 54:
Page 55 and 56:
Page 57 and 58:
Page 59 and 60:
Page 61 and 62:
Page 63 and 64:
Page 65 and 66:
Page 67 and 68:
Page 69 and 70:
Page 71 and 72:
Page 73 and 74:
Page 75 and 76:
Page 77 and 78:
Page 79 and 80:
Page 81 and 82:
Page 83 and 84:
Page 85 and 86:
Page 87 and 88:
Page 89 and 90:
Page 91 and 92:
Page 93 and 94:
Page 95 and 96:
Page 97 and 98:
Page 99 and 100:
Page 101 and 102:
Page 103 and 104:
Page 105 and 106:
Page 107 and 108: Technion - Computer Science Departm
Page 157: Technion - Computer Science Departm
show all

anytime algorithms for learning anytime classifiers saher ... - Technion

Create successful ePaper yourself

Delete template?

Save as template?