anytime algorithms for learning anytime classifiers saher ... - Technion

More documents

Recommendations

Info

Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008 for τ is the runtime of LSID3(r = 1). In many cases we would like to stop the algorithm earlier. When we do, the sequencing approach will return the initial greedy tree. Our interruptible anytime framework, called IIDT, overcomes these problems by iteratively improving the tree rather than trying to rebuild it. 3.6.2 Iterative Improvement of Decision Trees Iterative improvement approaches that start with an initial solution and repeatedly modify it have been successfully applied to many AI domains, such as CSP, timetabling, and combinatorial problems (Minton, Johnston, Philips, & Laird, 1992; Johnson, Aragon, McGeoch, & Schevon, 1991; Schaerf, 1996). The key idea behind iterative improvement techniques is to choose an element of the current suboptimal solution and improve it by local repairs. IIDT adopts the above approach for decision tree learning and iteratively revises subtrees. It can serve as an interruptible anytime algorithm because, when interrupted, it can immediately return the currently best solution available. As in LSID3, IIDT exploits additional resources in an attempt to produce better decision trees. The principal difference between the algorithms is that LSID3 uses the available resources to induce a decision tree top-down, where each decision made at a node is final and does not change. IIDT, however, is not allocated resources in advance and uses extra time resources to modify split decisions repeatedly. IIDT receives a set of examples and a set of attributes. It first performs a quick construction of an initial decision tree by calling ID3. Then it iteratively attempts to improve the current tree by choosing a node and rebuilding its subtree with more resources than those used previously. If the newly induced subtree is better, it will replace the existing one. IIDT is formalized in Figure 3.12. Figure 3.13 illustrates how IIDT works. The target concept is a1 ⊕ a2, with two additional irrelevant attributes, a3 and a4. The leftmost tree was constructed using ID3. In the first iteration, the subtree rooted at the bolded node is selected for improvement and replaced by a smaller tree (surrounded by a dashed line). Next, the root is selected for improvement and the whole tree is replaced by a tree that perfectly describes the concept. While in this particular example IIDT first chooses to rebuild a subtree at depth 2 and then at depth 1, it considers all subtrees, regardless of their level. IIDT is designed as a general framework for interruptible learning of decision trees. It can use different approaches for choosing which node to improve, for allocating resources for an improvement iteration, for rebuilding a subtree, and for deciding whether an alternative subtree is better. After the subtree to be rebuilt is chosen and the resources for a reconstruction iteration allocated, the problem becomes a task for a contract algorithm. A good 36
Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008 Procedure IIDT(E, A) T ← Greedy-Build-Tree(E, A) While not-interrupted node ← Choose-Node(T, E, A) t ← subtree of T rooted at node r ← Next-R(node) Enode ← Examples-At(node, T, E) Anode ← Attributes-At(node, T, A) t ′ ← Rebuild-Tree(Enode, Anode, r) If Better(t ′ , t) replace t with t ′ Return T Procedure Examples-At(node, T, E) Return {e ∈ E | e reaches node} Procedure Attributes-At(node, T, A) Return {a ∈ A | a /∈ ancestor of node}∪ {a ∈ A | a is numeric} Procedure Next-R(node) If Last-r(node) = 0 Return 1 Else Return 2 · Last-R(node) Figure 3.12: IIDT: framework for interruptible induction of decision trees candidate for such an algorithm is LSID3, which is expected to produce better subtrees when invoked with a higher resource allocation. In what follows we focus on the different components of IIDT and suggest a possible implementation that uses LSID3 for revising subtrees. Choosing a Subtree to Improve Intuitively, the next node we would like to improve is the one with the highest expected marginal utility, i.e., the one with the highest ratio between the expected benefit and the expected cost (Horvitz, 1990; Russell & Wefald, 1989). Estimating the expected gain and expected cost of rebuilding a subtree is a difficult problem. There is no apparent way to estimate the expected improvement in terms of either 37
Page 1 and 2: Technion - Computer Science Departm
Page 51: Technion - Computer Science Departm
Page 103 and 104:
Technion - Computer Science Departm
Page 105 and 106:
Page 107 and 108:
Page 109 and 110:
Page 111 and 112:
Page 113 and 114:
Page 115 and 116:
Page 117 and 118:
Page 119 and 120:
Page 121 and 122:
Page 123 and 124:
Page 125 and 126:
Page 127 and 128:
Page 129 and 130:
Page 131 and 132:
Page 133 and 134:
Page 135 and 136:
Page 137 and 138:
Page 139 and 140:
Page 141 and 142:
Page 143 and 144:
Page 145 and 146:
Page 147 and 148:
Page 149 and 150:
Page 151 and 152:
Page 153 and 154:
Page 155 and 156:
Page 157 and 158:
Page 159 and 160:
Page 161 and 162:
Page 163 and 164:
Page 165 and 166:
Page 167 and 168:
Page 169 and 170:
Page 171 and 172:
Page 173 and 174:
Page 175 and 176:
Page 177 and 178:
Page 179 and 180:
Page 181 and 182:
Page 183 and 184:
Page 185 and 186:
Page 187 and 188:
Page 189 and 190:
Page 191 and 192:
show all

anytime algorithms for learning anytime classifiers saher ... - Technion

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?