Machine Learning
Tom M. Mitchell

Contents:
- Preface
- 1. Introduction
- 2. Concept Learning and the General-to-Specific Ordering
- 3. Decision Tree Learning
- 4. Artificial Neural Networks
- 5. Evaluating Hypotheses
- 6. Bayesian Learning
- 7. Computational Learning Theory
- 8. Instance-Based Learning
- 9. Genetic Algorithms
- 10. Learning Sets of Rules
- 11. Analytical Learning
- 12. Combining Inductive and Analytical Learning
- 13. Reinforcement Learning
- Subject Index