[...] the whole distribution. We may also be interested in its normalizing constant P(D | H) if we wish to do model comparison. The probability distribution P(w | D, H) is often a complex distribution. In a variational approach to inference, we introduce an approximating probability distribution over the parameters, Q(w; θ), and optimize this distribution (by varying its own parameters θ) so that it approximates the posterior distribution of the parameters P(w | D, H) well.

One objective function we may choose to measure the quality of the approximation is the variational free energy

$$
\tilde{F}(\theta) = \int d^k w \; Q(w;\theta) \, \ln \frac{Q(w;\theta)}{P(D \mid w, H)\, P(w \mid H)} . \tag{33.34}
$$

The denominator P(D | w, H)P(w | H) is, within a multiplicative constant, the posterior probability P(w | D, H) = P(D | w, H)P(w | H)/P(D | H). So the variational free energy $\tilde{F}(\theta)$ can be viewed as the sum of −ln P(D | H) and the relative entropy between Q(w; θ) and P(w | D, H). $\tilde{F}(\theta)$ is bounded below by −ln P(D | H) and only attains this value for Q(w; θ) = P(w | D, H). For certain models and certain approximating distributions, this free energy, and its derivatives with respect to the approximating distribution's parameters, can be evaluated.

The approximation of posterior probability distributions using variational free energy minimization provides a useful approach to approximating Bayesian inference in a number of fields ranging from neural networks to the decoding of error-correcting codes (Hinton and van Camp, 1993; Hinton and Zemel, 1994; Dayan et al., 1995; Neal and Hinton, 1998; MacKay, 1995a). The method is sometimes called ensemble learning to contrast it with traditional learning processes in which a single parameter vector is optimized. Another name for it is variational Bayes. Let us examine how ensemble learning works in the simple case of a Gaussian distribution.

33.5 The case of an unknown Gaussian: approximating the posterior distribution of µ and σ

We will fit an approximating ensemble Q(µ, σ) to the posterior distribution that we studied in Chapter 24,

$$
P(\mu, \sigma \mid \{x_n\}_{n=1}^N) = \frac{P(\{x_n\}_{n=1}^N \mid \mu, \sigma)\, P(\mu, \sigma)}{P(\{x_n\}_{n=1}^N)} \tag{33.35}
$$

$$
= \frac{\displaystyle \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\!\left( -\frac{N(\mu-\bar{x})^2 + S}{2\sigma^2} \right) \frac{1}{\sigma_\mu} \frac{1}{\sigma}}{P(\{x_n\}_{n=1}^N)} . \tag{33.36}
$$

We make the single assumption that the approximating ensemble is separable, in the form Q(µ, σ) = Q_µ(µ) Q_σ(σ). No restrictions on the functional form of Q_µ(µ) and Q_σ(σ) are made.

We write down a variational free energy,

$$
\tilde{F}(Q) = \int d\mu \, d\sigma \; Q_\mu(\mu) Q_\sigma(\sigma) \, \ln \frac{Q_\mu(\mu) Q_\sigma(\sigma)}{P(D \mid \mu, \sigma)\, P(\mu, \sigma)} . \tag{33.37}
$$

We can find the optimal separable distribution Q by considering separately the optimization of $\tilde{F}$ over Q_µ(µ) for fixed Q_σ(σ), and then the optimization of Q_σ(σ) for fixed Q_µ(µ).
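The alternating optimization described in the last sentence can be illustrated numerically without knowing the analytic forms of the optimal factors: discretize µ and σ on a grid and repeatedly set each factor proportional to the exponential of the other factor's average of ln[P(D | µ, σ)P(µ, σ)]. The sketch below is not from the book; the synthetic data, grid ranges, iteration count, and names such as log_joint, q_mu, and q_sigma are illustrative choices, and the prior is the improper 1/σ prior of equation (33.36) with the constant 1/σ_µ dropped.

```python
import numpy as np

# Synthetic data: N samples from a Gaussian (true mean and width are arbitrary).
rng = np.random.default_rng(0)
N = 10
x = rng.normal(loc=1.0, scale=2.0, size=N)
xbar = x.mean()
S = ((x - xbar) ** 2).sum()

# Grids over mu and sigma; the ranges are chosen by hand to cover the posterior mass.
mu = np.linspace(-4.0, 6.0, 400)
sigma = np.linspace(0.2, 8.0, 400)
dmu, dsigma = mu[1] - mu[0], sigma[1] - sigma[0]
MU, SIGMA = np.meshgrid(mu, sigma, indexing="ij")

# ln[ P(D | mu, sigma) P(mu, sigma) ] as in (33.36), with the improper 1/sigma prior
# and the constant 1/sigma_mu omitted.
log_joint = (-0.5 * N * np.log(2 * np.pi * SIGMA**2)
             - (N * (MU - xbar) ** 2 + S) / (2 * SIGMA**2)
             - np.log(SIGMA))

def normalize(log_q, step):
    """Convert log values on a grid to a density with q.sum() * step == 1."""
    log_q = log_q - log_q.max()          # guard against under/overflow
    q = np.exp(log_q)
    return q / (q.sum() * step)

# Start from a uniform Q_sigma; Q_mu is set on the first sweep.
q_sigma = np.full_like(sigma, 1.0 / (len(sigma) * dsigma))

# Coordinate ascent: each factor becomes exp of the other factor's average log joint.
for _ in range(50):
    q_mu = normalize((log_joint * q_sigma[None, :]).sum(axis=1) * dsigma, dmu)
    q_sigma = normalize((log_joint * q_mu[:, None]).sum(axis=0) * dmu, dsigma)

# Variational free energy F(Q) = sum Q ln[ Q / (P(D|mu,sigma)P(mu,sigma)) ] on the grid.
Q = q_mu[:, None] * q_sigma[None, :]
mask = Q > 0
F = (Q[mask] * (np.log(Q[mask]) - log_joint[mask])).sum() * dmu * dsigma

# The bound: F(Q) >= -ln P(D | H), with the evidence estimated by the same grid sum.
log_evidence = np.log(np.exp(log_joint).sum() * dmu * dsigma)
print(f"F(Q) = {F:.3f}  >=  -ln P(D|H) = {-log_evidence:.3f}")
print(f"E_Q[mu] = {(q_mu * mu).sum() * dmu:.3f}, sample mean = {xbar:.3f}")
```

Each sweep performs exactly the two steps named above: optimize $\tilde{F}$ over Q_µ(µ) for fixed Q_σ(σ), then over Q_σ(σ) for fixed Q_µ(µ), so $\tilde{F}(Q)$ cannot increase, and the printed value stays above −ln P(D | H), in line with the bound stated after (33.34).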
