

Exercise 8.7.[2, p.143] Consider the ensemble XYZ in which A_X = A_Y = A_Z = {0, 1}, x and y are independent with P_X = {p, 1 − p} and P_Y = {q, 1 − q}, and

    z = (x + y) mod 2.    (8.13)

(a) If q = 1/2, what is P_Z? What is I(Z; X)?

(b) For general p and q, what is P_Z? What is I(Z; X)? Notice that this ensemble is related to the binary symmetric channel, with x = input, y = noise, and z = output.
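As a quick numerical check on this exercise (not part of the original text: the function names and the use of NumPy are my own choices), the sketch below builds the joint distribution of (x, z) for given p and q and evaluates P_Z and I(Z; X) directly from the definitions, rather than in closed form.

    import numpy as np

    def xor_ensemble(p, q):
        # Joint P(x, z) for x ~ {p, 1-p}, y ~ {q, 1-q} independent, z = (x + y) mod 2.
        P_x = np.array([p, 1.0 - p])
        P_y = np.array([q, 1.0 - q])
        P_xz = np.zeros((2, 2))
        for x in (0, 1):
            for y in (0, 1):
                P_xz[x, (x + y) % 2] += P_x[x] * P_y[y]
        return P_x, P_xz

    def entropy(dist):
        dist = np.asarray(dist, dtype=float)
        dist = dist[dist > 0]
        return -np.sum(dist * np.log2(dist))

    def P_Z_and_I_ZX(p, q):
        P_x, P_xz = xor_ensemble(p, q)
        P_z = P_xz.sum(axis=0)
        # I(Z; X) = H(Z) - H(Z|X), with H(Z|X) = sum_x P(x) H(Z | X = x).
        H_Z_given_X = sum(P_x[x] * entropy(P_xz[x] / P_x[x]) for x in (0, 1) if P_x[x] > 0)
        return P_z, entropy(P_z) - H_Z_given_X

    print(P_Z_and_I_ZX(p=0.9, q=0.5))   # expect P_Z = (0.5, 0.5) and I(Z; X) = 0
    print(P_Z_and_I_ZX(p=0.9, q=0.1))   # noisy channel: I(Z; X) lies between 0 and H(X)

For q = 1/2 the printout should show P_Z uniform and I(Z; X) = 0 regardless of p, matching part (a): a binary symmetric channel whose noise flips the bit half the time conveys nothing about its input.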

[Figure 8.2: a Venn-diagram-style picture labelled 'Three term entropies', with regions marked H(X), H(Y), H(X|Y), I(X;Y), H(Y|X), and H(X,Y).]

Exercise 8.8.[3, p.143] Many texts draw figure 8.1 in the form of a Venn diagram (figure 8.2). Discuss why this diagram is a misleading representation of entropies. Hint: consider the three-variable ensemble XYZ in which x ∈ {0, 1} and y ∈ {0, 1} are independent binary variables and z ∈ {0, 1} is defined to be z = (x + y) mod 2.
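One concrete way to explore the hint (again my own illustration, not the book's solution, assuming p = q = 1/2 and base-2 logarithms) is to compute the three-variable entropies by brute force and look at the quantity I(X;Y) − I(X;Y|Z), which the Venn picture suggests should be a non-negative central area.

    import itertools
    import numpy as np

    # Joint distribution of (x, y, z): x, y independent fair bits, z = (x + y) mod 2.
    P = {(x, y, (x + y) % 2): 0.25 for x, y in itertools.product((0, 1), repeat=2)}

    def H(*coords):
        # Entropy (in bits) of the marginal distribution over the chosen coordinates.
        marginal = {}
        for outcome, prob in P.items():
            key = tuple(outcome[i] for i in coords)
            marginal[key] = marginal.get(key, 0.0) + prob
        return -sum(p * np.log2(p) for p in marginal.values() if p > 0)

    X, Y, Z = 0, 1, 2
    I_XY = H(X) + H(Y) - H(X, Y)                            # I(X;Y) = 0
    I_XY_given_Z = H(X, Z) + H(Y, Z) - H(Z) - H(X, Y, Z)    # I(X;Y|Z) = 1 bit
    print(I_XY, I_XY_given_Z, I_XY - I_XY_given_Z)          # would-be central 'area' = -1 bit

Here I(X;Y) = 0 while I(X;Y|Z) = 1 bit, so the would-be central region comes out at −1 bit: conditioning on Z creates dependence between X and Y, something a diagram of non-negative areas cannot represent.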

8.3 Further exercises<br />

The data-processing theorem<br />

The data-processing theorem states that data processing can only destroy information.

Exercise 8.9.[3, p.144] Prove this theorem by considering an ensemble WDR in which w is the state of the world, d is data gathered, and r is the processed data, so that these three variables form a Markov chain

    w → d → r,    (8.14)

that is, the probability P(w, d, r) can be written as

    P(w, d, r) = P(w) P(d | w) P(r | d).    (8.15)

Show that the average information that R conveys about W, I(W; R), is less than or equal to the average information that D conveys about W, I(W; D).

This theorem is as much a caution about our definition of 'information' as it is a caution about data processing!
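The inequality is also easy to check numerically. The following sketch (my own, with arbitrary alphabet sizes and a fixed random seed as assumptions) draws a random chain P(w)P(d|w)P(r|d) as in equation (8.15) and confirms that I(W; R) ≤ I(W; D) for that instance; it is a sanity check, not a proof.

    import numpy as np

    rng = np.random.default_rng(0)

    def random_dist(*shape):
        # Random stochastic array: the last axis sums to 1.
        m = rng.random(shape)
        return m / m.sum(axis=-1, keepdims=True)

    def mutual_info(P_ab):
        # I(A; B) in bits, computed from a joint distribution matrix P(a, b).
        P_a = P_ab.sum(axis=1, keepdims=True)
        P_b = P_ab.sum(axis=0, keepdims=True)
        mask = P_ab > 0
        return float(np.sum(P_ab[mask] * np.log2(P_ab[mask] / (P_a * P_b)[mask])))

    # A random Markov chain w -> d -> r with P(w, d, r) = P(w) P(d|w) P(r|d).
    P_w = random_dist(4)             # P(w)
    P_d_given_w = random_dist(4, 5)  # P(d | w), rows sum to 1
    P_r_given_d = random_dist(5, 3)  # P(r | d), rows sum to 1

    P_wd = P_w[:, None] * P_d_given_w   # P(w, d)
    P_wr = P_wd @ P_r_given_d           # P(w, r) = sum_d P(w, d) P(r | d)

    # Data-processing inequality: I(W; R) <= I(W; D).
    print(mutual_info(P_wr), "<=", mutual_info(P_wd))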

Figure 8.2. A misleading representation of entropies (contrast with figure 8.1).
