2.3: Forward probabilities and inverse probabilities

You may also find it instructive to look back at example 2.6 (p.27) and equation (2.31).

People sometimes confuse assigning a prior distribution to an unknown parameter such as f_H with making an initial guess of the value of the parameter. But the prior over f_H, P(f_H), is not a simple statement like 'initially, I would guess f_H = 1/2'. The prior is a probability density over f_H which specifies the prior degree of belief that f_H lies in any interval (f, f + δf). It may well be the case that our prior for f_H is symmetric about 1/2, so that the mean of f_H under the prior is 1/2. In this case, the predictive distribution for the first toss x_1 would indeed be

\[
P(x_1 = \mathrm{head}) = \int \mathrm{d}f_H \, P(f_H) \, P(x_1 = \mathrm{head} \mid f_H)
                       = \int \mathrm{d}f_H \, P(f_H) \, f_H = \tfrac{1}{2}.
\tag{2.33}
\]

But the prediction for subsequent tosses will depend on the whole prior distribution, not just its mean.

Data compression and inverse probability

Consider the following task.

Example 2.9. Write a computer program capable of compressing binary files like this one:

000000000000000000001001000100000010000001000000000000000000000000000000000000101000000000000011000010000000000100001000000000100000000000000000000001000000000000000001000000000110000010000000110001000000000001001000000000010001000000000000000011000000000000000000000000000010000000000000000100000000

The string shown contains n_1 = 29 1s and n_0 = 271 0s.

Intuitively, compression works by taking advantage of the predictability of a file. In this case, the source of the file appears more likely to emit 0s than 1s. A data compression program that compresses this file must, implicitly or explicitly, be addressing the question 'What is the probability that the next character in this file is a 1?'

Do you think this problem is similar in character to example 2.7 (p.30)? I do. One of the themes of this book is that data compression and data modelling are one and the same, and that they should both be addressed, like the urn of example 2.6, using inverse probability. Example 2.9 is solved in Chapter 6.

The likelihood principle

Please solve the following two exercises.

Example 2.10. Urn A contains three balls: one black, and two white; urn B contains three balls: two black, and one white. One of the urns is selected at random and one ball is drawn. The ball is black. What is the probability that the selected urn is urn A?

Example 2.11. Urn A contains five balls: one black, two white, one green and one pink; urn B contains five hundred balls: two hundred black, one hundred white, 50 yellow, 40 cyan, 30 sienna, 25 green, 25 silver, 20 gold, and 10 purple. [One fifth of A's balls are black; two-fifths of B's are black.] One of the urns is selected at random and one ball is drawn. The ball is black. What is the probability that the urn is urn A?

Figure 2.7. Urns for example 2.10.

Figure 2.8. Urns for example 2.11.
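To make the remark following equation (2.33) concrete, here is a minimal numerical sketch (not from the book). It assumes two particular symmetric priors over f_H, a uniform one and a Beta(20,20)-shaped one concentrated near 1/2, and checks that both give P(x_1 = head) = 1/2 while disagreeing about the second toss once a head has been observed. The function name and the grid-based integration are illustrative choices only.

```python
import numpy as np

def predictive_head(prior_pdf, n_heads=0, n_tails=0, n_grid=100_001):
    """P(next toss = head | observed counts): the posterior mean of f_H,
    computed on a grid so no conjugacy formulas are needed."""
    f = np.linspace(0.0, 1.0, n_grid)
    w = prior_pdf(f) * f**n_heads * (1.0 - f)**n_tails   # unnormalised posterior
    w = w / w.sum()                                       # grid spacing cancels
    return float((w * f).sum())                           # E[f_H | data]

priors = {
    "uniform":          lambda f: np.ones_like(f),
    "peaked about 1/2": lambda f: (f * (1.0 - f)) ** 19,  # Beta(20,20)-shaped
}

for name, p in priors.items():
    first  = predictive_head(p)                 # eq. (2.33): 1/2 in both cases
    second = predictive_head(p, n_heads=1)      # prediction for the second toss
    print(f"{name:18s} P(x1=head) = {first:.3f}  "
          f"P(x2=head | x1=head) = {second:.3f}")
```

Both priors have mean 1/2, so the first column agrees with (2.33); the second column differs (about 0.667 versus 0.512), which is the sense in which the predictions for subsequent tosses depend on the whole prior, not just its mean.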
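The guiding question of example 2.9, 'What is the probability that the next character in this file is a 1?', can also be given a quick quantitative feel. The sketch below is an assumption-laden stand-in, not the book's Chapter 6 solution: it adopts Laplace's rule, P(next = 1) = (n_1 + 1)/(n_0 + n_1 + 2), as the predictive model and sums the ideal codelength -log2 P for each symbol. Because that model looks only at the counts seen so far, the total is the same for any ordering of 29 1s and 271 0s, so the example string is replaced here by a convenient reordering.

```python
from math import log2

def ideal_codelength(bits: str) -> float:
    """Sum of -log2 P(x_t | x_1 ... x_{t-1}) under Laplace's rule."""
    n0 = n1 = 0
    total = 0.0
    for c in bits:
        p1 = (n1 + 1) / (n0 + n1 + 2)          # predictive probability of a '1'
        total += -log2(p1 if c == "1" else 1.0 - p1)
        if c == "1":
            n1 += 1
        else:
            n0 += 1
    return total

# Any 300-character string with 29 1s and 271 0s gives the same total,
# because the model only ever conditions on the counts seen so far.
example = "1" * 29 + "0" * 271
print(f"{ideal_codelength(example):.1f} bits for {len(example)} characters")
```

The total comes out well under the 300 bits needed to store the raw characters one bit each, which matches the intuition that a file dominated by 0s is compressible; the proper treatment, and its connection to inverse probability, is the subject of Chapter 6.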
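Finally, for readers who want to check their answers to examples 2.10 and 2.11 after working them by hand, here is a small checker (not from the book): it simply applies Bayes' theorem with the two urns equally probable a priori, taking as input the probability of drawing a black ball from each urn.

```python
from fractions import Fraction

def posterior_urn_A(p_black_given_A, p_black_given_B, prior_A=Fraction(1, 2)):
    """P(urn = A | ball is black) by Bayes' theorem, with P(A) = prior_A."""
    joint_A = prior_A * p_black_given_A
    joint_B = (1 - prior_A) * p_black_given_B
    return joint_A / (joint_A + joint_B)

# Example 2.10: 1 of A's 3 balls is black; 2 of B's 3 balls are black.
print(posterior_urn_A(Fraction(1, 3), Fraction(2, 3)))
# Example 2.11: one fifth of A's balls are black; two fifths of B's are black.
print(posterior_urn_A(Fraction(1, 5), Fraction(2, 5)))
```

Note that only the probabilities of the observed outcome, a black ball, enter the calculation; the other colours in urn B play no role.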
