[Figure 2.5. Joint probability of u and n_B for Bill and Fred's urn problem, after N = 10 draws.]

The marginal probability of u is P(u) = 1/11 for all u. You wrote down the probability of n_B given u and N, P(n_B | u, N), when you solved exercise 2.4 (p.27). [You are doing the highly recommended exercises, aren't you?] If we define f_u \equiv u/10 then

    P(n_B \mid u, N) = \binom{N}{n_B} f_u^{n_B} (1 - f_u)^{N - n_B}.    (2.23)

What about the denominator, P(n_B | N)? This is the marginal probability of n_B, which we can obtain using the sum rule:

    P(n_B \mid N) = \sum_u P(u, n_B \mid N) = \sum_u P(u) P(n_B \mid u, N).    (2.24)

So the conditional probability of u given n_B is

    P(u \mid n_B, N) = \frac{P(u) P(n_B \mid u, N)}{P(n_B \mid N)}    (2.25)
                     = \frac{1}{P(n_B \mid N)} \frac{1}{11} \binom{N}{n_B} f_u^{n_B} (1 - f_u)^{N - n_B}.    (2.26)

This conditional distribution can be found by normalizing column 3 of figure 2.5 and is shown in figure 2.6. The normalizing constant, the marginal probability of n_B, is P(n_B = 3 | N = 10) = 0.083. The posterior probability (2.26) is correct for all u, including the end-points u = 0 and u = 10, where f_u = 0 and f_u = 1 respectively. The posterior probability that u = 0 given n_B = 3 is equal to zero, because if Fred were drawing from urn 0 it would be impossible for any black balls to be drawn. The posterior probability that u = 10 is also zero, because there are no white balls in that urn. The other hypotheses u = 1, u = 2, ..., u = 9 all have non-zero posterior probability. ✷

[Figure 2.6. Conditional probability of u given n_B = 3 and N = 10.]

     u    P(u | n_B = 3, N)
     0    0
     1    0.063
     2    0.22
     3    0.29
     4    0.24
     5    0.13
     6    0.047
     7    0.0099
     8    0.00086
     9    0.0000096
    10    0

Terminology of inverse probability

In inverse probability problems it is convenient to give names to the probabilities appearing in Bayes' theorem. In equation (2.25), we call the marginal probability P(u) the prior probability of u, and P(n_B | u, N) is called the likelihood of u. It is important to note that the terms likelihood and probability are not synonyms. The quantity P(n_B | u, N) is a function of both n_B and u. For fixed u, P(n_B | u, N) defines a probability over n_B. For fixed n_B, P(n_B | u, N) defines the likelihood of u.
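As a minimal numerical sketch (not part of the book's text), the following Python fragment recomputes the quantities just named for the urn example: the prior P(u), the likelihood (2.23), the evidence (2.24), and the posterior (2.25). The variable names prior, likelihood, evidence, and posterior are illustrative choices, not the book's notation.

    from math import comb

    N, n_B = 10, 3
    us = range(11)                    # hypotheses u = 0, 1, ..., 10
    prior = [1/11] * 11               # P(u): uniform over the eleven urns

    # likelihood P(n_B | u, N), equation (2.23), with f_u = u/10
    likelihood = [comb(N, n_B) * (u/10)**n_B * (1 - u/10)**(N - n_B) for u in us]

    # evidence P(n_B | N), the normalizing constant, equation (2.24)
    evidence = sum(p * l for p, l in zip(prior, likelihood))

    # posterior P(u | n_B, N), equation (2.25)
    posterior = [p * l / evidence for p, l in zip(prior, likelihood)]

    print(f"P(n_B = 3 | N = 10) = {evidence:.3f}")      # prints 0.083
    for u, pu in zip(us, posterior):
        print(f"u = {u:2d}   P(u | n_B = 3, N = 10) = {pu:.4g}")

Running this reproduces the normalizing constant P(n_B = 3 | N = 10) = 0.083 and the column of posterior probabilities in figure 2.6; in particular the end-point hypotheses u = 0 and u = 10 come out with posterior probability exactly zero, in agreement with the argument above.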
