Information Theory, Inference, and Learning ... - Inference Group

Copyright Cambridge University Press 2003. On-screen viewing permitted. Printing not permitted. http://www.cambridge.org/0521642981
You can buy this book for 30 pounds or $50. See http://www.inference.phy.cam.ac.uk/mackay/itila/ for links.

About Chapter 1

In the first chapter, you will need to be familiar with the binomial distribution. And to solve the exercises in the text – which I urge you to do – you will need to know Stirling's approximation for the factorial function, x! \simeq x^x e^{-x}, and be able to apply it to \binom{N}{r} = \frac{N!}{(N-r)!\, r!}. These topics are reviewed below.

The binomial distribution

Example 1.1. A bent coin has probability f of coming up heads. The coin is tossed N times. What is the probability distribution of the number of heads, r? What are the mean and variance of r?

[Margin note: Unfamiliar notation? See Appendix A, p.598.]

Solution. The number of heads has a binomial distribution,

    P(r | f, N) = \binom{N}{r} f^r (1-f)^{N-r}.    (1.1)

The mean, E[r], and variance, var[r], of this distribution are defined by

    E[r] \equiv \sum_{r=0}^{N} P(r | f, N)\, r    (1.2)

    var[r] \equiv E[(r - E[r])^2]    (1.3)
           = E[r^2] - (E[r])^2 = \sum_{r=0}^{N} P(r | f, N)\, r^2 - (E[r])^2.    (1.4)

[Figure 1.1. The binomial distribution P(r | f = 0.3, N = 10).]

Rather than evaluating the sums over r in (1.2) and (1.4) directly, it is easiest to obtain the mean and variance by noting that r is the sum of N independent random variables, namely, the number of heads in the first toss (which is either zero or one), the number of heads in the second toss, and so forth. In general,

    E[x + y] = E[x] + E[y]      for any random variables x and y;
    var[x + y] = var[x] + var[y]    if x and y are independent.    (1.5)

So the mean of r is the sum of the means of those random variables, and the variance of r is the sum of their variances.
The mean number of heads in a single toss is f \times 1 + (1-f) \times 0 = f, and the variance of the number of heads in a single toss is

    [f \times 1^2 + (1-f) \times 0^2] - f^2 = f - f^2 = f(1-f),    (1.6)

so the mean and variance of r are:

    E[r] = Nf    and    var[r] = Nf(1-f).  ✷    (1.7)
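As a quick numerical sanity check (a sketch, not part of the book), the snippet below evaluates the defining sums (1.2) and (1.4) for the distribution of Figure 1.1 (f = 0.3, N = 10) and compares them with the closed forms E[r] = Nf and var[r] = Nf(1-f) from (1.7); it also illustrates the sense in which Stirling's crude approximation x! ≃ x^x e^{-x} captures the leading behaviour of ln x! (the missing factor is √(2πx), which is subdominant for large x):

```python
from math import comb, log, lgamma

def binomial_pmf(r, f, N):
    """P(r | f, N) = binom(N, r) f^r (1-f)^(N-r), equation (1.1)."""
    return comb(N, r) * f**r * (1 - f)**(N - r)

f, N = 0.3, 10

# Mean and variance computed directly from the sums (1.2) and (1.4).
mean = sum(binomial_pmf(r, f, N) * r for r in range(N + 1))
var = sum(binomial_pmf(r, f, N) * r**2 for r in range(N + 1)) - mean**2

print(mean, N * f)           # both equal E[r]  = Nf       = 3.0
print(var, N * f * (1 - f))  # both equal var[r] = Nf(1-f) = 2.1

# Stirling's crude approximation x! ~ x^x e^{-x}: the relative error
# in ln x! shrinks as x grows.
x = 50
ln_exact = lgamma(x + 1)    # ln 50! (exact, via the log-gamma function)
ln_approx = x * log(x) - x  # ln(x^x e^{-x})
print(ln_exact, ln_approx)  # ~148.48 vs ~145.60, about 2% relative error
```

Summing the probability-weighted values reproduces Nf and Nf(1-f) exactly (up to floating-point error), confirming that the additivity argument above gives the same answer as the direct sums.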
