2 — Probability, Entropy, and Inference

⊲ Exercise 2.39. [3C, p.46] The frequency $p_n$ of the $n$th most frequent word in English is roughly approximated by
$$
p_n \simeq \begin{cases}
\dfrac{0.1}{n} & \text{for } n \in 1, \ldots, 12\,367 \\[4pt]
0 & \text{for } n > 12\,367.
\end{cases}
\tag{2.56}
$$
[This remarkable $1/n$ law is known as Zipf's law, and applies to the word frequencies of many languages (Zipf, 1949).] If we assume that English is generated by picking words at random according to this distribution, what is the entropy of English (per word)? [This calculation can be found in 'Prediction and entropy of printed English', C. E. Shannon, Bell Syst. Tech. J. 30, pp. 50–64 (1950), but, inexplicably, the great man made numerical errors in it.]

2.10 Solutions

Solution to exercise 2.2 (p.24). No, they are not independent. If they were, then all the conditional distributions $P(y \mid x)$ would be identical functions of $y$, regardless of $x$ (cf. figure 2.3).

Solution to exercise 2.4 (p.27). We define the fraction $f_B \equiv B/K$.

(a) The number of black balls has a binomial distribution,
$$
P(n_B \mid f_B, N) = \binom{N}{n_B} f_B^{n_B} (1 - f_B)^{N - n_B}.
\tag{2.57}
$$

(b) The mean and variance of this distribution are
$$
E[n_B] = N f_B
\tag{2.58}
$$
$$
\mathrm{var}[n_B] = N f_B (1 - f_B).
\tag{2.59}
$$
These results were derived in example 1.1 (p.1). The standard deviation of $n_B$ is $\sqrt{\mathrm{var}[n_B]} = \sqrt{N f_B (1 - f_B)}$.

When $B/K = 1/5$ and $N = 5$, the expectation and variance of $n_B$ are 1 and 4/5. The standard deviation is 0.89.

When $B/K = 1/5$ and $N = 400$, the expectation and variance of $n_B$ are 80 and 64. The standard deviation is 8.

Solution to exercise 2.5 (p.27). The numerator of the quantity
$$
z = \frac{(n_B - f_B N)^2}{N f_B (1 - f_B)}
$$
can be recognized as $(n_B - E[n_B])^2$; the denominator is equal to the variance of $n_B$ (2.59), which is by definition the expectation of the numerator. So the expectation of $z$ is 1. [A random variable like $z$, which measures the deviation of data from the expected value, is sometimes called $\chi^2$ (chi-squared).]

In the case $N = 5$ and $f_B = 1/5$, $N f_B$ is 1 and $\mathrm{var}[n_B]$ is 4/5. The numerator has five possible values, only one of which is smaller than 1: $(n_B - f_B N)^2 = 0$ has probability $P(n_B = 1) = 0.4096$, so the probability that $z < 1$ is 0.4096.
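The numbers quoted in the solutions to exercises 2.4 and 2.5 are easy to check numerically. The short Python sketch below is an illustrative aside, not part of the book; it assumes Python 3.8+ (for `math.comb`). It evaluates the binomial mean, variance, and standard deviation for the two cases, then verifies that $E[z] = 1$ and that $P(z < 1) = 0.4096$ when $N = 5$ and $f_B = 1/5$.

```python
from math import comb, sqrt

def binomial_pmf(n_B, N, f_B):
    """P(n_B | f_B, N) as in equation (2.57)."""
    return comb(N, n_B) * f_B**n_B * (1 - f_B)**(N - n_B)

f_B = 1 / 5

# Solution 2.4: mean N f_B and variance N f_B (1 - f_B), equations (2.58)-(2.59).
for N in (5, 400):
    mean = N * f_B
    var = N * f_B * (1 - f_B)
    print(f"N = {N:3d}: mean = {mean:g}, var = {var:g}, std = {sqrt(var):.2f}")

# Solution 2.5: z = (n_B - f_B N)^2 / (N f_B (1 - f_B)).
# Its expectation should be 1, and P(z < 1) should be 0.4096 for N = 5.
N = 5
var = N * f_B * (1 - f_B)
E_z = sum(binomial_pmf(n, N, f_B) * (n - f_B * N) ** 2 for n in range(N + 1)) / var
P_z_lt_1 = sum(binomial_pmf(n, N, f_B)
               for n in range(N + 1)
               if (n - f_B * N) ** 2 / var < 1)
print(f"E[z]     = {E_z:.4f}")      # expect 1.0000
print(f"P(z < 1) = {P_z_lt_1:.4f}")  # expect 0.4096
```

Enumerating all six outcomes $n_B \in \{0, \ldots, 5\}$ is enough here, since only $n_B = 1$ makes the squared deviation smaller than the variance.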

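Exercise 2.39 can likewise be explored numerically before consulting the solution on p.46. The following sketch, again an illustrative aside rather than the book's own working, evaluates the entropy per word of the distribution in equation (2.56) directly; the printed sum of $p_n$ shows how close the stated approximation is to a properly normalized distribution, so renormalizing makes a negligible difference.

```python
import math

# Zipf approximation from equation (2.56): p_n ~ 0.1/n for n = 1, ..., 12367,
# and p_n = 0 for larger n.
N_WORDS = 12_367
p = [0.1 / n for n in range(1, N_WORDS + 1)]

# Sanity check: the probabilities as given should sum to (very nearly) 1.
print(f"sum of p_n     = {sum(p):.5f}")

# Entropy per word in bits: H = -sum_n p_n log2 p_n.
H = -sum(pn * math.log2(pn) for pn in p)
print(f"entropy (bits) = {H:.2f}")
```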