Information Theory, Inference, and Learning ... - Inference Group

Copyright Cambridge University Press 2003. On-screen viewing permitted. Printing not permitted. http://www.cambridge.org/0521642981
You can buy this book for 30 pounds or $50. See http://www.inference.phy.cam.ac.uk/mackay/itila/ for links.

2.10: Solutions

Solution to exercise 2.28 (p.38).

    H(X) = H_2(f) + f H_2(g) + (1 − f) H_2(h).        (2.99)

Solution to exercise 2.29 (p.38). The probability that there are x − 1 tails and then one head (so we get the first head on the xth toss) is

    P(x) = (1 − f)^{x−1} f.        (2.100)

If the first toss is a tail, the probability distribution for the future looks just like it did before we made the first toss. Thus we have a recursive expression for the entropy:

    H(X) = H_2(f) + (1 − f) H(X).        (2.101)

Rearranging,

    H(X) = H_2(f)/f.        (2.102)

Solution to exercise 2.34 (p.38). The probability of the number of tails t is

    P(t) = (1/2)^t (1/2)    for t ≥ 0.        (2.103)

The expected number of heads is 1, by definition of the problem. The expected number of tails is

    E[t] = ∑_{t=0}^{∞} t (1/2)^t (1/2),        (2.104)

which may be shown to be 1 in a variety of ways. For example, since the situation after one tail is thrown is equivalent to the opening situation, we can write down the recurrence relation

    E[t] = (1/2)(1 + E[t]) + (1/2)(0)  ⇒  E[t] = 1.        (2.105)

The probability distribution of the ‘estimator’ f̂ = 1/(1 + t), given that f = 1/2, is plotted in figure 2.12. The probability of f̂ is simply the probability of the corresponding value of t.

[Figure 2.12. The probability distribution of the estimator f̂ = 1/(1 + t), given that f = 1/2.]

Solution to exercise 2.35 (p.38).

(a) The mean number of rolls from one six to the next six is six (assuming we start counting rolls after the first of the two sixes). The probability that the next six occurs on the rth roll is the probability of not getting a six for r − 1 rolls multiplied by the probability of then getting a six:

    P(r_1 = r) = (5/6)^{r−1} (1/6),    for r ∈ {1, 2, 3, . . .}.        (2.106)

This probability distribution of the number of rolls, r, may be called an exponential distribution, since

    P(r_1 = r) = e^{−αr}/Z,        (2.107)

where α = ln(6/5), and Z is a normalizing constant.

(b) The mean number of rolls from the clock until the next six is six.

(c) The mean number of rolls, going back in time, until the most recent six is six.
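The closed-form answers above can be checked numerically. The sketch below (plain Python, standard library only; the helper name `H2` and the truncation limits are my own choices, not from the text) verifies the geometric-source entropy H(X) = H_2(f)/f of (2.102), the expectation E[t] = 1 of (2.105), and the mean of six rolls for the distribution (2.106), each by summing a truncated series.

```python
import math

def H2(p):
    """Binary entropy H_2(p) in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Exercise 2.29: entropy of the geometric distribution P(x) = (1-f)^(x-1) f,
# computed directly as -sum_x P(x) log2 P(x) and compared with H_2(f)/f.
f = 0.3
H_direct = 0.0
for x in range(1, 1000):       # terms beyond x ~ 1000 are negligible here
    p = (1 - f) ** (x - 1) * f
    H_direct += -p * math.log2(p)
assert abs(H_direct - H2(f) / f) < 1e-9

# Exercise 2.34: E[t] = sum_t t (1/2)^t (1/2) = 1.
E_t = sum(t * 0.5 ** t * 0.5 for t in range(1000))
assert abs(E_t - 1.0) < 1e-9

# Exercise 2.35(a): the mean of P(r) = (5/6)^(r-1) (1/6) is six rolls.
E_r = sum(r * (5 / 6) ** (r - 1) * (1 / 6) for r in range(1, 1000))
assert abs(E_r - 6.0) < 1e-6

print("all checks pass")
```

The truncation points are safe because each series decays geometrically, so the neglected tails are far below the assertion tolerances.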
