9.9: Solutions

Figure 9.10. (a) The extended channel (N = 2) obtained from a binary erasure channel with erasure probability 0.15. (b) A block code consisting of the two codewords 00 and 11. (c) The optimal decoder for this code.

Solution to exercise 9.14 (p.153). The extended channel is shown in figure 9.10. The best code for this channel with N = 2 is obtained by choosing two columns that have minimal overlap, for example, columns 00 and 11. The decoding algorithm returns '00' if the extended channel output is among the top four, '11' if it is among the bottom four, and gives up if the output is '??'.

Solution to exercise 9.15 (p.155). In example 9.11 (p.151) we showed that the mutual information between input and output of the Z channel is

\[
I(X;Y) = H(Y) - H(Y \mid X) = H_2(p_1(1-f)) - p_1 H_2(f). \tag{9.47}
\]

We differentiate this expression with respect to p_1, taking care not to confuse log_2 with log_e:

\[
\frac{d}{dp_1} I(X;Y) = (1-f)\,\log_2 \frac{1 - p_1(1-f)}{p_1(1-f)} - H_2(f). \tag{9.48}
\]

Setting this derivative to zero and rearranging using skills developed in exercise 2.17 (p.36), we obtain

\[
p_1^*(1-f) = \frac{1}{1 + 2^{H_2(f)/(1-f)}}, \tag{9.49}
\]

so the optimal input distribution is

\[
p_1^* = \frac{1/(1-f)}{1 + 2^{H_2(f)/(1-f)}}. \tag{9.50}
\]

As the noise level f tends to 1, this expression tends to 1/e (as you can prove using L'Hôpital's rule).

For all values of f, p_1^* is smaller than 1/2. A rough intuition for why input 1 is used less than input 0 is that when input 1 is used, the noisy channel injects entropy into the received string; whereas when input 0 is used, the noise has zero entropy.

Solution to exercise 9.16 (p.155). The capacities of the three channels are shown in figure 9.11. For any f < 0.5, the BEC is the channel with the highest capacity and the BSC the lowest.

Figure 9.11. Capacities of the Z channel, binary symmetric channel, and binary erasure channel.

Solution to exercise 9.18 (p.155). The logarithm of the posterior probability ratio, given y, is

\[
a(y) = \ln \frac{P(x=1 \mid y, \alpha, \sigma)}{P(x=-1 \mid y, \alpha, \sigma)} = \ln \frac{Q(y \mid x=1, \alpha, \sigma)}{Q(y \mid x=-1, \alpha, \sigma)} = \frac{2\alpha y}{\sigma^2}. \tag{9.51}
\]
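The decoder described in the solution to exercise 9.14 can be written out explicitly. The short Python sketch below is not from the book; the function name is mine, and outputs such as '01' that cannot occur with the code {00, 11} are treated as "give up", which does not affect optimality since they have zero probability under both codewords.

```python
def decode_n2(received):
    """Sketch of the optimal decoder of figure 9.10(c) for the code {00, 11}
    over the N = 2 extended binary erasure channel.
    `received` is a 2-character string over {'0', '1', '?'};
    returns '00', '11', or None (give up)."""
    if '0' in received and '1' not in received:
        return '00'   # only codeword 00 is consistent with the output
    if '1' in received and '0' not in received:
        return '11'   # only codeword 11 is consistent with the output
    return None       # '??' (or an output impossible under this code): give up

for r in ['00', '0?', '?0', '11', '1?', '?1', '??']:
    print(r, decode_n2(r))
```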
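The closed-form optimum (9.49)-(9.50) and the capacity ordering of figure 9.11 can also be checked numerically. The following Python sketch is not from the book; the function names (H2, I_Z, p1_star, C_Z, C_BSC, C_BEC) are mine. It maximizes the mutual information (9.47) of the Z channel over p_1 by grid search, compares the maximizer with (9.50), checks the f -> 1 limit of 1/e, and evaluates the three capacities at a few noise levels.

```python
import numpy as np

def H2(p):
    """Binary entropy in bits, with the convention H2(0) = H2(1) = 0."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def I_Z(p1, f):
    """Mutual information (9.47) of the Z channel with P(x = 1) = p1."""
    return H2(p1 * (1 - f)) - p1 * H2(f)

def p1_star(f):
    """Optimal input distribution (9.50)."""
    return (1.0 / (1 - f)) / (1 + 2.0 ** (H2(f) / (1 - f)))

# Check (9.50) against a brute-force maximization of (9.47) at f = 0.15.
f = 0.15
grid = np.linspace(1e-6, 1 - 1e-6, 200001)
print(grid[np.argmax(I_Z(grid, f))], p1_star(f))   # the two should agree closely

# As f -> 1, the optimal input probability tends to 1/e.
print(p1_star(0.999), 1 / np.e)

# Capacities of the three channels of figure 9.11.
def C_Z(f):   return I_Z(p1_star(f), f)
def C_BSC(f): return 1 - H2(f)
def C_BEC(f): return 1 - f

for f in (0.1, 0.25, 0.4):
    # For f < 0.5 the ordering is C_BEC > C_Z > C_BSC.
    print(f, C_BEC(f), C_Z(f), C_BSC(f))
```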
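Finally, the reduction in (9.51) can be confirmed by substituting Gaussian likelihoods with means ±α and standard deviation σ into the ratio directly (equiprobable inputs are assumed, which is what makes the posterior ratio equal the likelihood ratio). This is again only a sketch: the helper names and the test values are mine.

```python
import numpy as np

def gauss(y, mean, sigma):
    """Gaussian density N(y; mean, sigma^2)."""
    return np.exp(-(y - mean) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

def a_direct(y, alpha, sigma):
    """Log likelihood ratio ln Q(y | x=1) / Q(y | x=-1) for means +/- alpha."""
    return np.log(gauss(y, alpha, sigma) / gauss(y, -alpha, sigma))

y, alpha, sigma = 0.7, 1.0, 0.5                                # arbitrary test values
print(a_direct(y, alpha, sigma), 2 * alpha * y / sigma ** 2)   # both equal 5.6
```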
