8.4: Solutions

Solution to exercise 8.3 (p.140). The chain rule for entropy follows from the decomposition of a joint probability:

\begin{align}
H(X, Y) &= \sum_{xy} P(x, y) \log \frac{1}{P(x, y)} \tag{8.20} \\
        &= \sum_{xy} P(x) P(y \mid x) \left[ \log \frac{1}{P(x)} + \log \frac{1}{P(y \mid x)} \right] \tag{8.21} \\
        &= \sum_{x} P(x) \log \frac{1}{P(x)} + \sum_{x} P(x) \sum_{y} P(y \mid x) \log \frac{1}{P(y \mid x)} \tag{8.22} \\
        &= H(X) + H(Y \mid X). \tag{8.23}
\end{align}

Solution to exercise 8.4 (p.140). Symmetry of mutual information:

\begin{align}
I(X; Y) &= H(X) - H(X \mid Y) \tag{8.24} \\
        &= \sum_{x} P(x) \log \frac{1}{P(x)} - \sum_{xy} P(x, y) \log \frac{1}{P(x \mid y)} \tag{8.25} \\
        &= \sum_{xy} P(x, y) \log \frac{P(x \mid y)}{P(x)} \tag{8.26} \\
        &= \sum_{xy} P(x, y) \log \frac{P(x, y)}{P(x) P(y)}. \tag{8.27}
\end{align}

This expression is symmetric in x and y, so

\begin{equation}
I(X; Y) = H(X) - H(X \mid Y) = H(Y) - H(Y \mid X). \tag{8.28}
\end{equation}

We can prove that mutual information is positive in two ways. One is to continue from

\begin{equation}
I(X; Y) = \sum_{x, y} P(x, y) \log \frac{P(x, y)}{P(x) P(y)}, \tag{8.29}
\end{equation}

which is a relative entropy, and use Gibbs' inequality (proved on p.44), which asserts that this relative entropy is ≥ 0, with equality only if P(x, y) = P(x)P(y), that is, if X and Y are independent.

The other is to use Jensen's inequality on

\begin{equation}
- \sum_{x, y} P(x, y) \log \frac{P(x) P(y)}{P(x, y)} \geq - \log \sum_{x, y} P(x, y) \frac{P(x) P(y)}{P(x, y)} = - \log 1 = 0. \tag{8.30}
\end{equation}

Solution to exercise 8.7 (p.141). z = x + y mod 2.

(a) If q = 1/2, P_Z = {1/2, 1/2} and I(Z; X) = H(Z) - H(Z | X) = 1 - 1 = 0.

(b) For general q and p, P_Z = {pq + (1-p)(1-q), p(1-q) + q(1-p)}. The mutual information is I(Z; X) = H(Z) - H(Z | X) = H_2(pq + (1-p)(1-q)) - H_2(q).

Three-term entropies

Solution to exercise 8.8 (p.141). The depiction of entropies in terms of Venn diagrams is misleading for at least two reasons.

First, one is used to thinking of Venn diagrams as depicting sets; but what are the 'sets' H(X) and H(Y) depicted in figure 8.2, and what are the objects that are members of those sets? I think this diagram encourages the novice student to make inappropriate analogies. For example, some students imagine
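The identities in exercises 8.3 and 8.4 involve only finite sums, so they can be checked numerically. The following Python sketch is not from the book: the joint distribution P is an arbitrary made-up example, and all entropies are measured in bits.

```python
import numpy as np

def H(p):
    """Entropy in bits of a probability vector (zero entries contribute nothing)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# An arbitrary (made-up) joint distribution P(x, y); rows index x, columns index y.
P = np.array([[0.30, 0.10],
              [0.15, 0.45]])
Px, Py = P.sum(axis=1), P.sum(axis=0)      # marginals P(x), P(y)

# Conditional entropies computed from their definitions, e.g.
# H(Y | X) = sum_x P(x) H(Y | X = x).
H_Y_given_X = sum(Px[i] * H(P[i, :] / Px[i]) for i in range(len(Px)))
H_X_given_Y = sum(Py[j] * H(P[:, j] / Py[j]) for j in range(len(Py)))

# Chain rule (8.20-8.23): H(X, Y) = H(X) + H(Y | X).
assert np.isclose(H(P), H(Px) + H_Y_given_X)

# Mutual information and its symmetry (8.24-8.28).
I_XY = H(Px) - H_X_given_Y
I_YX = H(Py) - H_Y_given_X
assert np.isclose(I_XY, I_YX)

# Gibbs' inequality (8.29): I(X; Y) equals the relative entropy between
# P(x, y) and P(x)P(y), hence it is >= 0.
D = float(np.sum(P * np.log2(P / np.outer(Px, Py))))
assert np.isclose(I_XY, D) and D >= -1e-12
```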

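The closed form in exercise 8.7(b) can be checked the same way. This sketch (again not part of the book) compares H_2(pq + (1-p)(1-q)) - H_2(q) with I(Z; X) computed directly from the joint distribution of X and Z; the values p = 0.3 and q = 0.2 are arbitrary.

```python
import numpy as np

def H(p):
    """Entropy in bits of a probability vector (zero entries contribute nothing)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def H2(q):
    """Binary entropy function H_2(q)."""
    return H([q, 1.0 - q])

p, q = 0.3, 0.2   # arbitrary example values: P(x = 1) = p, P(y = 1) = q
# Joint distribution of (X, Z) with Z = X + Y mod 2; rows index x, columns index z.
PXZ = np.array([[(1 - p) * (1 - q), (1 - p) * q],
                [p * q,             p * (1 - q)]])
Pz = PXZ.sum(axis=0)
Px = PXZ.sum(axis=1)

I_direct  = H(Pz) + H(Px) - H(PXZ)                    # I(Z; X) = H(Z) + H(X) - H(X, Z)
I_formula = H2(p * q + (1 - p) * (1 - q)) - H2(q)     # exercise 8.7(b)
assert np.isclose(I_direct, I_formula)

# Part (a): with q = 1/2 the formula gives H_2(1/2) - H_2(1/2) = 0,
# i.e. Z tells us nothing about X.
assert np.isclose(H2(0.5 * p + 0.5 * (1 - p)) - H2(0.5), 0.0)
```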