
9 — Communication over a Noisy Channel

What is H(Y|X)? It is defined to be the weighted sum over x of H(Y|x); but H(Y|x) is the same for each value of x: H(Y|x = 0) is H2(0.15), and H(Y|x = 1) is H2(0.15). So

I(X; Y) = H(Y) − H(Y|X)
        = H2(0.22) − H2(0.15)
        = 0.76 − 0.61 = 0.15 bits.   (9.8)

This may be contrasted with the entropy of the source H(X) = H2(0.1) = 0.47 bits.

Note: here we have used the binary entropy function H2(p) ≡ H(p, 1−p) = p log(1/p) + (1−p) log(1/(1−p)). Throughout this book, log means log2.
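For readers who want to check these numbers, here is a minimal Python sketch (the function name H2 and the variable names are my own choices for illustration, not the book's) that evaluates the binary entropy function and reproduces the calculation (9.8) for the binary symmetric channel with f = 0.15 and PX = {p0 = 0.9, p1 = 0.1}:

```python
import math

def H2(p):
    """Binary entropy in bits: H2(p) = p log(1/p) + (1-p) log(1/(1-p)), log base 2."""
    if p in (0.0, 1.0):
        return 0.0
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))

# Binary symmetric channel with flip probability f = 0.15 and input
# distribution PX = {p0 = 0.9, p1 = 0.1}, as in the text above.
f, p1 = 0.15, 0.1
p_y1 = p1 * (1 - f) + (1 - p1) * f   # P(y = 1) = 0.22
I = H2(p_y1) - H2(f)                 # I(X;Y) = H(Y) - H(Y|X)
print(round(I, 2))                   # 0.15 bits, as in (9.8)
```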

Example 9.6. And now the Z channel, with PX as above. P(y = 1) = 0.085.

I(X; Y) = H(Y) − H(Y|X)
        = H2(0.085) − [0.9 H2(0) + 0.1 H2(0.15)]
        = 0.42 − (0.1 × 0.61) = 0.36 bits.   (9.9)

The entropy of the source, as above, is H(X) = 0.47 bits. Notice that the mutual information I(X; Y) for the Z channel is bigger than the mutual information for the binary symmetric channel with the same f. The Z channel is a more reliable channel.
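The same numeric check works for the Z channel, reusing the H2 sketch above (again only an illustrative sketch):

```python
# Z channel with f = 0.15: a transmitted 0 is always received as 0,
# a transmitted 1 is received as 0 with probability f.  Same PX as above.
f, p1 = 0.15, 0.1
p_y1 = p1 * (1 - f)                                 # P(y = 1) = 0.085
I_Z = H2(p_y1) - ((1 - p1) * H2(0.0) + p1 * H2(f))  # H(Y) - H(Y|X)
print(round(I_Z, 2))                                # 0.36 bits, as in (9.9)
```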

Exercise 9.7. [1, p.157] Compute the mutual information between X and Y for the binary symmetric channel with f = 0.15 when the input distribution is PX = {p0 = 0.5, p1 = 0.5}.

Exercise 9.8. [2, p.157] Compute the mutual information between X and Y for the Z channel with f = 0.15 when the input distribution is PX = {p0 = 0.5, p1 = 0.5}.
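A small general-purpose sketch may help in checking these exercises (the helper mutual_information and the matrices Q_bsc and Q_z are illustrative names, not the book's notation); it computes I(X; Y) = H(Y) − H(Y|X) directly from a channel's transition matrix and input distribution, reusing the math import from the sketch above:

```python
def mutual_information(Q, px):
    """I(X;Y) = H(Y) - H(Y|X) for a channel with transition matrix
    Q[y][x] = P(y | x) and input distribution px[x] = P(x)."""
    ny, nx = len(Q), len(px)
    py = [sum(Q[y][x] * px[x] for x in range(nx)) for y in range(ny)]
    HY = sum(p * math.log2(1 / p) for p in py if p > 0)
    HYgX = sum(px[x] * sum(Q[y][x] * math.log2(1 / Q[y][x])
                           for y in range(ny) if Q[y][x] > 0)
               for x in range(nx))
    return HY - HYgX

# Exercise 9.7: binary symmetric channel, f = 0.15, PX = {0.5, 0.5}.
Q_bsc = [[0.85, 0.15],   # row y = 0
         [0.15, 0.85]]   # row y = 1
print(mutual_information(Q_bsc, [0.5, 0.5]))

# Exercise 9.8: Z channel, f = 0.15, PX = {0.5, 0.5}.
Q_z = [[1.0, 0.15],
       [0.0, 0.85]]
print(mutual_information(Q_z, [0.5, 0.5]))
```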

Maximizing the mutual information<br />

We have observed in the above examples that the mutual information between the input and the output depends on the chosen input ensemble.

Let us assume that we wish to maximize the mutual information conveyed by the channel by choosing the best possible input ensemble. We define the capacity of the channel to be its maximum mutual information.

The capacity of a channel Q is:

C(Q) = max_{PX} I(X; Y).   (9.10)

The distribution PX that achieves the maximum is called the optimal input distribution, denoted by PX*. [There may be multiple optimal input distributions achieving the same value of I(X; Y).]

In Chapter 10 we will show that the capacity does indeed measure the maximum amount of error-free information that can be transmitted over the channel per unit time.
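The maximization in (9.10) can be illustrated with a brute-force sweep over binary input distributions, reusing the mutual_information helper and Q_bsc matrix from the sketch above (an illustrative approximation, not the book's analytical method):

```python
def capacity_binary_input(Q, steps=1000):
    """Crude approximation to (9.10): sweep PX = {1 - p1, p1} on a grid
    and return the largest I(X;Y) found, with the maximizing p1."""
    best_I, best_p1 = 0.0, 0.0
    for k in range(1, steps):
        p1 = k / steps
        I = mutual_information(Q, [1 - p1, p1])
        if I > best_I:
            best_I, best_p1 = I, p1
    return best_I, best_p1

# For the binary symmetric channel with f = 0.15 (Example 9.9 below),
# the sweep peaks at p1 = 0.5, as expected by symmetry.
print(capacity_binary_input(Q_bsc))
```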

Example 9.9. Consider the binary symmetric channel with f = 0.15. Above, we considered PX = {p0 = 0.9, p1 = 0.1}, and found I(X; Y) = 0.15 bits.
