Information Theory, Inference, and Learning ... - Inference Group

25 — Exact Marginalization in Trellises

  t         Likelihood   Posterior probability
  0000000   0.0275562    0.25
  0001011   0.0001458    0.0013
  0010111   0.0013122    0.012
  0011100   0.0030618    0.027
  0100110   0.0002268    0.0020
  0101101   0.0000972    0.0009
  0110001   0.0708588    0.63
  0111010   0.0020412    0.018
  1000101   0.0001458    0.0013
  1001110   0.0000042    0.0000
  1010010   0.0030618    0.027
  1011001   0.0013122    0.012
  1100011   0.0000972    0.0009
  1101000   0.0002268    0.0020
  1110100   0.0020412    0.018
  1111111   0.0000108    0.0001

Figure 25.2. Posterior probabilities over the sixteen codewords when the received vector y has normalized likelihoods (0.1, 0.4, 0.9, 0.1, 0.1, 0.1, 0.3).

Let the normalized likelihoods be: (0.1, 0.4, 0.9, 0.1, 0.1, 0.1, 0.3). That is, the ratios of the likelihoods are

\[
\frac{P(y_1 \mid x_1 = 1)}{P(y_1 \mid x_1 = 0)} = \frac{0.1}{0.9}, \qquad
\frac{P(y_2 \mid x_2 = 1)}{P(y_2 \mid x_2 = 0)} = \frac{0.4}{0.6}, \quad \text{etc.}
\tag{25.12}
\]

How should this received signal be decoded?

1. If we threshold the likelihoods at 0.5 to turn the signal into a binary received vector, we have r = (0, 0, 1, 0, 0, 0, 0), which decodes, using the decoder for the binary symmetric channel (Chapter 1), into t̂ = (0, 0, 0, 0, 0, 0, 0).

   This is not the optimal decoding procedure. Optimal inferences are always obtained by using Bayes' theorem.

2. We can find the posterior probability over codewords by explicit enumeration of all sixteen codewords; a short sketch of this computation is given below. This posterior distribution is shown in figure 25.2. Of course, we aren't really interested in such brute-force solutions, and the aim of this chapter is to understand algorithms for getting the same information out in less than 2^K computer time.

   Examining the posterior probabilities, we notice that the most probable codeword is actually the string t = 0110001. This is more than twice as probable as the answer found by thresholding, 0000000.

   Using the posterior probabilities shown in figure 25.2, we can also compute the posterior marginal distributions of each of the bits. The result is shown in figure 25.3. Notice that bits 1, 4, 5 and 6 are all quite confidently inferred to be zero. The strengths of the posterior probabilities for bits 2, 3, and 7 are not so great. ✷

In the above example, the MAP codeword is in agreement with the bitwise decoding that is obtained by selecting the most probable state for each bit using the posterior marginal distributions. But this is not always the case, as the following exercise shows.
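The brute-force computation of item 2 is small enough to write out directly. Below is a minimal Python sketch (the function names are illustrative, and the parity assignments t5 = s1+s2+s3, t6 = s2+s3+s4, t7 = s1+s3+s4 are an assumed encoding consistent with the sixteen codewords listed in figure 25.2): it enumerates the 2^K = 16 codewords, multiplies the per-bit likelihoods, normalizes to obtain the codeword posterior, and then sums that posterior to obtain the per-bit marginals discussed alongside figure 25.3.

```python
from itertools import product

# Normalized likelihoods P(y_n | t_n = 1) for the seven received symbols;
# by normalization, P(y_n | t_n = 0) = 1 - l_n.
likelihood_of_one = (0.1, 0.4, 0.9, 0.1, 0.1, 0.1, 0.3)

def hamming74_codeword(s):
    """Encode a 4-bit source block s = (s1, s2, s3, s4) with a (7,4) Hamming
    code, using the (assumed) parity assignments t5 = s1+s2+s3, t6 = s2+s3+s4,
    t7 = s1+s3+s4 (mod 2), which generate the codewords of figure 25.2."""
    s1, s2, s3, s4 = s
    return (s1, s2, s3, s4,
            (s1 + s2 + s3) % 2,
            (s2 + s3 + s4) % 2,
            (s1 + s3 + s4) % 2)

def likelihood(t):
    """P(y | t), up to a constant factor: a product of per-bit likelihoods."""
    p = 1.0
    for t_n, l_n in zip(t, likelihood_of_one):
        p *= l_n if t_n else 1.0 - l_n
    return p

# Brute-force enumeration over all 2^K = 16 codewords.
codewords = [hamming74_codeword(s) for s in product((0, 1), repeat=4)]
likelihoods = [likelihood(t) for t in codewords]
Z = sum(likelihoods)                       # normalizing constant P(y)
posteriors = [L / Z for L in likelihoods]  # P(t | y), as in figure 25.2

print("t        likelihood  posterior")
for t, L, P in zip(codewords, likelihoods, posteriors):
    print("".join(map(str, t)), f"{L:.7f}", f"{P:.4f}")

# Posterior marginal P(t_n = 1 | y) for each bit: sum the codeword
# posteriors over all codewords whose n-th bit is 1 (cf. figure 25.3).
for n in range(7):
    p1 = sum(P for t, P in zip(codewords, posteriors) if t[n] == 1)
    print(f"bit {n + 1}: P(t_{n + 1} = 1 | y) = {p1:.3f}")
```

Run as written, this reproduces the likelihood and posterior columns of figure 25.2 (for example, likelihood 0.0275562 and posterior 0.25 for 0000000, and posterior 0.63 for the MAP codeword 0110001), and the marginal loop makes explicit which bits are confidently inferred to be zero.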
