1 — Introduction to Information Theory

Algorithm 1.9. Majority-vote decoding algorithm for R_3. Also shown are the likelihood ratios (1.23), assuming the channel is a binary symmetric channel; γ ≡ (1 − f)/f.

  Received sequence r   Likelihood ratio P(r | s=1)/P(r | s=0)   Decoded sequence ŝ
  000                   γ^{−3}                                   0
  001                   γ^{−1}                                   0
  010                   γ^{−1}                                   0
  100                   γ^{−1}                                   0
  101                   γ^{1}                                    1
  110                   γ^{1}                                    1
  011                   γ^{1}                                    1
  111                   γ^{3}                                    1

At the risk of explaining the obvious, let's prove this result. The optimal decoding decision (optimal in the sense of having the smallest probability of being wrong) is to find which value of s is most probable, given r. Consider the decoding of a single bit s, which was encoded as t(s) and gave rise to three received bits r = r_1 r_2 r_3. By Bayes' theorem, the posterior probability of s is

  P(s | r_1 r_2 r_3) = \frac{P(r_1 r_2 r_3 | s) P(s)}{P(r_1 r_2 r_3)}.    (1.18)

We can spell out the posterior probability of the two alternatives thus:

  P(s = 1 | r_1 r_2 r_3) = \frac{P(r_1 r_2 r_3 | s = 1) P(s = 1)}{P(r_1 r_2 r_3)};    (1.19)

  P(s = 0 | r_1 r_2 r_3) = \frac{P(r_1 r_2 r_3 | s = 0) P(s = 0)}{P(r_1 r_2 r_3)}.    (1.20)

This posterior probability is determined by two factors: the prior probability P(s), and the data-dependent term P(r_1 r_2 r_3 | s), which is called the likelihood of s. The normalizing constant P(r_1 r_2 r_3) needn't be computed when finding the optimal decoding decision, which is to guess ŝ = 0 if P(s = 0 | r) > P(s = 1 | r), and ŝ = 1 otherwise.

To find P(s = 0 | r) and P(s = 1 | r), we must make an assumption about the prior probabilities of the two hypotheses s = 0 and s = 1, and we must make an assumption about the probability of r given s. We assume that the prior probabilities are equal: P(s = 0) = P(s = 1) = 0.5; then maximizing the posterior probability P(s | r) is equivalent to maximizing the likelihood P(r | s). And we assume that the channel is a binary symmetric channel with noise level f < 0.5, so that the likelihood is

  P(r | s) = P(r | t(s)) = \prod_{n=1}^{N} P(r_n | t_n(s)),    (1.21)

where N = 3 is the number of transmitted bits in the block we are considering, and

  P(r_n | t_n) = \begin{cases} 1 - f & \text{if } r_n = t_n \\ f & \text{if } r_n \neq t_n. \end{cases}    (1.22)

Thus the likelihood ratio for the two hypotheses is

  \frac{P(r | s = 1)}{P(r | s = 0)} = \prod_{n=1}^{N} \frac{P(r_n | t_n(1))}{P(r_n | t_n(0))};    (1.23)

each factor P(r_n | t_n(1))/P(r_n | t_n(0)) equals (1 − f)/f if r_n = 1 and f/(1 − f) if r_n = 0. The ratio γ ≡ (1 − f)/f is greater than 1, since f < 0.5, so the winning hypothesis is the one with the most ‘votes’, each vote counting for a factor of γ in the likelihood ratio.
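The table above can be checked numerically. The following sketch is not from the book: the noise level f = 0.1 and the helper name `likelihood` are illustrative choices. It enumerates the eight possible received triples, evaluates the likelihood ratio (1.23) for a binary symmetric channel, and confirms that thresholding it at 1 gives the same decisions as the majority vote.

```python
# Minimal sketch (not from the text): likelihood-ratio decoding of the
# repetition code R3 over a binary symmetric channel. The noise level
# f = 0.1 is an arbitrary illustrative value; any f < 0.5 behaves the same.

from itertools import product

f = 0.1
gamma = (1 - f) / f  # γ ≡ (1 − f)/f, greater than 1 because f < 0.5


def likelihood(r, s, f):
    """P(r | s) for R3: the transmitted block is t(s) = (s, s, s), and each
    bit is flipped independently with probability f (eqs. 1.21-1.22)."""
    p = 1.0
    for r_n, t_n in zip(r, (s, s, s)):
        p *= (1 - f) if r_n == t_n else f
    return p


for r in product((0, 1), repeat=3):
    ratio = likelihood(r, 1, f) / likelihood(r, 0, f)  # equation (1.23)
    s_hat = 1 if ratio > 1 else 0       # optimal decision under equal priors
    majority = 1 if sum(r) >= 2 else 0  # majority-vote decision
    k = 2 * sum(r) - 3                  # ratio should equal gamma**k
    print(f"r={''.join(map(str, r))}  ratio=γ^{k:+d}={ratio:7.4f}  "
          f"ŝ={s_hat}  majority={majority}")
```

With this choice of f the ratio comes out as γ^k with γ = 9, where k is the number of received 1s minus the number of 0s, reproducing the γ^{−3}, γ^{−1}, γ^{1}, γ^{3} pattern of algorithm 1.9; ŝ agrees with the majority vote in every row.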
