Information Theory, Inference, and Learning ... - Inference Group

Copyright Cambridge University Press 2003. On-screen viewing permitted. Printing not permitted. http://www.cambridge.org/0521642981
You can buy this book for 30 pounds or $50. See http://www.inference.phy.cam.ac.uk/mackay/itila/ for links.

28 Model Comparison and Occam's Razor

meaningless. Only when one has no intention to do model comparison may it be safe to use improper priors, and even in such cases there are pitfalls, as Dawid et al. explain. I would agree with their advice to always use proper priors, tempered by an encouragement to be smart when making calculations, recognizing opportunities for approximation.

28.4 Exercises

Exercise 28.1. [3] Random variables x come independently from a probability distribution P(x). According to model H_0, P(x) is a uniform distribution

\[ P(x \mid H_0) = \frac{1}{2}, \qquad x \in (-1, 1). \tag{28.20} \]

According to model H_1, P(x) is a nonuniform distribution with an unknown parameter m ∈ (−1, 1):

\[ P(x \mid m, H_1) = \frac{1}{2}(1 + m x), \qquad x \in (-1, 1). \tag{28.21} \]

[Figure: plots of P(x | H_0) and P(x | m = −0.4, H_1) for x from −1 to 1.]

Given the data D = {0.3, 0.5, 0.7, 0.8, 0.9}, what is the evidence for H_0 and H_1?

Exercise 28.2. [3] Datapoints (x, t) are believed to come from a straight line. The experimenter chooses x, and t is Gaussian-distributed about

\[ y = w_0 + w_1 x \tag{28.22} \]

with variance \(\sigma_\nu^2\). According to model H_1, the straight line is horizontal, so w_1 = 0. According to model H_2, w_1 is a parameter with prior distribution Normal(0, 1). Both models assign a prior distribution Normal(0, 1) to w_0. Given the data set D = {(−8, 8), (−2, 10), (6, 11)}, and assuming the noise level is \(\sigma_\nu = 1\), what is the evidence for each model?

[Figure: the line y = w_0 + w_1 x plotted against x.]

Exercise 28.3. [3] A six-sided die is rolled 30 times and the numbers of times each face came up were F = {3, 3, 2, 2, 9, 11}. What is the probability that the die is a perfectly fair die ('H_0'), assuming the alternative hypothesis H_1 says that the die has a biased distribution p, and the prior density for p is uniform over the simplex \(p_i \ge 0\), \(\sum_i p_i = 1\)?

Solve this problem two ways: exactly, using the helpful Dirichlet formulae (23.30, 23.31), and approximately, using Laplace's method. Notice that your choice of basis for the Laplace approximation is important. See MacKay (1998a) for discussion of this exercise.

Exercise 28.4. [3] The influence of race on the imposition of the death penalty for murder in America has been much studied. The following three-way table classifies 326 cases in which the defendant was convicted of murder. The three variables are the defendant's race, the victim's race, and whether the defendant was sentenced to death. (Data from M. Radelet, 'Racial characteristics and imposition of the death penalty,' American Sociological Review, 46 (1981), pp. 918-927.)

                  White defendant        Black defendant
                  Death penalty          Death penalty
                  Yes      No            Yes      No
  White victim     19     132             11      52
  Black victim      0       9              6      97
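
For Exercise 28.3, the exact (Dirichlet) route can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a worked solution from the book. It assumes that the cited Dirichlet formulae reduce to the standard normalizing constant Z(u) = ∏ Γ(u_i) / Γ(∑ u_i), that the evidence is evaluated for the observed sequence of rolls (the multinomial coefficient is common to both hypotheses and cancels in the ratio), and that H_0 and H_1 have equal prior probability, which the exercise does not state. The function names are illustrative only.

```python
from math import exp, lgamma, log


def log_evidence_fair(counts):
    """log P(sequence | H0): every roll has probability 1/K."""
    n, k = sum(counts), len(counts)
    return -n * log(k)


def log_evidence_uniform_dirichlet(counts):
    """log P(sequence | H1) with a uniform prior over the simplex.

    Integrating prod_i p_i**F_i against the uniform density Gamma(K)
    gives Gamma(K) * prod_i Gamma(F_i + 1) / Gamma(N + K).
    """
    n, k = sum(counts), len(counts)
    return lgamma(k) + sum(lgamma(f + 1) for f in counts) - lgamma(n + k)


def posterior_fair(counts, prior_h0=0.5):
    """P(H0 | counts); prior_h0 = 0.5 is an assumption, not from the text."""
    log_odds = (
        log_evidence_fair(counts)
        - log_evidence_uniform_dirichlet(counts)
        + log(prior_h0)
        - log(1.0 - prior_h0)
    )
    return 1.0 / (1.0 + exp(-log_odds))


counts = [3, 3, 2, 2, 9, 11]  # F from Exercise 28.3
p_h0 = posterior_fair(counts)
print(f"P(H0 | F) = {p_h0:.3f}")
```

Working in log space with `lgamma` avoids overflow in the factorials. Under the equal-prior assumption the posterior probability of the fair die comes out at roughly 0.16; a different prior on H_0 can be passed as `prior_h0`.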
