statistics, the Bayesian's best guess for $\sigma$ sets $\chi^2$ (the measure of deviance defined by $\chi^2 \equiv \sum_n (x_n - \hat\mu)^2 / \hat\sigma^2$) equal to the number of degrees of freedom, $N - 1$.

Figure 24.1d shows the posterior probability of $\sigma$, which is proportional to the marginal likelihood. This may be contrasted with the posterior probability of $\sigma$ with $\mu$ fixed to its most probable value, $\bar x = 1$, which is shown in figure 24.1c and d.

The final inference we might wish to make is 'given the data, what is $\mu$?'

⊲ Exercise 24.2.[3] Marginalize over $\sigma$ and obtain the posterior marginal distribution of $\mu$, which is a Student-t distribution:

    $P(\mu \mid D) \propto 1 \big/ \left( N(\mu - \bar x)^2 + S \right)^{N/2}.$    (24.15)

  (A sketch of this marginalization, for checking one's answer, is given at the end of the section.)

Further reading

A bible of exact marginalization is Bretthorst's (1988) book on Bayesian spectrum analysis and parameter estimation.

24.2 Exercises

⊲ Exercise 24.3.[3] [This exercise requires macho integration capabilities.] Give a Bayesian solution to exercise 22.15 (p.309), where seven scientists of varying capabilities have measured $\mu$ with personal noise levels $\sigma_n$, and we are interested in inferring $\mu$. Let the prior on each $\sigma_n$ be a broad prior, for example a gamma distribution with parameters $(s, c) = (10, 0.1)$. Find the posterior distribution of $\mu$. Plot it, and explore its properties for a variety of data sets such as the one given, and the data set $\{x_n\} = \{13.01, 7.39\}$. (A numerical sketch of this computation is given at the end of the section.)

  [Hint: first find the posterior distribution of $\sigma_n$ given $\mu$ and $x_n$, $P(\sigma_n \mid x_n, \mu)$. Note that the normalizing constant for this inference is $P(x_n \mid \mu)$. Marginalize over $\sigma_n$ to find this normalizing constant, then use Bayes' theorem a second time to find $P(\mu \mid \{x_n\})$.]

[Margin figure: the seven scientists' measurements, with data points labelled A, B, C and D-G on an axis running from -30 to 20.]

24.3 Solutions

Solution to exercise 24.1 (p.321). 1. The data points are distributed with mean squared deviation $\sigma^2$ about the true mean. 2. The sample mean is unlikely to exactly equal the true mean. 3. The sample mean is the value of $\mu$ that minimizes the sum-squared deviation of the data points from $\mu$. Any other value of $\mu$ (in particular, the true value of $\mu$) will have a larger value of the sum-squared deviation than $\mu = \bar x$ does.

So the expected mean squared deviation from the sample mean is necessarily smaller than the mean squared deviation $\sigma^2$ about the true mean. (A quick Monte Carlo check of this claim is given at the end of the section.)
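For the reader who wants to check their answer to exercise 24.2, here is a sketch of the marginalization, assuming (consistently with the exponent in (24.15)) a flat prior on $\mu$ and the noninformative prior $P(\sigma) \propto 1/\sigma$. Writing $S \equiv \sum_n (x_n - \bar x)^2$, so that $\sum_n (x_n - \mu)^2 = S + N(\mu - \bar x)^2$, we have

    $P(\mu \mid D) \propto \int_0^\infty \sigma^{-N} \exp\!\left( - \frac{S + N(\mu - \bar x)^2}{2\sigma^2} \right) \frac{1}{\sigma} \, d\sigma .$

The substitution $t = \left( S + N(\mu - \bar x)^2 \right) / (2\sigma^2)$ reduces this to a gamma integral,

    $P(\mu \mid D) \propto \left( S + N(\mu - \bar x)^2 \right)^{-N/2} \int_0^\infty t^{N/2 - 1} e^{-t} \, dt \propto \left( S + N(\mu - \bar x)^2 \right)^{-N/2},$

which is equation (24.15).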

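Exercise 24.3 can also be explored numerically before attempting the macho integration. The following is a minimal sketch, not a full solution: it integrates out each $\sigma_n$ on a grid, assumes $(s, c) = (10, 0.1)$ denote the scale and shape of the gamma prior (so scipy's gamma(a=c, scale=s) matches; the parameterization is an assumption), and uses the two-point data set quoted in the exercise. The seven scientists' measurements from exercise 22.15 can be substituted for x.

    import numpy as np
    from scipy.stats import gamma, norm

    # Data set quoted in the exercise; substitute the seven scientists'
    # measurements from exercise 22.15 to explore that case.
    x = np.array([13.01, 7.39])

    # Broad gamma prior on each sigma_n; (s, c) = (10, 0.1) is read here
    # as scale s and shape c -- an assumption about the parameterization.
    s, c = 10.0, 0.1
    sig = np.logspace(-3, 3, 2000)          # grid for integrating out sigma_n
    prior = gamma.pdf(sig, a=c, scale=s)

    mus = np.linspace(-20, 40, 601)
    log_post = np.empty_like(mus)
    for i, mu in enumerate(mus):
        # P(x_n | mu) = integral of N(x_n; mu, sigma^2) P(sigma) d sigma
        lik = norm.pdf(x[:, None], loc=mu, scale=sig[None, :])
        p_xn = np.trapz(lik * prior, sig, axis=1)
        log_post[i] = np.log(p_xn).sum()    # flat prior on mu

    post = np.exp(log_post - log_post.max())
    post /= np.trapz(post, mus)             # normalized P(mu | {x_n})

Plotting post against mus for the two data sets shows how the broad priors on the $\sigma_n$ allow the posterior to discount measurements that disagree with the rest.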

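The inequality in the solution to exercise 24.1 is easy to confirm empirically: the expected mean squared deviation about the sample mean is $\frac{N-1}{N}\sigma^2$, smaller than the $\sigma^2$ obtained about the true mean. A quick Monte Carlo check:

    import numpy as np

    rng = np.random.default_rng(0)
    N, trials, sigma = 5, 100_000, 1.0
    xs = rng.normal(0.0, sigma, size=(trials, N))   # true mean is 0

    msd_true = np.mean((xs - 0.0) ** 2)             # about the true mean
    msd_sample = np.mean((xs - xs.mean(axis=1, keepdims=True)) ** 2)

    print(msd_true)      # ~ sigma^2 = 1.0
    print(msd_sample)    # ~ (N-1)/N * sigma^2 = 0.8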
