27 — Laplace's Method

The fact that the normalizing constant of a Gaussian is given by

$$\int d^K x \, \exp\!\left[ -\tfrac{1}{2} x^{\mathsf T} A x \right] = \sqrt{\frac{(2\pi)^K}{\det A}} \tag{27.9}$$

can be proved by making an orthogonal transformation into the basis $u$ in which $A$ is transformed into a diagonal matrix. The integral then separates into a product of one-dimensional integrals, each of the form

$$\int du_i \, \exp\!\left[ -\tfrac{1}{2} \lambda_i u_i^2 \right] = \sqrt{\frac{2\pi}{\lambda_i}}. \tag{27.10}$$

The product of the eigenvalues $\lambda_i$ is the determinant of $A$.

The Laplace approximation is basis-dependent: if $x$ is transformed to a nonlinear function $u(x)$ and the density is transformed to $P(u) = P(x)\,|dx/du|$, then in general the approximate normalizing constants $Z_Q$ will be different. This can be viewed as a defect – since the true value $Z_P$ is basis-independent – or an opportunity – because we can hunt for a choice of basis in which the Laplace approximation is most accurate.

27.1 Exercises

Exercise 27.1.[2] (See also exercise 22.8 (p.307).) A photon counter is pointed at a remote star for one minute, in order to infer the rate of photons arriving at the counter per minute, $\lambda$. Assuming the number of photons collected $r$ has a Poisson distribution with mean $\lambda$,

$$P(r \mid \lambda) = \exp(-\lambda)\, \frac{\lambda^r}{r!}, \tag{27.11}$$

and assuming the improper prior $P(\lambda) = 1/\lambda$, make Laplace approximations to the posterior distribution

(a) over $\lambda$
(b) over $\log \lambda$. [Note the improper prior transforms to $P(\log \lambda) = \text{constant}$.]

⊲ Exercise 27.2.[2] Use Laplace's method to approximate the integral

$$Z(u_1, u_2) = \int_{-\infty}^{\infty} da \, f(a)^{u_1} (1 - f(a))^{u_2}, \tag{27.12}$$

where $f(a) = 1/(1 + e^{-a})$ and $u_1, u_2$ are positive. Check the accuracy of the approximation against the exact answer (23.29, p.316) for $(u_1, u_2) = (1/2, 1/2)$ and $(u_1, u_2) = (1, 1)$. Measure the error $(\log Z_P - \log Z_Q)$ in bits.

⊲ Exercise 27.3.[3] Linear regression. $N$ datapoints $\{(x^{(n)}, t^{(n)})\}$ are generated by the experimenter choosing each $x^{(n)}$, then the world delivering a noisy version of the linear function

$$y(x) = w_0 + w_1 x, \tag{27.13}$$

$$t^{(n)} \sim \text{Normal}(y(x^{(n)}), \sigma_\nu^2). \tag{27.14}$$

Assuming Gaussian priors on $w_0$ and $w_1$, make the Laplace approximation to the posterior distribution of $w_0$ and $w_1$ (which is exact, in fact) and obtain the predictive distribution for the next datapoint $t^{(N+1)}$, given $x^{(N+1)}$.

(See MacKay (1992a) for further reading.)
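Equation (27.9) is easy to check numerically. The following minimal sketch (mine, not the book's) integrates a two-dimensional example by brute-force quadrature and compares the result both with the right-hand side of (27.9) and with the eigenvalue product implied by (27.10); the particular matrix A is an arbitrary illustrative choice.

```python
import numpy as np
from scipy import integrate

# an arbitrary 2x2 symmetric positive-definite matrix (illustrative choice)
A = np.array([[2.0, 0.3],
              [0.3, 1.0]])
K = A.shape[0]

# left-hand side of (27.9) by direct 2-D quadrature
# (the integrand is negligible outside [-10, 10]^2 for this A)
lhs, _ = integrate.dblquad(
    lambda y, x: np.exp(-0.5 * np.array([x, y]) @ A @ np.array([x, y])),
    -10.0, 10.0, -10.0, 10.0)

# right-hand side of (27.9), and the same quantity via (27.10): in the
# eigenbasis the integral separates into factors sqrt(2*pi/lambda_i),
# and the product of the eigenvalues is det A
rhs = np.sqrt((2 * np.pi) ** K / np.linalg.det(A))
via_eigs = np.prod(np.sqrt(2 * np.pi / np.linalg.eigvalsh(A)))

print(lhs, rhs, via_eigs)  # all three agree to quadrature accuracy
```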
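As a companion to Exercise 27.1, here is a sketch that evaluates the two Laplace normalizers against the exact value: with the improper prior, the unnormalized posterior is $e^{-\lambda}\lambda^{r-1}$, which integrates to $\Gamma(r)$ in either basis. The count r = 5 is my assumption for illustration, not part of the exercise; the modes and curvatures used are straightforward to derive from the log posterior in each basis. The point of the comparison is the basis-dependence discussed above: the same $Z_P$, two different $Z_Q$.

```python
import numpy as np
from scipy.special import gammaln  # exact log normalizer: log Gamma(r)

r = 5  # illustrative photon count (an assumption, not from the exercise)

# Basis (a), over lam: log P*(lam) = (r-1)*log(lam) - lam
lam0 = r - 1.0                        # mode
c_lam = (r - 1.0) / lam0**2           # curvature -d2/dlam2 log P* at the mode
logZ_lam = (r - 1) * np.log(lam0) - lam0 + 0.5 * np.log(2 * np.pi / c_lam)

# Basis (b), over l = log lam (flat prior in l): log P*(l) = r*l - exp(l)
l0 = np.log(r)                        # mode, where exp(l) = r
c_log = float(r)                      # curvature: exp(l0) = r
logZ_log = r * l0 - r + 0.5 * np.log(2 * np.pi / c_log)

# Z_P is basis-independent: both unnormalized posteriors integrate to Gamma(r)
print("exact      :", gammaln(r))
print("over lam   :", logZ_lam)
print("over log lam:", logZ_log)
```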
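For Exercise 27.2 the exact answer is the Beta function, $Z_P = \Gamma(u_1)\Gamma(u_2)/\Gamma(u_1+u_2)$, so the error of the Laplace approximation in bits can be checked directly. A sketch that finds the mode and curvature numerically rather than analytically (the mode is in fact $a_0 = \ln(u_1/u_2)$):

```python
import numpy as np
from scipy.special import betaln           # exact: log B(u1, u2)
from scipy.optimize import minimize_scalar

def log_g(a, u1, u2):
    # log of the integrand of (27.12), using log f(a) = -log(1+exp(-a))
    # and log(1 - f(a)) = -log(1+exp(a)) for numerical stability
    return -u1 * np.log1p(np.exp(-a)) - u2 * np.log1p(np.exp(a))

def log_Z_laplace(u1, u2):
    a0 = minimize_scalar(lambda a: -log_g(a, u1, u2)).x  # mode
    h = 1e-4                                             # central differences
    c = -(log_g(a0 + h, u1, u2) - 2 * log_g(a0, u1, u2)
          + log_g(a0 - h, u1, u2)) / h**2                # -d2/da2 log g
    return log_g(a0, u1, u2) + 0.5 * np.log(2 * np.pi / c)

for u1, u2 in [(0.5, 0.5), (1.0, 1.0)]:
    err_bits = (betaln(u1, u2) - log_Z_laplace(u1, u2)) / np.log(2)
    print(f"(u1,u2)=({u1},{u2}): log Z_P - log Z_Q = {err_bits:.3f} bits")
```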
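Finally, for Exercise 27.3 the log posterior is quadratic in $(w_0, w_1)$, which is why the Laplace approximation is exact there. A sketch under assumed choices (prior $w \sim \text{Normal}(0, I/\alpha)$ with $\alpha = 1$, and synthetic data): it forms the Gaussian posterior from its precision matrix and mean, then reads off the Gaussian predictive for $t^{(N+1)}$.

```python
import numpy as np

rng = np.random.default_rng(0)

# illustrative setup: prior scale alpha and the data are my assumptions
N, sigma_nu, alpha = 10, 0.2, 1.0
x = rng.uniform(-1.0, 1.0, N)
t = 0.5 + 2.0 * x + sigma_nu * rng.normal(size=N)  # noisy w0 + w1*x

Phi = np.column_stack([np.ones(N), x])  # design matrix, rows phi(x_n) = (1, x_n)

# quadratic log posterior => exact Gaussian with precision A and mean m
A = alpha * np.eye(2) + Phi.T @ Phi / sigma_nu**2
m = np.linalg.solve(A, Phi.T @ t / sigma_nu**2)

# predictive for t^(N+1): Gaussian with mean phi.m
# and variance sigma_nu^2 + phi^T A^{-1} phi
x_new = 0.7
phi = np.array([1.0, x_new])
pred_mean = phi @ m
pred_var = sigma_nu**2 + phi @ np.linalg.solve(A, phi)
print(f"t_(N+1) | x_(N+1)={x_new}: mean {pred_mean:.3f}, "
      f"std {np.sqrt(pred_var):.3f}")
```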
