22.1 Maximum likelihood for one Gaussian

[Figure 22.1. The likelihood function for the parameters of a Gaussian distribution. (a1, a2) Surface plot and contour plot of the log likelihood as a function of $\mu$ and $\sigma$. The data set of $N = 5$ points had mean $\bar{x} = 1.0$ and $S = \sum (x - \bar{x})^2 = 1.0$. (b) The posterior probability of $\mu$ for various values of $\sigma$. (c) The posterior probability of $\sigma$ for various fixed values of $\mu$ (shown as a density over $\ln \sigma$).]

If we Taylor-expand the log likelihood about the maximum, we can define approximate error bars on the maximum likelihood parameter: we use a quadratic approximation to estimate how far from the maximum-likelihood parameter setting we can go before the likelihood falls by some standard factor, for example $e^{1/2}$, or $e^{4/2}$. In the special case of a likelihood that is a Gaussian function of the parameters, the quadratic approximation is exact.

Example 22.2. Find the second derivative of the log likelihood with respect to $\mu$, and find the error bars on $\mu$, given the data and $\sigma$.

Solution.
$$
\frac{\partial^2}{\partial \mu^2} \ln P = -\frac{N}{\sigma^2}. \qquad \Box \quad (22.7)
$$

Comparing this curvature with the curvature of the log of a Gaussian distribution over $\mu$ of standard deviation $\sigma_\mu$, $\exp(-\mu^2/(2\sigma_\mu^2))$, which is $-1/\sigma_\mu^2$, we can deduce that the error bars on $\mu$ (derived from the likelihood function) are
$$
\sigma_\mu = \frac{\sigma}{\sqrt{N}}. \qquad (22.8)
$$
The error bars have this property: at the two points $\mu = \bar{x} \pm \sigma_\mu$, the likelihood is smaller than its maximum value by a factor of $e^{1/2}$.

Example 22.3. Find the maximum likelihood standard deviation $\sigma$ of a Gaussian, whose mean is known to be $\mu$, in the light of data $\{x_n\}_{n=1}^N$. Find the second derivative of the log likelihood with respect to $\ln \sigma$, and error bars on $\ln \sigma$.

Solution. The likelihood's dependence on $\sigma$ is
$$
\ln P(\{x_n\}_{n=1}^N \mid \mu, \sigma) = -N \ln(\sqrt{2\pi}\,\sigma) - \frac{S_{\text{tot}}}{2\sigma^2}, \qquad (22.9)
$$
where $S_{\text{tot}} = \sum_n (x_n - \mu)^2$. To find the maximum of the likelihood, we can differentiate with respect to $\ln \sigma$. [It's often most hygienic to differentiate with respect to $\ln u$, not $u$, when $u$ is a scale variable; we use $\mathrm{d}u^n/\mathrm{d}(\ln u) = n u^n$.]
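For completeness, here is a sketch of the step that solution sets up (my own working from equation (22.9), not the excerpt's own continuation): differentiating with respect to $\ln \sigma$, using $\mathrm{d}(\sigma^{-2})/\mathrm{d}(\ln \sigma) = -2\sigma^{-2}$, gives
$$
\frac{\partial}{\partial \ln \sigma} \ln P = -N + \frac{S_{\text{tot}}}{\sigma^2},
$$
which vanishes at $\sigma^2 = S_{\text{tot}}/N$, so the maximum-likelihood standard deviation is $\sigma_{\text{ML}} = \sqrt{S_{\text{tot}}/N}$.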
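And to make the error-bar property of equation (22.8) concrete, here is a minimal numerical sketch (not from the book; the random seed, data, and value of $\sigma$ are assumptions chosen for illustration). It evaluates the Gaussian log likelihood at $\bar{x}$ and at $\bar{x} + \sigma/\sqrt{N}$ and confirms that the log likelihood drops by exactly $1/2$, i.e. the likelihood falls by the factor $e^{1/2}$:

```python
import numpy as np

# Numerical check of the error-bar property in equation (22.8).
# The seed, data, and sigma below are made-up assumptions for illustration.
rng = np.random.default_rng(0)
sigma = 0.5                       # known standard deviation (assumed)
x = rng.normal(1.0, sigma, 5)     # N = 5 data points, as in Figure 22.1
N, xbar = len(x), x.mean()

def log_lik(mu):
    """Gaussian log likelihood of the data as a function of mu (sigma fixed)."""
    return (-N * np.log(np.sqrt(2 * np.pi) * sigma)
            - ((x - mu) ** 2).sum() / (2 * sigma ** 2))

sigma_mu = sigma / np.sqrt(N)     # error bar from equation (22.8)
drop = log_lik(xbar) - log_lik(xbar + sigma_mu)
print(drop)  # 0.5: the likelihood falls by a factor of e^{1/2}
```

The drop is exactly $1/2$ for any data set, because the log likelihood is an exact quadratic in $\mu$; this is the special case noted above in which the quadratic approximation is exact.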
