10.07.2015 Views

Information Theory, Inference, and Learning ... - Inference Group

Information Theory, Inference, and Learning ... - Inference Group

Information Theory, Inference, and Learning ... - Inference Group

SHOW MORE
SHOW LESS
  • No tags were found...

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Copyright Cambridge University Press 2003. On-screen viewing permitted. Printing not permitted. http://www.cambridge.org/0521642981You can buy this book for 30 pounds or $50. See http://www.inference.phy.cam.ac.uk/mackay/itila/ for links.314 23 — Useful Probability Distributions10.90.80.70.60.50.40.30.20.100 2 4 6 8 1010.10.80.70.60.50.40.30.20.100.1-4 -2 0 2 4Figure 23.4. Two gammadistributions, with parameters(s, c) = (1, 3) (heavy lines) <strong>and</strong>10, 0.3 (light lines), shown onlinear vertical scales (top) <strong>and</strong>logarithmic vertical scales(bottom); <strong>and</strong> shown as afunction of x on the left (23.15)<strong>and</strong> l = ln x on the right (23.18).0.010.010.0010.0010.00010 2 4 6 8 100.0001-4 -2 0 2 4xl = ln xThis is a simple peaked distribution with mean sc <strong>and</strong> variance s 2 c.It is often natural to represent a positive real variable x in terms of itslogarithm l = ln x. The probability density of l isP (l) = P (x(l))∂x∣ ∂l ∣ = P (x(l))x(l) (23.17)= 1 ( ) x(l) c (exp − x(l) ), (23.18)Z l sswhereZ l = Γ(c). (23.19)[The gamma distribution is named after its normalizing constant – an oddconvention, it seems to me!]Figure 23.4 shows a couple of gamma distributions as a function of x <strong>and</strong>of l. Notice that where the original gamma distribution (23.15) may have a‘spike’ at x = 0, the distribution over l never has such a spike. The spike isan artefact of a bad choice of basis.In the limit sc = 1, c → 0, we obtain the noninformative prior for a scaleparameter, the 1/x prior. This improper prior is called noninformative becauseit has no associated length scale, no characteristic value of x, so it prefers allvalues of x equally. It is invariant under the reparameterization x = mx. Ifwe transform the 1/x probability density into a density over l = ln x we findthe latter density is uniform.⊲ Exercise 23.2. [1 ] Imagine that we reparameterize a positive variable x in termsof its cube root, u = x 1/3 . If the probability density of x is the improperdistribution 1/x, what is the probability density of u?The gamma distribution is always a unimodal density over l = ln x, <strong>and</strong>,as can be seen in the figures, it is asymmetric. If x has a gamma distribution,<strong>and</strong> we decide to work in terms of the inverse of x, v = 1/x, we obtain a newdistribution, in which the density over l is flipped left-for-right: the probabilitydensity of v is called an inverse-gamma distribution,P (v | s, c) = 1 ( ) 1 c+1 (exp − 1 ), 0 ≤ v < ∞ (23.20)Z v svsvwhereZ v = Γ(c)/s. (23.21)

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!