[Figure 29.1. (a) The function P*(x) = exp[0.4(x − 0.4)² − 0.08x⁴]. How to draw samples from this density? (b) The function P*(x) evaluated at a discrete set of uniformly spaced points {x_i}. How to draw samples from this discrete distribution? Both panels plot P*(x) against x over the range −4 to 4.]

This is one of the important properties of Monte Carlo methods.

The accuracy of the Monte Carlo estimate (29.6) depends only on the variance of φ, not on the dimensionality of the space sampled. To be precise, the variance of \(\hat{\Phi}\) goes as \(\sigma^2/R\). So regardless of the dimensionality of x, it may be that as few as a dozen independent samples \(\{x^{(r)}\}\) suffice to estimate Φ satisfactorily.

We will find later, however, that high dimensionality can cause other difficulties for Monte Carlo methods. Obtaining independent samples from a given distribution P(x) is often not easy.

Why is sampling from P(x) hard?

We will assume that the density from which we wish to draw samples, P(x), can be evaluated, at least to within a multiplicative constant; that is, we can evaluate a function P*(x) such that

\[ P(x) = P^*(x)/Z. \tag{29.8} \]

If we can evaluate P*(x), why can we not easily solve problem 1? Why is it in general difficult to obtain samples from P(x)? There are two difficulties. The first is that we typically do not know the normalizing constant

\[ Z = \int \mathrm{d}^N x \, P^*(x). \tag{29.9} \]

The second is that, even if we did know Z, the problem of drawing samples from P(x) is still a challenging one, especially in high-dimensional spaces, because there is no obvious way to sample from P without enumerating most or all of the possible states. Correct samples from P will by definition tend to come from places in x-space where P(x) is big; how can we identify those places where P(x) is big, without evaluating P(x) everywhere? There are only a few high-dimensional densities from which it is easy to draw samples, for example the Gaussian distribution.

Let us start with a simple one-dimensional example. Imagine that we wish to draw samples from the density P(x) = P*(x)/Z where

\[ P^*(x) = \exp\!\left[0.4(x - 0.4)^2 - 0.08x^4\right], \qquad x \in (-\infty, \infty). \tag{29.10} \]

We can plot this function (figure 29.1a). But that does not mean we can draw samples from it. To start with, we don't know the normalizing constant Z.
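As a minimal sketch of the brute-force approach that is only feasible in one dimension, the following Python snippet evaluates P*(x) of equation (29.10) on a uniform grid, approximates Z by summation as in (29.9), draws a dozen samples from the resulting discrete distribution of figure 29.1(b), and forms a Monte Carlo estimate Φ̂. The grid range [−4, 4] with 1001 points, the test function φ(x) = x², and the fixed random seed are illustrative assumptions, not choices made in the book.

```python
import numpy as np

def p_star(x):
    """Unnormalised density P*(x) = exp[0.4(x - 0.4)^2 - 0.08 x^4] (equation 29.10)."""
    return np.exp(0.4 * (x - 0.4) ** 2 - 0.08 * x ** 4)

# Discretise x on a uniform grid covering the region where P*(x) is non-negligible
# (the grid range and spacing are illustrative choices, not from the book).
x = np.linspace(-4.0, 4.0, 1001)
dx = x[1] - x[0]
p = p_star(x)

# Brute-force estimate of the normalising constant Z = integral of P*(x) dx (equation 29.9).
Z = np.sum(p) * dx

# Normalise the grid values into a discrete distribution (figure 29.1b) and sample from it.
probs = p / p.sum()
rng = np.random.default_rng(0)
samples = rng.choice(x, size=12, p=probs)

# Monte Carlo estimate of Phi = E[phi(x)] from the samples; phi(x) = x^2 is an arbitrary choice.
Phi_hat = np.mean(samples ** 2)

print(f"Z ~= {Z:.4f}")
print(f"Phi_hat (mean of x^2 over {samples.size} samples) ~= {Phi_hat:.3f}")
```

Note that this enumeration is exactly what fails to scale: a grid with k points per axis has k^N entries for an N-dimensional x, so tabulating P*(x) everywhere quickly becomes infeasible, which is precisely the difficulty described above for high-dimensional spaces.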
