Information Theory, Inference, and Learning ... - Inference Group

Copyright Cambridge University Press 2003. On-screen viewing permitted. Printing not permitted. http://www.cambridge.org/0521642981. You can buy this book for 30 pounds or $50. See http://www.inference.phy.cam.ac.uk/mackay/itila/ for links.

370    29 — Monte Carlo Methods

[Figure 29.13. Gibbs sampling. (a) The joint density P(x) from which samples are required. (b) Starting from a state x^(t), x_1 is sampled from the conditional density P(x_1 | x_2^(t)). (c) A sample is then made from the conditional density P(x_2 | x_1). (d) A couple of iterations of Gibbs sampling. All panels have x_1 on the horizontal axis and x_2 on the vertical axis.]

This is good news and bad news. It is good news because, unlike the cases of rejection sampling and importance sampling, there is no catastrophic dependence on the dimensionality N. Our computer will give useful answers in a time shorter than the age of the universe. But it is bad news all the same, because this quadratic dependence on the lengthscale-ratio may still force us to make very lengthy simulations.

Fortunately, there are methods for suppressing random walks in Monte Carlo simulations, which we will discuss in the next chapter.

29.5 Gibbs sampling

We introduced importance sampling, rejection sampling and the Metropolis method using one-dimensional examples. Gibbs sampling, also known as the heat bath method or 'Glauber dynamics', is a method for sampling from distributions over at least two dimensions. Gibbs sampling can be viewed as a Metropolis method in which a sequence of proposal distributions Q are defined in terms of the conditional distributions of the joint distribution P(x). It is assumed that, whilst P(x) is too complex to draw samples from directly, its conditional distributions P(x_i | {x_j}_{j≠i}) are tractable to work with.
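As a rough illustration (not from the book), the structure just described can be sketched as a generic sweep that resamples each coordinate in turn from its conditional distribution; the function name and interface here are invented for this sketch:

```python
import random

def gibbs_sweep(x, conditional_samplers):
    """One Gibbs sweep: resample each coordinate from its conditional.

    conditional_samplers[i] is a function that, given the full current
    state x, returns a fresh sample of x[i] from P(x_i | {x_j}_{j != i}).
    (These names are illustrative, not from the book.)
    """
    for i, sampler in enumerate(conditional_samplers):
        # Unlike a generic Metropolis step, every "proposal" is accepted:
        # sampling exactly from the conditional needs no reject step.
        x[i] = sampler(x)
    return x
```

Repeatedly applying `gibbs_sweep` to a state vector yields a Markov chain whose stationary distribution is the joint P(x), provided each conditional is sampled exactly.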
For many graphical models (but not all) these one-dimensional conditional distributions are straightforward to sample from. For example, if a Gaussian distribution for some variables d has an unknown mean m, and the prior distribution of m is Gaussian, then the conditional distribution of m given d is also Gaussian. Conditional distributions that are not of standard form may still be sampled from by adaptive rejection sampling if the conditional distribution satisfies certain convexity properties (Gilks and Wild, 1992).

Gibbs sampling is illustrated for a case with two variables (x_1, x_2) = x in figure 29.13. On each iteration, we start from the current state x^(t), and x_1 is sampled from the conditional density P(x_1 | x_2), with x_2 fixed to x_2^(t). A sample x_2 is then made from the conditional density P(x_2 | x_1), using the new value of x_1.
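The two-variable procedure can be made concrete with a toy target where both conditionals are known in closed form: a standard bivariate Gaussian with correlation rho, for which P(x_1 | x_2) = Normal(rho·x_2, 1 − rho²) and symmetrically for x_2. This is a minimal sketch under that assumption; the function name is invented for this example:

```python
import random

def gibbs_bivariate_gaussian(rho, n_samples, seed=0):
    """Gibbs sampling from a standard bivariate Gaussian with correlation rho.

    Both conditionals of this target are one-dimensional Gaussians:
        P(x1 | x2) = Normal(rho * x2, 1 - rho**2)
        P(x2 | x1) = Normal(rho * x1, 1 - rho**2)
    (Illustrative sketch; not code from the book.)
    """
    rng = random.Random(seed)
    sd = (1.0 - rho * rho) ** 0.5     # conditional standard deviation
    x1, x2 = 0.0, 0.0                 # arbitrary starting state x^(0)
    samples = []
    for _ in range(n_samples):
        x1 = rng.gauss(rho * x2, sd)  # sample x1 given the current x2
        x2 = rng.gauss(rho * x1, sd)  # sample x2 given the new x1
        samples.append((x1, x2))
    return samples
```

Running the chain for many iterations and computing the empirical correlation of the samples should recover a value close to rho, since the chain's stationary distribution is the joint Gaussian.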
