Information Theory, Inference, and Learning Algorithms - Inference Group

386 · 29 — Monte Carlo Methods

…when $B$ proposals are concatenated will be the end-point of a random walk with $B$ steps in it. This walk might have mean zero, or it might have a tendency to drift upwards (if most moves increase the energy and only a few decrease it). In general the latter will hold if the acceptance rate $f$ is small: the mean change in energy from any one move will be some $\Delta E > 0$, and so the acceptance probability for the concatenation of $B$ moves will be of order $1/(1 + \exp(B\,\Delta E))$, which scales roughly as $f^B$. The mean-square distance moved will be of order $f^B B \epsilon^2$, where $\epsilon$ is the typical step size. In contrast, the mean-square distance moved when the moves are considered individually will be of order $f B \epsilon^2$.

[Figure 29.20. Importance sampling in one dimension. For $R = 1000$, $10^4$, and $10^5$, the normalizing constant of a Gaussian distribution (known in fact to be 1) was estimated using importance sampling with a sampler density of standard deviation $\sigma_q$ (horizontal axis). The same random number seed was used for all runs. The three plots show (a) the estimated normalizing constant; (b) the empirical standard deviation of the $R$ weights; (c) 30 of the weights.]

Solution to exercise 29.13 (p.382). The weights are $w = P(x)/Q(x)$ and $x$ is drawn from $Q$. The mean weight is
$$\int \mathrm{d}x\, Q(x)\,[P(x)/Q(x)] = \int \mathrm{d}x\, P(x) = 1, \tag{29.57}$$
assuming the integral converges. The variance is
$$\mathrm{var}(w) = \int \mathrm{d}x\, Q(x) \left[ \frac{P(x)}{Q(x)} - 1 \right]^2 \tag{29.58}$$
$$= \int \mathrm{d}x \left[ \frac{P(x)^2}{Q(x)} - 2P(x) + Q(x) \right] \tag{29.59}$$
$$= \left[ \int \mathrm{d}x\, \frac{Z_Q}{Z_P^2} \exp\!\left( -\frac{x^2}{2} \left( \frac{2}{\sigma_p^2} - \frac{1}{\sigma_q^2} \right) \right) \right] - 1, \tag{29.60}$$
where $Z_Q/Z_P^2 = \sigma_q/(\sqrt{2\pi}\,\sigma_p^2)$. The integral in (29.60) is finite only if the coefficient of $x^2$ in the exponent is positive, i.e., if
$$\sigma_q^2 > \tfrac{1}{2}\sigma_p^2. \tag{29.61}$$
If this condition is satisfied, the variance is
$$\mathrm{var}(w) = \frac{\sigma_q}{\sqrt{2\pi}\,\sigma_p^2} \sqrt{2\pi} \left( \frac{2}{\sigma_p^2} - \frac{1}{\sigma_q^2} \right)^{-1/2} - 1 = \frac{\sigma_q^2}{\sigma_p \left( 2\sigma_q^2 - \sigma_p^2 \right)^{1/2}} - 1. \tag{29.62}$$

As $\sigma_q$ approaches the critical value $\sigma_p/\sqrt{2} \simeq 0.7\,\sigma_p$, the variance becomes infinite. Figure 29.20 illustrates these phenomena for $\sigma_p = 1$ with $\sigma_q$ varying from 0.1 to 1.5. The same random number seed was used for all runs, so the weights and estimates follow smooth curves. Notice that the empirical standard deviation of the $R$ weights can look quite small and well behaved (say, at $\sigma_q \simeq 0.3$) when the true standard deviation is nevertheless infinite.
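To make the scaling comparison in the opening paragraph concrete, here is a worked example with illustrative numbers (not from the book): take acceptance rate $f = 0.1$, $B = 3$ concatenated proposals, and step size $\epsilon$. Then
$$f^B B \epsilon^2 = (0.1)^3 \times 3\,\epsilon^2 = 0.003\,\epsilon^2, \qquad f B \epsilon^2 = 0.1 \times 3\,\epsilon^2 = 0.3\,\epsilon^2,$$
so concatenating the proposals costs a factor of $1/f^{B-1} = 100$ in mean-square distance relative to proposing the same moves individually.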
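The following is a minimal numerical sketch of the solution to exercise 29.13 and the experiment of Figure 29.20, assuming Python with NumPy; the function names, the grid of $\sigma_q$ values, and the seed are illustrative choices, not the book's code. It estimates the normalizing constant of a unit Gaussian $P$ by importance sampling from a Gaussian $Q$ of width $\sigma_q$, and compares the empirical spread of the weights with the theoretical result (29.62).

```python
import numpy as np

def gaussian_pdf(x, sigma):
    """Normalised zero-mean Gaussian density with standard deviation sigma."""
    return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def theoretical_var_w(sigma_q, sigma_p=1.0):
    """Variance of the weights, equation (29.62); infinite when
    sigma_q^2 <= sigma_p^2 / 2, the condition (29.61)."""
    if 2.0 * sigma_q ** 2 <= sigma_p ** 2:
        return np.inf
    return sigma_q ** 2 / (sigma_p * np.sqrt(2.0 * sigma_q ** 2 - sigma_p ** 2)) - 1.0

sigma_p = 1.0
R = 100_000
for sigma_q in [0.3, 0.5, 0.8, 1.0, 1.5]:
    rng = np.random.default_rng(0)           # same seed for every run, as in the figure
    x = rng.normal(0.0, sigma_q, size=R)     # R samples drawn from Q
    w = gaussian_pdf(x, sigma_p) / gaussian_pdf(x, sigma_q)   # weights w = P(x)/Q(x)
    print(f"sigma_q={sigma_q:.1f}  Z-hat={w.mean():.4f}  "
          f"empirical sd(w)={w.std():.3f}  "
          f"theoretical sd(w)={np.sqrt(theoretical_var_w(sigma_q)):.3f}")
```

At $\sigma_q = 0.3$ or $0.5$ the printed estimate and empirical standard deviation of the weights can look perfectly tame even though (29.62) says the true variance is infinite, which is exactly the trap described in the closing paragraph.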
