

when B proposals are concatenated will be the end-point of a random walk with B steps in it. This walk might have mean zero, or it might have a tendency to drift upwards (if most moves increase the energy and only a few decrease it). In general the latter will hold, if the acceptance rate f is small: the mean change in energy from any one move will be some ∆E > 0, and so the acceptance probability for the concatenation of B moves will be of order 1/(1 + exp(B∆E)), which scales roughly as f^B. The mean-square-distance moved will be of order f^B Bɛ², where ɛ is the typical step size. In contrast, the mean-square-distance moved when the moves are considered individually will be of order fBɛ².
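As a rough numerical check of this scaling argument (a sketch under assumed settings, not code from the book), the following compares the two strategies on a unit-Gaussian target: the end-point of a B-step random walk accepted or rejected in a single Metropolis decision, versus B individual accept/reject decisions. The step size ɛ and block length B are illustrative choices.

```python
import numpy as np

# Sketch: mean-square distance moved per block of B proposals for a
# random-walk Metropolis sampler on a unit-Gaussian target, comparing
# (a) one accept/reject decision for the end-point of a B-step walk with
# (b) B individual accept/reject decisions.  All settings are illustrative.

rng = np.random.default_rng(0)

def log_p(x):
    return -0.5 * x * x                      # unnormalized log of the unit Gaussian

def mean_square_distance(eps, B, n_chains=200_000, concatenated=True):
    x0 = rng.standard_normal(n_chains)       # start each chain in equilibrium
    if concatenated:
        # end-point of a B-step walk, judged by a single Metropolis decision
        prop = x0 + rng.uniform(-eps, eps, size=(B, n_chains)).sum(axis=0)
        accept = rng.random(n_chains) < np.exp(log_p(prop) - log_p(x0))
        x = np.where(accept, prop, x0)
    else:
        # the same B moves, each accepted or rejected individually
        x = x0.copy()
        for _ in range(B):
            prop = x + rng.uniform(-eps, eps, size=n_chains)
            accept = rng.random(n_chains) < np.exp(log_p(prop) - log_p(x))
            x = np.where(accept, prop, x)
    return np.mean((x - x0) ** 2)

eps, B = 5.0, 8                              # assumed step size and block length
print("concatenated:", mean_square_distance(eps, B, concatenated=True))
print("individual:  ", mean_square_distance(eps, B, concatenated=False))
```

With these settings the single-move acceptance rate f is well below one, and the concatenated scheme covers far less ground per block of B proposals, illustrating the penalty that the f^B acceptance scaling imposes.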

[Figure 29.20 appears here: three panels plotted against σq from 0 to 1.6, with curves for R = 1000, 10000, and 100000 and for the theoretical prediction; see the caption below.]

Solution to exercise 29.13 (p.382). The weights are w = P(x)/Q(x) and x is drawn from Q. The mean weight is

∫ dx Q(x) [P(x)/Q(x)] = ∫ dx P(x) = 1,    (29.57)

assuming the integral converges. Writing P and Q as zero-mean Gaussians with standard deviations σp and σq (so that ZP = √(2π)σp and ZQ = √(2π)σq), the variance is

var(w) = ∫ dx Q(x) [P(x)/Q(x) − 1]²    (29.58)
       = ∫ dx [P(x)²/Q(x) − 2P(x) + Q(x)]    (29.59)
       = ∫ dx (ZQ/ZP²) exp(−(x²/2)(2/σp² − 1/σq²)) − 1,    (29.60)

where ZQ/ZP² = σq/(√(2π)σp²). The integral in (29.60) is finite only if the coefficient of x² in the exponent is positive, i.e., if

σq² > σp²/2.    (29.61)

If this condition is satisfied, the variance is

var(w) = (σq/(√(2π)σp²)) √(2π) (2/σp² − 1/σq²)^(−1/2) − 1 = σq²/(σp √(2σq² − σp²)) − 1.    (29.62)

As σq approaches the critical value, σp/√2 ≈ 0.7σp, the variance becomes infinite. Figure 29.20 illustrates these phenomena for σp = 1 with σq varying from 0.1 to 1.5. The same random number seed was used for all runs, so the weights and estimates follow smooth curves. Notice that the empirical standard deviation of the R weights can look quite small and well-behaved (say, at σq ≃ 0.3) when the true standard deviation is nevertheless infinite.

Figure 29.20. Importance sampling in one dimension. For R = 1000, 10⁴, and 10⁵, the normalizing constant of a Gaussian distribution (known in fact to be 1) was estimated using importance sampling with a sampler density of standard deviation σq (horizontal axis). The same random number seed was used for all runs. The three plots show (a) the estimated normalizing constant; (b) the empirical standard deviation of the R weights; (c) 30 of the weights.
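A minimal sketch of this experiment (assumed code, not the script behind Figure 29.20): the normalizing constant of a unit Gaussian P is estimated by importance sampling from a zero-mean Gaussian Q with standard deviation σq, and the empirical standard deviation of the weights is compared with the theoretical value implied by (29.62). The particular values of σq and R are illustrative.

```python
import numpy as np

# Sketch: importance sampling in one dimension.  P and Q are zero-mean
# Gaussians with standard deviations sigma_p and sigma_q; x is drawn from Q
# and the weights are w = P(x)/Q(x).  The mean weight estimates the
# normalizing constant (known in fact to be 1); the empirical standard
# deviation of the weights is compared with the value implied by (29.62).

rng = np.random.default_rng(1)
sigma_p = 1.0

def gauss(x, sigma):
    return np.exp(-0.5 * x**2 / sigma**2) / np.sqrt(2 * np.pi * sigma**2)

def importance_run(sigma_q, R):
    x = rng.normal(0.0, sigma_q, size=R)          # R samples drawn from Q
    w = gauss(x, sigma_p) / gauss(x, sigma_q)     # weights w = P(x)/Q(x)
    return w.mean(), w.std()

def theory_std(sigma_q):
    # sqrt of var(w) from (29.62); infinite when sigma_q^2 <= sigma_p^2 / 2
    if 2 * sigma_q**2 <= sigma_p**2:
        return np.inf
    return np.sqrt(sigma_q**2 / (sigma_p * np.sqrt(2 * sigma_q**2 - sigma_p**2)) - 1.0)

for sigma_q in (0.3, 0.8, 1.5):                   # illustrative sampler widths
    z_hat, w_std = importance_run(sigma_q, R=100_000)
    print(f"sigma_q = {sigma_q:3.1f}:  Z_hat = {z_hat:.3f},  "
          f"empirical weight std = {w_std:.3f},  theory = {theory_std(sigma_q):.3f}")
```

Depending on the random seed, the run at σq = 0.3 can report a modest-looking empirical weight standard deviation even though the true value is infinite, which is precisely the pitfall noted above.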
