
Markov Chain Monte Carlo

A Markov chain is a stochastic process X0, X1, . . . in which the conditional distribution of Xt given X0, X1, . . ., Xt−1 is the same as the conditional distribution of Xt given only Xt−1. An aperiodic, irreducible, positive recurrent Markov chain is associated with a stationary distribution or invariant distribution, which is the limiting distribution of the chain. See Section 1.6.3 beginning on page 126 for a description of these terms.

If the density of interest, p, is the density of the stationary distribution of a Markov chain, correlated samples from the distribution can be generated by simulating the Markov chain. This appears harder than it is.

A Markov chain is the basis for several schemes for generating random samples. The interest is not in the sequence of the Markov chain itself. The elements of the chain are accepted or rejected in such a way as to form a different chain whose stationary distribution or limiting distribution is the distribution of interest.
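To make this concrete, here is a minimal Python sketch, not from the text: a two-state Markov chain with an illustrative transition matrix is simulated, and the empirical frequencies of the correlated draws are compared with the stationary distribution, computed as the left eigenvector of the transition matrix for eigenvalue 1.

    # Minimal sketch (illustrative transition probabilities): simulate a
    # two-state Markov chain and compare the empirical distribution of
    # the correlated draws with the stationary distribution.
    import numpy as np

    rng = np.random.default_rng(42)

    # Row i of P is the conditional distribution of X_{t+1} given X_t = i.
    P = np.array([[0.9, 0.1],
                  [0.3, 0.7]])

    # The stationary distribution pi satisfies pi P = pi; it is the left
    # eigenvector of P for eigenvalue 1, normalized to sum to 1.
    eigvals, eigvecs = np.linalg.eig(P.T)
    pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
    pi /= pi.sum()

    # Simulate the chain; successive draws are correlated, but their
    # empirical frequencies converge to pi.
    n = 100_000
    x = np.empty(n, dtype=int)
    x[0] = 0
    for t in range(1, n):
        x[t] = rng.choice(2, p=P[x[t - 1]])

    print("stationary :", pi)                  # (0.75, 0.25) for this P
    print("empirical  :", np.bincount(x) / n)  # close to pi for large n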

Convergence

An algorithm based on a stationary distribution of a Markov chain is an iterative method because a sequence of operations must be performed until they converge; that is, until the chain has gone far enough to wash out any transitory phase that depends on where we start.

Several schemes for assessing convergence have been proposed. For example, we could use multiple starting points and then use an ANOVA-type test to compare variances within and across the multiple streams.
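As one hedged illustration of such a scheme, the sketch below computes a potential scale reduction factor in the style of the Gelman–Rubin diagnostic, an instance of the ANOVA-type comparison just described; the function name and the common 1.1 threshold are conventions assumed for this example, not prescriptions from the text.

    # Sketch of an ANOVA-type convergence check across parallel streams,
    # in the spirit of the Gelman-Rubin diagnostic. The 1.1 cutoff noted
    # below is a common rule of thumb, assumed here for illustration.
    import numpy as np

    def potential_scale_reduction(chains):
        """chains: array of shape (m, n), m parallel streams of length n."""
        m, n = chains.shape
        stream_means = chains.mean(axis=1)
        B = n * stream_means.var(ddof=1)       # between-stream variance
        W = chains.var(axis=1, ddof=1).mean()  # mean within-stream variance
        var_hat = (n - 1) / n * W + B / n      # pooled variance estimate
        return np.sqrt(var_hat / W)            # near 1 when streams mix

If the streams were started from dispersed points and the factor stays well above 1 (say, above 1.1), the transitory phase has not yet washed out and the chain should be run longer.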

The Metropolis Algorithm

For a distribution with density p, the Metropolis algorithm, introduced by Metropolis et al. (1953), generates a random walk and performs an acceptance/rejection based on p evaluated at successive steps in the walk.

In the simplest version, the walk moves from the point yi to a candidate point yi+1 = yi + s, where s is a realization from U(−a, a), and accepts yi+1 if

    p(yi+1)/p(yi) ≥ u,

where u is an independent realization from U(0, 1).
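The following minimal Python sketch implements this simple version for a univariate target; note that p need only be known up to a normalizing constant, since only the ratio p(yi+1)/p(yi) enters the test. The standard normal example target is an assumption for illustration.

    # Sketch of the simple random-walk Metropolis method described above.
    import numpy as np

    def metropolis(p, y0, a, n, rng=None):
        """p: unnormalized target density; y0: start; a: half-width of
        the U(-a, a) step; n: number of iterations."""
        rng = rng or np.random.default_rng()
        y = np.empty(n)
        y[0] = y0
        for i in range(n - 1):
            cand = y[i] + rng.uniform(-a, a)  # candidate y_{i+1} = y_i + s
            u = rng.uniform()                 # independent U(0, 1)
            # accept if p(y_{i+1})/p(y_i) >= u (written without division)
            y[i + 1] = cand if p(cand) >= u * p(y[i]) else y[i]
        return y

    # Example: draws from N(0, 1) using only the unnormalized density.
    draws = metropolis(lambda t: np.exp(-t * t / 2.0), y0=0.0, a=1.0, n=50_000)
    print(draws[1000:].mean(), draws[1000:].std())  # near 0 and 1

Discarding an initial stretch of the chain (burn-in), as in the last line, reflects the convergence considerations above.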

This method is also called the “heat bath” method because of the context in which it was introduced.

The random walk of Metropolis et al. is the basic algorithm of simulated annealing, which is currently widely used in optimization problems.

If the range of the distribution is finite, the random walk is not allowed to go outside of the range.
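One hedged way to impose this restriction in the sketch above is to let the unnormalized density return zero outside the support; an out-of-range candidate then fails the acceptance test with probability one, so the walk stays at the current point. The Beta(2, 2)-shaped target below is illustrative.

    # Illustrative bounded-support target for the metropolis() sketch:
    # zero density outside (0, 1) means any candidate outside the range
    # is rejected, so the walk never leaves the support.
    def p_beta22(t):
        return t * (1.0 - t) if 0.0 < t < 1.0 else 0.0

    # e.g., draws = metropolis(p_beta22, y0=0.5, a=0.2, n=50_000)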

