Monte Carlo Methods in Statistical Mechanics: Foundations and ...
is very far from equilibrium. By throwing away the data from the initial transient, we lose nothing, and avoid a potentially large systematic error.
Autocorrelation in equilibrium. As explained in the preceding lecture, the variance of the sample mean f̄ in a dynamic Monte Carlo method is a factor 2τ_int,f higher than it would be in independent sampling. Otherwise put, a run of length n contains only n/(2τ_int,f) "effectively independent data points".
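As a concrete illustration of these quantities, here is a minimal sketch (not from the lecture; the function name and the windowing constant c are my own choices) of estimating the integrated autocorrelation time τ_int,f from a time series, using the standard self-consistent windowing rule of truncating the sum of the autocorrelation function at the smallest lag M with M ≥ c·τ_int(M):

```python
import numpy as np

def integrated_autocorr_time(x, c=5.0):
    """Estimate tau_int = 1/2 + sum_{t>=1} rho(t) for a time series x,
    truncating the sum with the self-consistent window M >= c * tau_int(M)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    dx = x - x.mean()
    # Autocovariance via FFT (zero-padded to avoid circular wrap-around);
    # rho(t) = C(t) / C(0).
    f = np.fft.rfft(dx, 2 * n)
    acov = np.fft.irfft(f * np.conj(f))[:n] / n
    rho = acov / acov[0]
    tau = 0.5                      # running estimate of tau_int
    for m in range(1, n):
        tau += rho[m]
        if m >= c * tau:           # window has become self-consistent
            return max(tau, 0.5)
    return max(tau, 0.5)

# Toy chain with a known answer: an AR(1) process x_t = a*x_{t-1} + noise
# has rho(t) = a^t and hence tau_int = 1/2 + a/(1-a) = 9.5 for a = 0.9.
rng = np.random.default_rng(0)
a = 0.9
x = np.empty(200_000)
x[0] = 0.0
for t in range(1, len(x)):
    x[t] = a * x[t - 1] + rng.standard_normal()

tau = integrated_autocorr_time(x)
n_eff = len(x) / (2 * tau)   # "effectively independent" data points
```

The run of 200000 correlated samples is thus worth only about n/(2·9.5) ≈ 10500 independent ones, which is exactly the penalty factor described in the text.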
This has several implications for Monte Carlo work. On the one hand, it means that the computational efficiency of the algorithm is determined principally by its autocorrelation time. More precisely, if one wishes to compare two alternative Monte Carlo algorithms for the same problem, then the better algorithm is the one that has the smaller autocorrelation time, when time is measured in units of computer (CPU) time. [In general there may arise tradeoffs between "physical" autocorrelation time (i.e. measured in iterations) and computational complexity per iteration.] So accurate measurements of the autocorrelation time are essential for evaluating the computational efficiency of competing algorithms.
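The tradeoff just mentioned can be made concrete: the figure of merit is the autocorrelation time in CPU units, i.e. τ (in iterations) times the CPU cost per iteration. A tiny sketch with hypothetical numbers (both algorithms and their costs are invented for illustration):

```python
def tau_cpu(tau_iters, seconds_per_iter):
    """Autocorrelation time in CPU units: the quantity that actually
    determines which of two algorithms is more efficient."""
    return tau_iters * seconds_per_iter

# Hypothetical comparison: algorithm A decorrelates faster per sweep,
# but each of its sweeps is much more expensive.
a = tau_cpu(5.0, 2.0e-3)    # 5 sweeps * 2 ms/sweep  = 0.010 s
b = tau_cpu(40.0, 1.0e-4)   # 40 sweeps * 0.1 ms/sweep = 0.004 s
# B wins despite its larger "physical" autocorrelation time.
```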
On the other hand, even for a fixed algorithm, knowledge of τ_int,f is essential for determining run lengths (is a run of 100000 sweeps long enough?) and for setting error bars on estimates of ⟨f⟩. Roughly speaking, error bars will be of order (τ/n)^{1/2}; so if we want 1% accuracy, then we need a run of length ≈ 10000τ, and so on. Above all, there is a basic self-consistency requirement: the run length n must be much larger than the estimates of τ produced by that same run; otherwise none of the results from that run should be believed. Of course, while self-consistency is a necessary condition for the trustworthiness of Monte Carlo data, it is not a sufficient condition: there is always the danger of metastability.
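The error-bar formula and the self-consistency requirement can be packaged in a few lines. This is a sketch under the assumptions in the text (the function name and the particular threshold n ≥ 1000·τ are my own choices, not prescriptions from the lecture):

```python
import numpy as np

def mc_mean_and_error(x, tau_int, min_taus=1000):
    """Sample mean with an error bar inflated by the autocorrelation
    factor: sigma_mean = sqrt(2 * tau_int * var(x) / n), i.e. the naive
    i.i.d. error bar times sqrt(2 * tau_int)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    err = np.sqrt(2.0 * tau_int * x.var(ddof=1) / n)
    # Self-consistency: believe the run only if it is much longer than
    # the autocorrelation time it itself reports.
    trustworthy = n >= min_taus * tau_int
    return x.mean(), err, trustworthy

# For independent data tau_int = 1/2, and the error bar reduces to the
# familiar sqrt(var/n).
rng = np.random.default_rng(1)
x = rng.standard_normal(10_000)
mean, err, ok = mc_mean_and_error(x, tau_int=0.5)
```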
Already we can draw a conclusion about the relative importance of initialization bias and autocorrelation as difficulties in dynamic Monte Carlo work. Let us assume that the time for initial convergence to equilibrium is comparable to (or at least not too much larger than) the equilibrium autocorrelation time τ_int,f (for the observables f of interest); this is often but not always the case. Then initialization bias is a relatively trivial problem compared to autocorrelation in equilibrium. To eliminate initialization bias, it suffices to discard ≈ 20τ of the data at the beginning of the run; but to achieve a reasonably small statistical error, it is necessary to make a run of length ≈ 1000τ or more. So the data that must be discarded at the beginning, n_disc, is a negligible fraction of the total run length n. This estimate also shows that the exact value of n_disc is not particularly delicate: anything between ≈ 20τ and n/5 will eliminate essentially all initialization bias while paying less than a 10% price in the final error bars.
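The rule of thumb above (discard at least ≈ 20τ, but never more than n/5) can be stated as a one-liner; the function name is mine, and the multiple 20 is the text's rule of thumb rather than anything delicate:

```python
def choose_discard(n, tau_int):
    """Burn-in length per the text's rule of thumb: discard about
    20 * tau_int iterations, capped at n/5 so the statistical price
    in the final error bars stays small."""
    return min(int(20 * tau_int), n // 5)

# A run of 100000 sweeps with tau = 10 discards only 200 sweeps,
# a negligible fraction of the run, as the text argues.
n_disc = choose_discard(100_000, 10)
```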
In the remainder of this lecture I would like to discuss in more detail the statistical analysis of dynamic Monte Carlo data (assumed to be already "in equilibrium"), with emphasis on how to estimate the autocorrelation time τ_int,f and how to compute valid error bars. What is involved here is a branch of mathematical statistics called time-series analysis. An excellent exposition can be found in the books of Priestley [14] and Anderson [15].