The HMC Algorithm with Overrelaxation and Adaptive-Step Discretization
Numerical Experiments with Gaussian Targets

M. Alfaki, S. Subbey, and D. Haugland
Thiele Conference
July 17, 2008
Talk Outline

1 Background
    Bayes Theorem
    MCMC Algorithms
2 Aim of Talk
3 The Hamiltonian Monte Carlo (HMC) Algorithm
    Practical Implementation
4 Improving Performance of HMC Algorithm
    Improving Phase-Space Sampling
    Improvement Strategies
5 Numerical Experiments & Results
    The improved HMC algorithm
Bayes Theorem
Given a model m(C), C ∈ R^k, and data O.

Bayes Theorem: Prior belief × Likelihood → Posterior,

    p(m|O) = p(m) p(O|m) / [ ∫_{R^k} p(O|m) p(m) dm ].    (1)

The posterior pdf is used in inference, e.g., the expectation of J, ⟨J⟩:

    ⟨J⟩ = ∫_{R^k} J(m) p(m|O) dm ≈ (1/n) Σ_{j=1}^{n} J(m_j) p(m_j|O) / h(m_j),    (2)

        ≈ (1/n) Σ_{j=1}^{n} J(m_j),  for h(m_j) ≈ p(m_j|O),    (3)

where the ensemble m_1, m_2, ..., m_n is drawn from a distribution h.
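Equation (3) says that once the ensemble is drawn from h ≈ p(m|O), the posterior expectation reduces to a plain sample average. A minimal Python sketch, using a toy N(2, 1) "posterior" and J(m) = m² (both illustrative assumptions, not from the talk):

```python
import numpy as np

# Eq. (3) as code: with m_j drawn from h(m) ~= p(m|O), the posterior
# expectation <J> becomes a plain sample average.  Toy setup (assumed):
# "posterior" N(2, 1) and J(m) = m^2, so the exact value is
# Var + mean^2 = 1 + 4 = 5.
rng = np.random.default_rng(0)
n = 200_000
m = rng.normal(loc=2.0, scale=1.0, size=n)  # m_j ~ h(m) = p(m|O)
J_hat = np.mean(m**2)                       # (1/n) * sum_j J(m_j)
print(J_hat)                                # close to 5
```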
MCMC Algorithms
Avoid calculating the (intractable) integral ∫_{R^k} p(O|m) p(m) dm.
Generate an ensemble of models m_1, m_2, ..., m_n, with m_j ≡ m(C_j),
such that the distribution of {m_j}_{j=1}^{n} ∼ h(m).
If h(m) ≈ p(m|O), then ⟨J⟩ is an average over m_1, m_2, ..., m_n.
A popular implementation: the Metropolis-Hastings algorithm.
Some example drawbacks:
    long burn-in time
    slow convergence (especially in high dimensions)
Recent developments attempt to address these drawbacks.
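For concreteness, a minimal random-walk Metropolis-Hastings sketch on a 1D standard-normal target (a toy illustration, not the talk's implementation); its diffusive, random-walk behavior is exactly the drawback HMC targets:

```python
import numpy as np

# Minimal random-walk Metropolis-Hastings (toy sketch): target
# h(m) proportional to exp(-m^2/2).  The chain moves by small random
# steps, which is why burn-in and mixing are slow in high dimensions.
rng = np.random.default_rng(1)

def log_h(m):
    return -0.5 * m * m  # log target, up to a constant

m, chain = 0.0, []
for _ in range(50_000):
    prop = m + rng.normal(scale=0.5)   # symmetric random-walk proposal
    if np.log(rng.uniform()) < log_h(prop) - log_h(m):
        m = prop                       # accept
    chain.append(m)                    # on reject, repeat current state

chain = np.array(chain)
print(chain.mean(), chain.var())       # ~0 and ~1
```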
Aim of Talk
Present:
    The Hamiltonian Monte Carlo (HMC) Algorithm
        a variant Monte Carlo algorithm
        incorporates gradient information in distribution space
    Investigated strategies for improving performance
    Numerical experimental results
Algorithm Description-I
A type of Markov chain algorithm:
    Combines the advantages of Hamiltonian dynamics & Metropolis MC
    Incorporates gradients in the dynamic trajectories
Given a vector of parameters C ∈ R^k:
    Augment with a conjugate momentum vector P ∈ R^k
    Introduce a function H(C, P) on the phase space (C, P),
    H(C, P) ≡ the Hamiltonian function (classical dynamics):

        H(C, P) = V(C) + K(P),    (4)

        V(C) = −log π(C),  K(P) = (1/2)|P|^2,    (5)

    where V, K, and π(C) are the potential energy, kinetic energy, and target distribution.
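Equations (4)-(5) can be written down directly for a concrete target. The sketch below assumes a standard Gaussian π(C), so V(C) = |C|²/2 up to a constant and ∇V(C) = C (assumptions for illustration):

```python
import numpy as np

# Eqs. (4)-(5) for an assumed standard Gaussian target pi(C) in R^k:
# V(C) = -log pi(C) = |C|^2/2 + const (the constant drops out of the
# dynamics), K(P) = |P|^2/2, and H = V + K.
def V(C):
    return 0.5 * np.dot(C, C)   # potential energy

def grad_V(C):
    return C                    # gradient of V for this target

def K(P):
    return 0.5 * np.dot(P, P)   # kinetic energy

def H(C, P):
    return V(C) + K(P)          # the Hamiltonian, Eq. (4)

C = np.array([1.0, -2.0])
P = np.array([0.5, 0.0])
print(H(C, P))                  # 0.5*5 + 0.5*0.25 = 2.625
```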
Algorithm Description-II
If V(C) induces a Boltzmann distribution over C,

    p(C) = e^{−V(C)} / ∫_{R^k} e^{−V(C)} dC,    (6)

then H(C, P) induces a similar distribution on (C, P),

    p(C, P) = e^{−H(C,P)} / ∫_{R^k} ∫_{R^k} e^{−H(C,P)} dC dP = p(C) p(P),    (7)

    p(P) = (2π)^{−k/2} e^{−(1/2)|P|^2}.    (8)

Simulate an ergodic Markov chain with stationary distribution (7).
Estimate ⟨J⟩ using the values of C from successive Markov chain states, whose marginal distribution is given by (6).
Algorithm Description-III
Stochastic transition:
    Draw a random variable P ∼ p(P) = (2π)^{−k/2} e^{−(1/2)|P|^2}.
Dynamic transition:
    Propose a new pair (C, P) ∼ p(C, P), starting from the current C;
    samples regions of constant H without bias;
    ensures ergodicity of the Markov chain.
Dynamic transitions are governed by the Hamiltonian equations

    dC/dτ = +∂H/∂P = P,    dP/dτ = −∂H/∂C = −∇V(C).    (9)

Hamiltonian dynamic transitions satisfy:
    time reversibility (invariance under τ → −τ, P → −P),
    conservation of energy (H(C, P) invariant with τ),
    Liouville's theorem (conservation of phase-space volume).
Leapfrog HMC
Choose the chain length N & the number of leapfrog steps L.
Simulate the Hamiltonian dynamics with a finite step size ε:

    P(τ + ε/2) = P(τ) − (ε/2) ∇V(C(τ)),    (10)

    C(τ + ε) = C(τ) + ε P(τ + ε/2),    (11)

    P(τ + ε) = P(τ + ε/2) − (ε/2) ∇V(C(τ + ε)).    (12)

The transition is volume-preserving and time-reversible.
A finite ε does not keep H constant → systematic error.
Eliminate the systematic error using a Metropolis rule.
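A sketch of the leapfrog steps (10)-(12), applied to a toy Gaussian target with ∇V(C) = C (an assumption for illustration; any ∇V plugs in); it checks numerically that H is nearly conserved and that the map is time-reversible:

```python
import numpy as np

# Leapfrog integration of Eqs. (10)-(12) on a toy target with
# grad V(C) = C (assumed for illustration).
def leapfrog(C, P, grad_V, eps, L):
    C, P = C.copy(), P.copy()
    P -= 0.5 * eps * grad_V(C)        # half momentum step, Eq. (10)
    for _ in range(L):
        C += eps * P                  # full position step, Eq. (11)
        P -= eps * grad_V(C)          # two fused half momentum steps
    P += 0.5 * eps * grad_V(C)        # undo the extra half step, Eq. (12)
    return C, P

grad_V = lambda C: C
H = lambda C, P: 0.5 * (C @ C) + 0.5 * (P @ P)

C0, P0 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
C1, P1 = leapfrog(C0, P0, grad_V, eps=0.1, L=20)
dH = abs(H(C1, P1) - H(C0, P0))
print(dH)                             # small: H is nearly conserved

# Time reversibility: negate P, integrate again, negate back.
C2, P2 = leapfrog(C1, -P1, grad_V, eps=0.1, L=20)
print(np.allclose(C2, C0), np.allclose(-P2, P0))   # True True
```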
The Algorithm: Example Implementation
Algorithm
    Initialize C^(0)
    for i = 0 to N − 1
        Sample u ∼ U[0,1] and P* ∼ N(0, I)
        C_0 = C^(i), P_0 = P*
        for l = 1 to L
            P_{l−1/2} = P_{l−1} − (ε/2) ∇V(C_{l−1})
            C_l = C_{l−1} + ε P_{l−1/2}
            P_l = P_{l−1/2} − (ε/2) ∇V(C_l)
        end for
        dH = H(C_L, P_L) − H(C^(i), P*)
        if u < min{1, exp(−dH)}
            (C^(i+1), P^(i+1)) = (C_L, P_L)
        else
            (C^(i+1), P^(i+1)) = (C^(i), P*)
    end for
    return C = [C^(1), C^(2), ..., C^(N)]
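The loop above can be sketched in Python for a one-dimensional standard-normal target (V(C) = C²/2 is an illustrative assumption; ε and L are tuning choices, and the slide's algorithm is target-agnostic):

```python
import numpy as np

# The slide's HMC loop, sketched for a 1D standard-normal target.
rng = np.random.default_rng(2)
grad_V = lambda C: C
H = lambda C, P: 0.5 * C * C + 0.5 * P * P

def hmc(N=5000, L=20, eps=0.2):
    C, chain = 0.0, []
    for _ in range(N):
        P = rng.normal()                      # stochastic transition
        c, p = C, P - 0.5 * eps * grad_V(C)   # start leapfrog
        for _ in range(L):
            c = c + eps * p
            p = p - eps * grad_V(c)
        p = p + 0.5 * eps * grad_V(c)         # final half step
        dH = H(c, p) - H(C, P)
        if np.log(rng.uniform()) < -dH:       # Metropolis rule
            C = c
        chain.append(C)
    return np.array(chain)

chain = hmc()
print(chain.mean(), chain.var())              # ~0 and ~1
```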
Example

[Figure: phase-space plot (P vs C) and sample histogram against the true π(C) — phase-space and distribution plots for a 2D correlated Gaussian distribution.]
Issues with Implementation
Example

Given a chain of length N, the choices of L & ε are decisive.

[Figure: δH along the chain for (a) ε = 0.025, L = 20; (b) ε = 0.1, L = 20; (c) ε = 1.9, L = 30.]
Effect of Gibbs Sampling
The momentum variable P ∼ Gibbs sampler → random walks.
This can lead to sub-optimal sampling of phase space:
doubling back on previous movement leads to extra cost in CPU time.
Illustration

[Figure: sample points over a 2D target δπ(x_1, x_2) for (a) the HMC walker and (b) an ideal walker.]
Effect of Constant Step-size
In usual implementations, ε is constant.
This is inefficient when the trajectory dynamics vary between phase-space regions,
and leads to extra cost in CPU time.
[Figure: sample histograms against π(C) for (a) constant ε = 0.1, L = 30 and (b) variable ε, L = 30; (c) the variable step sizes ε_j along the trajectory.]
Investigate Two Approaches
Proposal 1: Suppress the random walk in the Gibbs sampling
    Ordered over-relaxation (R. Neal)
Proposal 2: Use a variable step size for the dynamics
    Investigate a Runge-Kutta type (symplectic) integrator
Applying over-relaxation to P: Over-relaxed HMC (OHMC)
Ordered over-relaxation
    To over-relax P ∈ R^n, P ∼ N(P; 0, I):
    for i = 1 to n
        Generate K values from N(q_i | {q_j}_{j≠i}).
        Order the K values together with the old value P_i:
            q_i^(0) ≤ · · · ≤ q_i^(r) = P_i ≤ · · · ≤ q_i^(K).
        Set P'_i = q_i^(K−r).
    end for
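A sketch of ordered over-relaxation for an N(0, I) momentum, where each conditional N(q_i | {q_j}) is simply N(0, 1) because the components are independent (the K = 10 default and function names are assumptions for illustration):

```python
import numpy as np

# Ordered over-relaxation (Neal) for P ~ N(0, I): for each i, draw K
# fresh values, find the rank r of the current P_i among all K+1
# values, and replace P_i by the value of mirrored rank K - r.
rng = np.random.default_rng(3)

def ordered_overrelax(P, K=10):
    P_new = np.empty_like(P)
    for i, p in enumerate(P):
        q = np.sort(rng.normal(size=K))  # K draws from the conditional
        r = int(np.searchsorted(q, p))   # rank of p among the K+1 values
        s = K - r                        # mirrored rank
        if s == r:
            P_new[i] = p
        elif s < r:
            P_new[i] = q[s]              # mirrored value below p
        else:
            P_new[i] = q[s - 1]          # mirrored value above p
    return P_new

# The update leaves N(0, I) invariant: repeated application keeps the
# marginal mean ~0 and variance ~1 while anti-correlating successive P.
P = rng.normal(size=1000)
for _ in range(5):
    P = ordered_overrelax(P)
print(P.mean(), P.var())
```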
Example

[Figure: sample points over a 2D target δπ(x_1, x_2) for OHMC and for plain HMC.]
Variable Step-Size HMC Algorithm (SVHMC)
Explicit variable step size using a Runge-Kutta scheme.

Adaptive Störmer-Verlet
    for l = 1 to L steps

        P_{l+1/2} = P_l − (ε / 2ρ_l) ∇V(C_l),
        C_{l+1/2} = C_l + (ε / 2ρ_l) P_{l+1/2},
        ρ_{l+1} + ρ_l = 2 U(C_{l+1/2}, P_{l+1/2}),
        C_{l+1} = C_{l+1/2} + (ε / 2ρ_{l+1}) P_{l+1/2},
        P_{l+1} = P_{l+1/2} − (ε / 2ρ_{l+1}) ∇V(C_{l+1}).

    end for
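A sketch of one pass of the adaptive Störmer-Verlet scheme, with the control function U specialized to a toy quadratic V(C) = |C|²/2, so ∇V = C and ∇²V = I (the specialization and parameter values are assumptions for illustration):

```python
import numpy as np

# One pass of the adaptive Stormer-Verlet scheme: the step size eps
# is rescaled by the control variable rho, updated from U(C, P).
grad_V = lambda C: C

def U(C, P):
    # sqrt(|grad V|^2 + P^T [Hess V]^2 P) for this toy quadratic target
    return np.sqrt(C @ C + P @ P)

def adaptive_sv(C, P, eps, L):
    rho = U(C, P)                      # initialize the control variable
    for _ in range(L):
        P_half = P - eps / (2 * rho) * grad_V(C)
        C_half = C + eps / (2 * rho) * P_half
        rho_new = 2 * U(C_half, P_half) - rho   # rho_{l+1} + rho_l = 2U
        C = C_half + eps / (2 * rho_new) * P_half
        P = P_half - eps / (2 * rho_new) * grad_V(C)
        rho = rho_new
    return C, P

H = lambda C, P: 0.5 * (C @ C + P @ P)
C0, P0 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
C1, P1 = adaptive_sv(C0, P0, eps=0.05, L=50)
dH = abs(H(C1, P1) - H(C0, P0))
print(dH)    # small: the rescaled step keeps H nearly constant
```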
Adaptive Step-size
Example: Adaptive Störmer-Verlet

Adaptive ε reduces ΔH. The parameter ε depends on

    U(C, P) = √( ‖∇V(C)‖² + Pᵀ [∇²V(C)]² P ).

[Figure: log(ΔH/H) against τ for the leapfrog scheme and the SV method.]

Observed ≈ theoretical acceptance rates; ρ_0 is a fictive parameter.

[Figure: observed and theoretical acceptance rates as a function of ρ_0.]
Numerical Experiments
Gaussian targets with uncorrelated covariates in 64 & 128 dimensions:

    π(C) = 1 / ( (2π)^{D/2} det(Σ)^{1/2} ) · exp( −(1/2) Cᵀ Σ⁻¹ C ).    (13)

Compare the HMC, SVHMC & OSVHMC algorithms based on:
    degree of chain autocorrelation,
    effective number of samples in a given chain,
    variance of the sample means C̄ of a finite chain,
    convergence rate/ratio,
    dimensionless efficiency.
Evaluation Criteria
Suppose {c_i}_{i=1}^{N} is the chain generated by an algorithm.

1 Degree-of-correlation criteria
    Autocorrelation function ρ(l) = Cov(c_i, c_{i+l}) / Var(c_i)
    Integrated autocorrelation time τ_int = 1/2 + Σ_{t=1}^{∞} ρ(t)
    Effective sample size N_eff = N / (2 τ_int)

2 Spectral-analysis criteria
    Compute P̃_j = |C̃(κ)* C̃(κ)|, with C̃(κ) = DFT(c)
    Fit the template P(κ) = P_0 (κ*/κ)^α / [ (κ*/κ)^α + 1 ] to P̃_j,
    where α, P(0) & κ* are the parameters to be estimated
    Sample-mean variance σ²_c̄ ≈ P(κ = 0) / N
    Convergence ratio r = σ²_c̄ / σ_0²
    Dimensionless efficiency E = lim_{N→∞} (σ_0² / N) / σ²_c̄(N)
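The degree-of-correlation criteria can be sketched on a synthetic AR(1) chain, whose true ρ(l) = φ^l gives τ_int = 1/2 + φ/(1 − φ) as a check (the AR(1) chain and the lag cutoff are illustrative assumptions):

```python
import numpy as np

# Degree-of-correlation criteria on a synthetic AR(1) chain
# x_i = phi*x_{i-1} + noise, for which rho(l) = phi^l exactly, so
# tau_int = 1/2 + phi/(1 - phi) = 1.5 for phi = 0.5.
rng = np.random.default_rng(4)
phi, N = 0.5, 200_000
noise = rng.normal(size=N) * np.sqrt(1 - phi**2)  # unit marginal variance
x = np.empty(N)
x[0] = 0.0
for i in range(1, N):
    x[i] = phi * x[i - 1] + noise[i]

def rho(xc, l):
    # empirical autocorrelation at lag l >= 1
    return float(xc[:-l] @ xc[l:]) / float(xc @ xc)

xc = x - x.mean()
tau_int = 0.5 + sum(rho(xc, l) for l in range(1, 51))  # truncated sum
N_eff = N / (2 * tau_int)
print(tau_int, N_eff)   # ~1.5 and ~N/3
```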
Evaluation Criteria: Geometric Illustration
Degree of correlation | Spectral analysis

[Figure: left, the autocorrelation function ρ(τ) and the integrated autocorrelation ρ_int(τ), whose curve turns over; right, the power spectrum P(κ) with the P(0) plateau, the P(κ) ≈ κ^{−α} tail, and the turn-over frequency κ*.]
Comparing OHMC vs HMC
Gaussian target

    π(x) = 1 / ( (2π)^{n/2} |Σ|^{1/2} ) exp( −(1/2) xᵀ Σ⁻¹ x ),  Σ = I.

Results (n = 64, N = 2000)

                    OHMC      HMC       Ideal
    Accept. rate    0.99      0.99      1
    P(0)            1.35      1.51      1
    κ*              1.65      1.45
    CPU time [sec]  561.22    557.38
    E               0.74      0.66      1
    r               6.7e−4    7.6e−4    < 0.01
    τ_int           1.63      1.85      0.5
    N_eff           614       542       2000
Graphical Illustration

[Figure: P(κ) and ρ_int(τ) for OHMC and HMC, with the turn-over points (τ*, τ*_int) marked for each.]
Comparing SVHMC vs HMC

Numerical Results (n = 128, N = 2000)
                    SVHMC     HMC       Ideal
    Accept. rate    0.92      0.98      1
    P(0)            1.09      3.13      1
    κ*              3.15      1.55
    CPU time [sec]  1568.01   1117.78
    E               0.92      0.67      1
    r               5.6e−4    7.4e−4    < 0.01
    τ_int           0.86      1.80      0.5
    N_eff           1167      554       2000
Graphical Illustration

[Figure: P(κ) and ρ_int(τ) for SVHMC and HMC, with the turn-over points (τ*, τ*_int) marked for each.]
Comparing OSVHMC vs SVHMC
Numerical Results (n = 64, N = 2000)

                    OSVHMC    SVHMC     Ideal
    Accept. rate    0.92      0.94      1
    P(0)            1.02      1.11      1
    κ*              4.15      3.19
    CPU time [sec]  639.58    669.40
    E               0.98      0.90      1
    r               5.1e−4    5.6e−4    < 0.01
    τ_int           0.71      0.94      0.5
    N_eff           1400      1059      2000
Graphical Illustration

[Figure: P(κ) and ρ_int(τ) for OSVHMC and SVHMC, with the turn-over points (τ, τ_int) marked for each.]
Summary and Conclusion
1 Over-relaxation in the Gibbs sampling improves the dimensionless efficiency by ∼12%:

      E_OHMC / E_HMC ≈ 1.1.

2 The Störmer-Verlet discretization outperforms the leapfrog HMC, roughly doubling the effective sample size:

      N_eff^SV / N_eff^leapfrog ≈ 2.0.

3 The hybrid OSVHMC (over-relaxing the momentum & adaptive ε) outperforms the SVHMC.