
The HMC Algorithm with Overrelaxation and Adaptive-Step Discretization

Numerical Experiments with Gaussian Targets

M. Alfaki, S. Subbey, and D. Haugland
Thiele Conference
July 17, 2008


Talk Outline

1 Background
    Bayes Theorem
    MCMC Algorithms
2 Aim of Talk
3 The Hamiltonian Monte Carlo (HMC) Algorithm
    Practical Implementation
4 Improving Performance of HMC Algorithm
    Improving Phase-Space Sampling
    Improvement Strategies
5 Numerical Experiments & Results
    The improved HMC algorithm


Bayes Theorem

Given model m(C) : C ∈ R^k, and data O.

Bayes Theorem: Prior belief × Likelihood → Posterior

    p(m|O) = p(m) p(O|m) / ∫_{R^k} p(O|m) p(m) dm.    (1)

The posterior pdf is used in inference, e.g., the expectation of J, ⟨J⟩:

    ⟨J⟩ = ∫_{R^k} J(m) p(m|O) dm ≈ (1/n) Σ_{j=1}^n J(m_j) p(m_j|O) / h(m_j),    (2)

        ≈ (1/n) Σ_{j=1}^n J(m_j), for h(m_j) ≈ p(m_j|O).    (3)
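As a concrete illustration of (2) and (3), a minimal Python sketch; the 1-D Gaussian posterior, the sampling density h, and the quantity J are illustrative assumptions, not from the talk:

```python
import numpy as np

# Minimal sketch of Eqs. (2)-(3): estimate <J> from an ensemble {m_j} ~ h.
rng = np.random.default_rng(0)

posterior = lambda m: np.exp(-0.5 * m**2) / np.sqrt(2 * np.pi)               # p(m|O)
h_pdf = lambda m: np.exp(-0.5 * (m / 1.2)**2) / (1.2 * np.sqrt(2 * np.pi))   # h(m)
J = lambda m: m**2                                                           # quantity of interest

n = 100_000
m = rng.normal(0.0, 1.2, size=n)                       # ensemble m_1, ..., m_n ~ h

J_weighted = np.mean(J(m) * posterior(m) / h_pdf(m))   # Eq. (2)
J_plain = np.mean(J(m))                                # Eq. (3), valid only if h ~ p(m|O)

print(J_weighted)   # ~1.0, the exact posterior expectation of m^2
print(J_plain)      # ~1.44, biased because here h != p(m|O)
```

The gap between the two estimates shows why the plain average (3) is only justified once the ensemble distribution h approximates the posterior.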




MCMC Algorithms

Avoid calculating the (intractable) integral ∫_{R^k} p(O|m) p(m) dm.

Generate an ensemble of models, m_1, m_2, ..., m_n, with m_j ≡ m(C_j),
such that the distribution of {m_j}_{j=1}^n ∼ h(m).

h(m) ≈ p(m|O) ⇒ ⟨J⟩ is an average over m_1, m_2, ..., m_n.

A popular implementation: the Metropolis–Hastings algorithm.

Some example drawbacks:
    long burn-in time
    slow convergence (especially in high dimensions)

Recent developments attempt to address these drawbacks.


Aim of Talk

Present:

    The Hamiltonian Monte Carlo (HMC) Algorithm
        a variant Monte Carlo algorithm
        incorporates gradient information in distribution space
    Investigated strategies for improving performance
    Numerical experimental results


Algorithm Description I

Type of Markov chain algorithm:
    combines the advantages of Hamiltonian dynamics & Metropolis MC
    incorporates gradients in dynamic trajectories

Given a vector of parameters C ∈ R^k,
    augment with a conjugate momentum vector P ∈ R^k,
    introduce a function H(C, P) on the phase-space (C, P):
    H(C, P) ≡ Hamiltonian function (classical dynamics)

    H(C, P) = V(C) + K(P),    (4)

    V(C) = −log π(C),    K(P) = ½ |P|².    (5)

V, K, π(C) ≡ potential & kinetic energies, target distribution.


Algorithm Description II

If V(C) induces a Boltzmann distribution over C,

    p(C) = e^{−V(C)} / ∫_{R^n} e^{−V(C)} dC,    (6)

then H(C, P) induces a similar distribution on (C, P),

    p(C, P) = e^{−H(C,P)} / ∫_{R^n} ∫_{R^n} e^{−H(C,P)} dC dP = p(C) p(P),    (7)

    p(P) = (2π)^{−n/2} e^{−½|P|²}.    (8)

Simulate an ergodic Markov chain with stationary distribution ∼ (7).

Estimate ⟨J⟩ using values of C from successive Markov chain states, whose marginal distribution is given by (6).


Algorithm Description III

Stochastic Transition
    Draw random variable P ∼ p(P) = (2π)^{−n/2} e^{−½|P|²}.

Dynamic Transition
    New pair (C, P) ∼ p(C, P), starting from the current C.
    Samples regions of constant H without bias.
    Ensures ergodicity of the Markov chain.

Dynamic transitions are governed by the Hamiltonian equations:

    dC/dτ = +∂H/∂P = P,    dP/dτ = −∂H/∂C = −∇V(C).    (9)

Hamiltonian dynamic transitions satisfy:
    time reversibility (invariance under τ → −τ, P → −P),
    conservation of energy (H(C, P) invariant with τ),
    Liouville's theorem (conservation of phase-space volume).




Leapfrog HMC

Choose the chain length N & the number of leapfrog steps L.

Simulate the Hamiltonian dynamics with a finite step size ɛ:

    P(τ + ɛ/2) = P(τ) − (ɛ/2) ∇V(C(τ)),    (10)

    C(τ + ɛ) = C(τ) + ɛ P(τ + ɛ/2),    (11)

    P(τ + ɛ) = P(τ + ɛ/2) − (ɛ/2) ∇V(C(τ + ɛ)).    (12)

The transition is volume-preserving and time-reversible.

Finite ɛ does not keep H constant → systematic error.

Eliminate the systematic error using a Metropolis rule.
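A minimal Python sketch of Eqs. (10)-(12), merging the adjacent momentum half-steps between updates; the function name, the quadratic test potential, and L ≥ 1 are assumptions. The check at the end illustrates the two properties above: the systematic energy error stays small but nonzero, and the map is exactly time-reversible.

```python
import numpy as np

def leapfrog(C, P, grad_V, eps, L):
    """L leapfrog steps of size eps for dC/dtau = P, dP/dtau = -grad V(C)."""
    C, P = C.copy(), P.copy()
    P -= 0.5 * eps * grad_V(C)        # half momentum step, Eq. (10)
    for _ in range(L - 1):
        C += eps * P                  # full position step, Eq. (11)
        P -= eps * grad_V(C)          # adjacent half steps, Eqs. (12)+(10), merged
    C += eps * P
    P -= 0.5 * eps * grad_V(C)        # final half momentum step, Eq. (12)
    return C, P

# Check on V(C) = |C|^2 / 2 (grad V = C):
grad_V = lambda C: C
H = lambda C, P: 0.5 * C @ C + 0.5 * P @ P
C0, P0 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
C1, P1 = leapfrog(C0, P0, grad_V, eps=0.1, L=50)
print(H(C1, P1) - H(C0, P0))          # small but nonzero: systematic error in H
C2, P2 = leapfrog(C1, -P1, grad_V, eps=0.1, L=50)
print(C2 - C0, P2 + P0)               # ~0: negating P and re-integrating returns the start
```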


The Algorithm: Example Implementation

Algorithm

    Initialize C^(0)
    for i = 0 to N − 1
        Sample u ∼ U[0,1] and P* ∼ N(0, I)
        C_0 = C^(i) and P_0 = P*
        for l = 1 to L
            P_{l−1/2} = P_{l−1} − (ε/2) ∇V(C_{l−1})
            C_l = C_{l−1} + ε P_{l−1/2}
            P_l = P_{l−1/2} − (ε/2) ∇V(C_l)
        end for
        dH = H(C_L, P_L) − H(C^(i), P*)
        if u < min{1, exp(−dH)}
            C^(i+1) = C_L
        else
            C^(i+1) = C^(i)
        end if
    end for
    return C = [C^(1), C^(2), ..., C^(N)]
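A runnable sketch of this algorithm in Python/NumPy; all names and defaults are assumptions, and the usage example mirrors the 2D correlated Gaussian of the example plot below (its covariance is hypothetical):

```python
import numpy as np

def hmc(V, grad_V, C0, N, eps=0.1, L=20, seed=0):
    """Sketch of the example implementation above.

    V(C) = -log pi(C) up to an additive constant; grad_V is its gradient.
    Returns the chain C^(1), ..., C^(N).
    """
    rng = np.random.default_rng(seed)
    C = np.asarray(C0, dtype=float)
    chain = np.empty((N, C.size))
    for i in range(N):
        u = rng.uniform()                    # u ~ U[0,1]
        P = rng.standard_normal(C.size)      # stochastic transition: P* ~ N(0, I)
        H0 = V(C) + 0.5 * P @ P              # H(C^(i), P*)
        Cn, Pn = C.copy(), P.copy()
        Pn -= 0.5 * eps * grad_V(Cn)         # leapfrog trajectory, Eqs. (10)-(12)
        for _ in range(L - 1):
            Cn += eps * Pn
            Pn -= eps * grad_V(Cn)
        Cn += eps * Pn
        Pn -= 0.5 * eps * grad_V(Cn)
        dH = V(Cn) + 0.5 * Pn @ Pn - H0      # energy error of the trajectory
        if u < min(1.0, np.exp(-dH)):        # Metropolis rule removes the systematic error
            C = Cn
        chain[i] = C
    return chain

# Usage: hypothetical 2D correlated Gaussian target
Sigma = np.array([[1.0, 0.95], [0.95, 1.0]])
Sinv = np.linalg.inv(Sigma)
chain = hmc(V=lambda C: 0.5 * C @ Sinv @ C, grad_V=lambda C: Sinv @ C,
            C0=np.zeros(2), N=2000)
print(chain.mean(axis=0), np.cov(chain.T))   # should approach 0 and Sigma
```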

Example

[Figure: phase-space and distribution plots for a 2D correlated Gaussian distribution; the histogram of sampled values of C is compared with the truth π(C).]


Issues with Implementation

Given a chain of length N, the choices of L & ɛ are decisive.

[Figure: energy error δH along the chain ∆_j for (a) ɛ = 0.025, L = 20; (b) ɛ = 0.1, L = 20; (c) ɛ = 1.9, L = 30.]




Effect of Gibbs Sampling

Momentum variable P ∼ Gibbs sampler → random walks.

Could lead to sub-optimal sampling of phase-space.

Doubling back on movement leads to extra cost in CPU time.

Illustration

[Figure: sample points over the target δπ(x_1, x_2) in the (C_1, C_2) plane: (a) HMC walker, (b) ideal walker.]


Effect of Constant Step-Size

For usual implementations, ɛ is constant.

Inefficient when the trajectory dynamics vary in different phase-space regions.

Leads to extra cost in CPU time.

[Figure: sampled frequencies f_j against the target π(C) for (a) constant ɛ = 0.1, L = 30 and (b) variable ɛ, L = 30; panel (c) shows the variable step sizes ɛ_j.]




Investigate Two Approaches

Proposal 1: Suppressing the random walk in Gibbs sampling
    Ordered over-relaxation (R. Neal)

Proposal 2: Using a variable step-size for the dynamics
    Investigate a Runge–Kutta type integrator (symplectic)


Applying Over-relaxation to P: Over-relaxed HMC (OHMC)

Ordered over-relaxation

To over-relax P ∼ N(P; 0, I), P ∈ R^n:

    For i = 1 to n
        Generate K values from N(q_i | {q_j}_{j≠i}).
        Order the K values together with the old value P_i:
            q_i^(0) ≤ · · · ≤ q_i^(r) = P_i ≤ · · · ≤ q_i^(K)
        Set P_i' = q_i^(K−r).
    End for
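A sketch of this procedure in Python; for the isotropic Gaussian momentum, each conditional N(q_i | {q_j}, j ≠ i) is simply N(0, 1), so the components are over-relaxed independently. K and all names are illustrative assumptions:

```python
import numpy as np

def ordered_overrelax(P, K=10, rng=None):
    """Ordered over-relaxation (Neal) of momentum P ~ N(0, I)."""
    rng = rng or np.random.default_rng()
    P_new = np.empty_like(P)
    for i, p_old in enumerate(P):
        q = np.sort(rng.standard_normal(K))  # K ordered draws from the conditional N(0, 1)
        r = np.searchsorted(q, p_old)        # rank of P_i in the pooled ordering q^(0)..q^(K)
        s = K - r                            # mirrored rank: the new value is q^(K-r)
        # index s in the pooled ordering is p_old itself when s == r,
        # otherwise one of the K fresh draws
        P_new[i] = p_old if s == r else (q[s] if s < r else q[s - 1])
    return P_new

# Usage: the output stays N(0, I) marginally but is negatively correlated
# with the input, which suppresses random-walk behaviour across iterations.
P = np.random.default_rng(1).standard_normal(5)
print(P)
print(ordered_overrelax(P))
```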

Example

[Figure: sample points over the target δπ(x_1, x_2) in the (C_1, C_2) plane: OHMC points vs HMC points.]


Variable Step-Size HMC Algorithm (SVHMC)

Explicit variable step-size using a Runge–Kutta type scheme.

Adaptive Störmer–Verlet

    For l = 1 to L:

        P_{l+1/2} = P_l − (ɛ / 2ρ_l) ∇V(C_l),

        C_{l+1/2} = C_l + (ɛ / 2ρ_l) P_{l+1/2},

        ρ_{l+1} = 2 U(C_{l+1/2}, P_{l+1/2}) − ρ_l,

        C_{l+1} = C_{l+1/2} + (ɛ / 2ρ_{l+1}) P_{l+1/2},

        P_{l+1} = P_{l+1/2} − (ɛ / 2ρ_{l+1}) ∇V(C_{l+1}).

    End for
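A sketch of one step of this scheme in Python; ρ_l is the step-size control variable and U is the monitor function introduced on the next slide. The function names, arguments, and the quadratic test potential in the usage example are assumptions:

```python
import numpy as np

def adaptive_sv_step(C, P, rho, grad_V, U, eps):
    """One adaptive Stormer-Verlet step of the scheme above."""
    P_half = P - eps / (2.0 * rho) * grad_V(C)               # momentum half-step, scaled by 1/rho_l
    C_half = C + eps / (2.0 * rho) * P_half                  # position half-step
    rho_new = 2.0 * U(C_half, P_half) - rho                  # reversible control: rho_{l+1} + rho_l = 2U
    C_new = C_half + eps / (2.0 * rho_new) * P_half          # position half-step with 1/rho_{l+1}
    P_new = P_half - eps / (2.0 * rho_new) * grad_V(C_new)   # momentum half-step with 1/rho_{l+1}
    return C_new, P_new, rho_new

# Usage with a hypothetical Gaussian potential V(C) = 0.5 C^T A C,
# so grad V = A C and the Hessian is A (needed by the monitor function U):
A = np.diag([1.0, 4.0])
grad_V = lambda C: A @ C
U = lambda C, P: np.sqrt(grad_V(C) @ grad_V(C) + P @ (A @ A) @ P)
C, P, rho = np.ones(2), np.zeros(2), 1.0
C, P, rho = adaptive_sv_step(C, P, rho, grad_V, U, eps=0.05)
```

Because ρ_{l+1} is determined symmetrically from ρ_l and the midpoint values, the step-size adaptation preserves the time reversibility required by the Metropolis correction.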


Adaptive Step-Size

Example: Adaptive Störmer–Verlet

Adaptive ɛ reduces ∆H.

The parameter ɛ depends on

    U(C, P) = √( ‖∇V(C)‖² + Pᵀ [∇²V(C)]² P )

[Figure: log(∆H/H) along the trajectory τ for the leapfrog scheme and the SV method.]

Observed ∼ theoretical acceptance rates; ρ_0 is a fictive parameter.

[Figure: observed and theoretic acceptance rates as a function of ρ_0.]


Numerical Experiments

Gaussian targets with uncorrelated covariates in 64 & 128 dimensions:

    π(C) = 1 / ( (2π)^{D/2} det(Σ)^{1/2} ) exp( −½ Cᵀ Σ⁻¹ C ).    (13)

Compare the HMC, SVHMC & OSVHMC algorithms based on:
    degree of chain autocorrelation
    effective number of samples in a given chain
    variance of sample means, C̄, of a finite chain
    convergence rates/ratio
    dimensionless efficiency


Evaluation Criteria

Suppose {c_i}_{i=1}^N is a chain generated by an algorithm.

1 Degree of correlation criteria
    Autocorrelation function ρ(l) = Cov(x_i, x_{i+l}) / Var(x_i)
    Integrated autocorrelation time τ_int = ½ + Σ_{t=1}^∞ ρ(t)
    Effective sample size N_eff = N / (2 τ_int)

2 Spectral analysis criteria
    Compute P̃_j = |C̃*(κ) C̃(κ)|, where C̃(κ) = DFT(c)
    Fit the template P(κ) = P_0 (κ*/κ)^α / ( (κ*/κ)^α + 1 ) to P̃_j
    α, P(0) & κ* are the parameters to be estimated
    Sample mean variance σ²_x̄ ≈ P(κ = 0) / N
    Convergence ratio r = σ²_x̄ / σ₀²
    Dimensionless efficiency E = lim_{N→∞} (σ₀²/N) / σ²_x̄(N)
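A minimal Python sketch of the degree-of-correlation criteria; truncating the τ_int sum where ρ(l) first turns negative is an assumed practical cutoff, not a choice stated in the talk:

```python
import numpy as np

def chain_diagnostics(c, max_lag=200):
    """Estimate rho(l), tau_int and N_eff for a scalar chain {c_i}."""
    c = np.asarray(c, dtype=float)
    N = len(c)
    d = c - c.mean()
    var = d @ d / N
    # empirical autocorrelation function rho(l), l = 1..max_lag
    rho = np.array([(d[:N - l] @ d[l:]) / (N * var) for l in range(1, max_lag + 1)])
    neg = np.flatnonzero(rho < 0)
    if neg.size:
        rho = rho[:neg[0]]           # drop the noise-dominated tail
    tau_int = 0.5 + rho.sum()        # integrated autocorrelation time
    n_eff = N / (2.0 * tau_int)      # effective sample size
    return rho, tau_int, n_eff

# Usage: an AR(1) chain, for which tau_int = 0.5 + phi/(1 - phi) is known
rng = np.random.default_rng(0)
phi, N = 0.9, 20_000
x = np.zeros(N)
for i in range(1, N):
    x[i] = phi * x[i - 1] + rng.standard_normal()
rho, tau_int, n_eff = chain_diagnostics(x)
print(tau_int, n_eff)   # tau_int ~ 9.5 for phi = 0.9
```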


Evaluation Criteria: Geometric Illustration

[Figure: geometric illustration of the criteria. Degree of correlation: the autocorrelation function ρ(τ) and the integrated autocorrelation ρ_int(τ), whose curve turns over where the sum saturates. Spectral analysis: the power spectrum P(κ) with plateau P(0), power-law fall-off P(κ) ≈ κ^{−α}, and turnover at κ*.]




Comparing OHMC vs HMC

Gaussian Target

    π(x) = 1 / ( (2π)^{n/2} |Σ|^{1/2} ) exp( −½ xᵀ Σ⁻¹ x ),    Σ = I

Results (n = 64, N = 2000)

                      OHMC       HMC        Ideal
    Accept. rate      0.99       0.99       1
    P(0)              1.35       1.51       1
    κ*                1.65       1.45
    CPU time [sec]    561.22     557.38
    E                 0.74       0.66       1
    r                 6.7e-4     7.6e-4     < 0.01
    τ_int             1.63       1.85       0.5
    N_eff             614        542        2000

Graphical Illustration

[Figure: power spectra P(κ) and integrated autocorrelation functions ρ_int(τ) for OHMC and HMC, with the fitted (τ*, τ*_int) marked.]


Comparing SVHMC vs HMC

Numerical Results (n = 128, N = 2000)

                      SVHMC      HMC        Ideal
    Accept. rate      0.92       0.98       1
    P(0)              1.09       3.13       1
    κ*                3.15       1.55
    CPU time [sec]    1568.01    1117.78
    E                 0.92       0.67       1
    r                 5.6e-4     7.4e-4     < 0.01
    τ_int             0.86       1.80       0.5
    N_eff             1167       554        2000

Graphical Illustration

[Figure: power spectra P(κ) and integrated autocorrelation functions ρ_int(τ) for SVHMC and HMC, with the fitted (τ*, τ*_int) marked.]


Comparing OSVHMC vs SVHMC

Numerical Results (n = 64, N = 2000)

                      OSVHMC     SVHMC      Ideal
    Accept. rate      0.92       0.94       1
    P(0)              1.02       1.11       1
    κ*                4.15       3.19
    CPU time [sec]    639.58     669.40
    E                 0.98       0.90       1
    r                 5.1e-4     5.6e-4     < 0.01
    τ_int             0.71       0.94       0.5
    N_eff             1400       1059       2000

Graphical Illustration

[Figure: power spectra P(κ) and integrated autocorrelation functions ρ_int(τ) for OSVHMC and SVHMC, with (τ, τ_int) and (τ*, τ*_int) marked.]


Summary and Conclusion

1 Over-relaxation in the Gibbs sampling improves the dimensionless efficiency by ∼12%:

    E_OHMC / E_HMC ≈ 1.12

2 The adaptive Störmer–Verlet discretization outperforms the leapfrog HMC, yielding roughly twice the effective sample size:

    N_eff^SV / N_eff^leapfrog ≈ 2.0

3 The hybrid OSVHMC (over-relaxing the momentum & adaptive ɛ) outperforms the SVHMC.
