final report - probability.ca
final report - probability.ca
final report - probability.ca
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
takes the form σ 2 (d) = l 2 /d α , where α is the smallest number satisfying<br />
lim<br />
d→∞<br />
d λ1<br />
d γi c (J (i, d))<br />
< ∞ and lim<br />
dα d→∞ d α < ∞, (2.3)<br />
for i = 1, . . . , m. For this to hold, it follows that at least one of the m + 1 limits above must converge to some<br />
positive constant, and the rest converge to zero. Since the s<strong>ca</strong>ling term of the component of interest is taken to be<br />
equal to 1, the largest possible form for the proposal s<strong>ca</strong>ling is σ 2 = l 2 , and thus it will not diverge as d → ∞.<br />
Note 2.4. When we proceed to prove the convergence of the limiting process of the Metropolis algorithm to a<br />
Langevin diffusion, it will be useful to consider the recipro<strong>ca</strong>l of equation (2.3). This equivalent condition is<br />
c (J (i, d))<br />
(<br />
K 1 d<br />
λ 1<br />
lim + · · · + dλn<br />
+ · · ·<br />
d→∞ d λ1 K 1 K n<br />
d γ1<br />
K n+1<br />
+ · · · + c (J (m, d))<br />
)<br />
d γm<br />
K n+m<br />
= ∞. (2.4)<br />
Since all the distinct terms in the first n positions appear only a finite number of times, it follows that for (2.4) to<br />
K<br />
be satisfied, there must exist at least one i ∈ {1, . . . , m} such that lim 1 d→∞ d c (J (i, d)) dγ i<br />
λ 1 K n+i<br />
= ∞. Furthermore,<br />
d<br />
this stipulates that lim λ 1<br />
d→∞ d<br />
= 0 and hence the s<strong>ca</strong>ling term θ −2<br />
α<br />
1 (d) is immaterial for the choice of α.<br />
As was the <strong>ca</strong>se in the papers above, a time-accelerating adjustment is necessary in order to ensure the limiting<br />
process of the Metropolis algorithm is non-trivial. For this purpose, a new sped-up continuous ( time version of the<br />
original algorithm is introduced, paramterized to make jumps every d −α time units, i.e., Z d (t)= X (d)<br />
1 ([d α t]) , . . . , X (d)<br />
d<br />
where [·] is the integer function. As was the <strong>ca</strong>se earlier, the only way to preserve the Markov property of the discretetime<br />
process of the Metropolis algorithm while making it a continuous process is to resort to the memoryless property<br />
of the exponential distribution. By allowing the process to jump according to a Poisson process with rate d α , the<br />
new sped up process Z d (t) moves about d α times in every time unit, and furthermore, behaves asymptoti<strong>ca</strong>lly like<br />
a continuous Markov process.<br />
2.3.3 Optimal proposal and s<strong>ca</strong>ling values<br />
As has been noted in the preceding paragraphs, the major difference between the target (2.2) and the targets<br />
considered earlier is the inclusion of the various s<strong>ca</strong>ling terms. Since the preceding sections proved results for<br />
targets where the s<strong>ca</strong>ling terms are all identi<strong>ca</strong>l or identi<strong>ca</strong>lly distributed, the question that is of interest is how<br />
big a discrepancy between these various s<strong>ca</strong>ling terms is required in order to effect the limiting behavior of the<br />
algorithm. The results following show that if none of the first n s<strong>ca</strong>ling terms are signifi<strong>ca</strong>ntly different than the<br />
remaining s<strong>ca</strong>ling terms, then the limiting behavior is unaffected.<br />
Theorem 2.5. (Theorem 1 in [Béd07]) Consider a RWM algorithm with proposals Y (d) (t + 1) ∼ N ( 0, l 2 I d /d α) ,<br />
with α satisfying (2.3), applied to a target distribution of the form (2.2), which satisfies the specified conditions on<br />
f. Consider the i ∗th {<br />
component of the process Z (d)} { } { }<br />
i.e., Z<br />
t≥0, (d)<br />
i (t) = X (d)<br />
∗ i ∗ ([dα t]) , and let the<br />
algorithm{<br />
start in } stationarity (i.e., X (d) (0) ∼ π).<br />
Then Z (d)<br />
i (t) ⇒ {Z (t)} ∗ t≥0<br />
, where Z (0) is distributed according to the density f and {Z (t)} t≥0<br />
satisfies<br />
t≥0<br />
the Langevin SDE<br />
dZ (t) = v (l) 1 2<br />
h (l) ∇ log f (Z (t))<br />
dB t + dt,<br />
2<br />
if and only if<br />
d<br />
lim ∑ λ1<br />
n<br />
d→∞<br />
j=1 dλj + ∑ m<br />
i=1 c (J (i, d)) = 0. (2.5)<br />
dγi<br />
Here h (l) = 2l 2 Φ ( −l √ E R /2 ) , and<br />
E R = lim<br />
m∑<br />
d→∞<br />
i=1<br />
c (J (i, d))<br />
d α<br />
d γi<br />
K n+1<br />
E<br />
t≥0<br />
[ (f ′ ) ] 2<br />
(X)<br />
.<br />
f (X)<br />
Furthermore, we have lim d→∞ a (d, l) = 2Φ ( −l √ E R /2 ) ≡ a (l). The speed measure v (l) is maximized at the<br />
unique value ˆl = 2.38/ √ )<br />
E R , where a<br />
(ˆl = 0.234.<br />
t≥0<br />
)<br />
([d α t]) ,<br />
18