Lecture Notes - Department of Mathematics and Statistics - Queen's ...
CHAPTER 3. CLASSIFICATION OF MARKOV CHAINS
Now,
\begin{align*}
\|\pi P - \pi' P\|_1 &= \|\psi P - \psi' P\|_1 \\
&= \sum_j \Bigl| \sum_i \psi(i) P(i,j) - \sum_k \psi'(k) P(k,j) \Bigr| \\
&= \frac{1}{\|\psi'\|_1} \sum_j \Bigl| \sum_k \sum_i \psi(i)\,\psi'(k)\,P(i,j) - \sum_k \sum_i \psi(i)\,\psi'(k)\,P(k,j) \Bigr| \tag{3.5} \\
&\leq \frac{1}{\|\psi'\|_1} \sum_j \sum_k \sum_i \psi(i)\,\psi'(k)\,\bigl|P(i,j) - P(k,j)\bigr| \tag{3.6} \\
&= \frac{1}{\|\psi'\|_1} \sum_k \sum_i \psi(i)\,\psi'(k) \sum_j \bigl|P(i,j) - P(k,j)\bigr| \\
&= \frac{1}{\|\psi'\|_1} \sum_k \sum_i |\psi(i)|\,|\psi'(k)| \sum_j \bigl\{ P(i,j) + P(k,j) - 2\min\bigl(P(i,j), P(k,j)\bigr) \bigr\} \tag{3.7} \\
&\leq \frac{1}{\|\psi'\|_1} \sum_k \sum_i |\psi(i)|\,|\psi'(k)|\,\bigl(2 - 2\delta(P)\bigr) \tag{3.8} \\
&= \|\psi'\|_1 \bigl(2 - 2\delta(P)\bigr) \tag{3.9} \\
&= \|\pi - \pi'\|_1 \bigl(1 - \delta(P)\bigr). \tag{3.10}
\end{align*}
In the above, (3.5) follows from multiplying each term by $\sum_k \psi'(k)/\|\psi'\|_1 = 1$ (that is, adding terms to the summation), (3.6) from taking the absolute value inside the sums (the triangle inequality), (3.7) from the identity $|a - b| = a + b - 2\min(a, b)$ for nonnegative $a, b$, (3.8) from the definition of $\delta(P)$, and finally (3.9) follows from the $\ell_1$ norms of $\psi, \psi'$.
As such, the map $\pi \mapsto \pi P$ is a contraction whenever $\delta(P) > 0$. In essence, one shows that the sequence $\{\pi_0 P^m\}$ is Cauchy in the $\ell_1$ norm; since every Cauchy sequence in a Banach space has a limit, the sequence converges, and the limit is the invariant distribution. ⊓⊔
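The contraction inequality (3.10) can be checked numerically. The sketch below (the transition matrix and the two distributions are illustrative assumptions, not taken from the notes) computes the Dobrushin coefficient $\delta(P) = \min_{i,k} \sum_j \min(P(i,j), P(k,j))$ and verifies that $\|\pi P - \pi' P\|_1 \leq (1 - \delta(P))\,\|\pi - \pi'\|_1$:

```python
import numpy as np

def dobrushin_delta(P):
    """Dobrushin ergodic coefficient:
    delta(P) = min over row pairs (i, k) of sum_j min(P[i,j], P[k,j])."""
    n = P.shape[0]
    return min(np.minimum(P[i], P[k]).sum()
               for i in range(n) for k in range(n))

# Illustrative two-state chain (assumed numbers, not from the notes).
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
delta = dobrushin_delta(P)                 # here delta(P) = 0.7

# Two arbitrary probability distributions.
pi  = np.array([0.9, 0.1])
pi2 = np.array([0.2, 0.8])

lhs = np.abs(pi @ P - pi2 @ P).sum()       # ||pi P - pi' P||_1
rhs = (1 - delta) * np.abs(pi - pi2).sum() # (1 - delta(P)) ||pi - pi'||_1
assert lhs <= rhs + 1e-12
```

For this two-state example the bound happens to hold with equality; in general (3.10) may be conservative.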
Dobrushin’s ergodic theorem also guarantees that this limit, the invariant distribution, is unique.
It should also be noted that Dobrushin’s theorem tells us how fast the sequence of probability distributions $\{\pi_0 P^n\}$ converges to the invariant distribution $\pi$ for an arbitrary initial distribution $\pi_0$: iterating the contraction in (3.10) gives $\|\pi_0 P^n - \pi\|_1 \leq (1 - \delta(P))^n \|\pi_0 - \pi\|_1$.
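This geometric rate is easy to observe numerically. A minimal sketch, using an assumed two-state chain (the numbers are illustrative, not from the notes), iterates $\pi_0 P^m$ and checks the bound $\|\pi_0 P^m - \pi\|_1 \leq (1 - \delta(P))^m \|\pi_0 - \pi\|_1$ at every step:

```python
import numpy as np

# Illustrative two-state chain (assumed numbers, not from the notes).
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])

# Dobrushin coefficient: delta(P) = min_{i,k} sum_j min(P[i,j], P[k,j]).
n = P.shape[0]
delta = min(np.minimum(P[i], P[k]).sum()
            for i in range(n) for k in range(n))

# For a 2x2 chain the invariant distribution is proportional to
# (P[1,0], P[0,1]); one can check that mu @ P equals mu.
mu = np.array([P[1, 0], P[0, 1]])
mu = mu / mu.sum()

pi0 = np.array([1.0, 0.0])   # arbitrary initial distribution
pi = pi0.copy()
for m in range(1, 20):
    pi = pi @ P
    gap = np.abs(pi - mu).sum()                       # ||pi_0 P^m - mu||_1
    bound = (1 - delta) ** m * np.abs(pi0 - mu).sum() # geometric envelope
    assert gap <= bound + 1e-12
```

After a handful of iterations the gap is already negligible, reflecting the $(1 - \delta(P))^m$ decay.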
Ergodic Theorem for Countable State Space Chains<br />
For a Markov chain with a unique invariant distribution $\mu$, we have, almost surely,
\[
\lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} f(x_t) = \sum_i f(i)\,\mu(i)
\]
for every bounded $f : \mathsf{X} \to \mathbb{R}$.
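The statement can be illustrated by simulation. In the sketch below the chain, the function $f$, and the horizon are all assumptions made for illustration; the time average of $f$ along one sample path is compared with the spatial average $\sum_i f(i)\mu(i)$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-state chain (assumed numbers, not from the notes);
# its invariant distribution is proportional to (P[1,0], P[0,1]).
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
mu = np.array([P[1, 0], P[0, 1]])
mu = mu / mu.sum()

f = np.array([1.0, 5.0])       # an arbitrary bounded f: X -> R

# Simulate x_1, ..., x_T and form the time average (1/T) sum_t f(x_t).
T, x, total = 200_000, 0, 0.0
for _ in range(T):
    x = 1 if rng.random() < P[x, 1] else 0   # one step of the chain
    total += f[x]
time_avg = total / T
space_avg = float(f @ mu)      # sum_i f(i) mu(i)

# Birkhoff: the two agree almost surely as T -> infinity;
# here they agree up to Monte Carlo error of order 1/sqrt(T).
assert abs(time_avg - space_avg) < 0.05
```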
This is called the ergodic theorem, due to Birkhoff, and it is a very powerful result: in essence, this property is what connects stochastic control with the long-run behavior of Markov chains. In particular, for a stationary control policy that leads to a unique invariant distribution and has bounded costs, it follows that, almost surely,
\[
\lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} c(x_t, u_t) = \sum_{x,u} c(x,u)\,\mu(x,u)
\]
for every real-valued bounded cost function $c$, where $\mu(x,u)$ is the invariant occupation measure on state-action pairs. The ergodic theorem is what makes a dynamic optimization problem equivalent to a static optimization problem, under mild technical conditions. This sets up the core ideas of the convex analytic approach and the linear programming approach, which will be discussed later.
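The dynamic/static equivalence can be sketched numerically for a controlled chain. Everything below (states, actions, costs, and the policy $g$) is a hypothetical example: a fixed stationary policy induces a Markov chain, and the long-run average cost along a sample path matches the static expectation of $c$ under the occupation measure:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical controlled chain: two states, two actions (all numbers
# are illustrative assumptions). The stationary policy g picks u = g[x].
g = np.array([0, 1])
c = np.array([[1.0, 2.0],        # cost c(x, u)
              [3.0, 0.5]])
P_g = np.array([[0.7, 0.3],      # P(. | x = 0, u = g[0])
                [0.4, 0.6]])     # P(. | x = 1, u = g[1])

# Invariant distribution of the induced chain; the occupation measure
# mu(x, u) puts mass mu(x) on the pair (x, g[x]) and zero elsewhere.
mu = np.array([P_g[1, 0], P_g[0, 1]])
mu = mu / mu.sum()
static_cost = sum(c[x, g[x]] * mu[x] for x in range(2))  # sum c(x,u) mu(x,u)

# Dynamic side: long-run time average of the incurred costs.
T, x, total = 200_000, 0, 0.0
for _ in range(T):
    total += c[x, g[x]]                       # pay c(x_t, u_t)
    x = 1 if rng.random() < P_g[x, 1] else 0  # transition under the policy
time_avg = total / T
assert abs(time_avg - static_cost) < 0.05
```

The static value on the right is exactly the kind of linear functional of the occupation measure that the convex analytic and linear programming formulations optimize over.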