Lecture Notes - Department of Mathematics and Statistics - Queen's ...

CHAPTER 3. CLASSIFICATION OF MARKOV CHAINS

Now,

\begin{align}
\|\pi P - \pi' P\|_1 &= \|\psi P - \psi' P\|_1 \notag \\
&= \sum_j \Big| \sum_i \psi(i)P(i,j) - \sum_k \psi'(k)P(k,j) \Big| \notag \\
&= \frac{1}{\|\psi'\|_1} \sum_j \Big| \sum_k \sum_i \psi(i)\psi'(k)P(i,j) - \sum_k \sum_i \psi(i)\psi'(k)P(k,j) \Big| \tag{3.5} \\
&\leq \frac{1}{\|\psi'\|_1} \sum_j \sum_k \sum_i \psi(i)\psi'(k)\,\big|P(i,j) - P(k,j)\big| \tag{3.6} \\
&= \frac{1}{\|\psi'\|_1} \sum_k \sum_i \psi(i)\psi'(k) \sum_j \big|P(i,j) - P(k,j)\big| \notag \\
&= \frac{1}{\|\psi'\|_1} \sum_k \sum_i |\psi(i)|\,|\psi'(k)| \sum_j \big\{P(i,j) + P(k,j) - 2\min\big(P(i,j),P(k,j)\big)\big\} \tag{3.7} \\
&\leq \frac{1}{\|\psi'\|_1} \sum_k \sum_i |\psi(i)|\,|\psi'(k)|\,\big(2 - 2\delta(P)\big) \tag{3.8} \\
&= \|\psi\|_1 \big(2 - 2\delta(P)\big) \tag{3.9} \\
&= \|\pi - \pi'\|_1 \big(1 - \delta(P)\big). \tag{3.10}
\end{align}

In the above, (3.5) follows from inserting the extra summations (each introduced factor sums to one after normalizing, since \(\sum_i \psi(i) = \sum_k \psi'(k) = \|\psi'\|_1\)), (3.6) from taking the absolute value inside the sums (the triangle inequality), (3.7) follows from the identity \(|a - b| = a + b - 2\min(a,b)\) for nonnegative \(a, b\), (3.8) from the definition of \(\delta(P)\), and finally (3.9) follows from the \(\ell_1\) norms of \(\psi, \psi'\).

As such, the map \(\pi \mapsto \pi P\) is a contraction if \(\delta(P) > 0\). In essence, one proves that the sequence \(\{\pi_0 P^m\}\) is Cauchy, and since every Cauchy sequence in a Banach space has a limit, this sequence has a limit as well. The limit is the invariant distribution. ⊓⊔
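The contraction bound can be checked numerically. The sketch below is illustrative: it assumes the Dobrushin coefficient is \(\delta(P) = \min_{i,k} \sum_j \min(P(i,j), P(k,j))\) (the row-overlap quantity invoked in (3.8)), and the transition matrix and the two starting distributions are arbitrary choices.

```python
# Numerical check of the Dobrushin contraction bound
#   ||pi P - pi' P||_1 <= (1 - delta(P)) ||pi - pi'||_1,
# assuming delta(P) = min_{i,k} sum_j min(P(i,j), P(k,j)).

def dobrushin_delta(P):
    """Minimal overlap between any two rows of the stochastic matrix P."""
    n = len(P)
    return min(
        sum(min(P[i][j], P[k][j]) for j in range(n))
        for i in range(n) for k in range(n)
    )

def apply_P(pi, P):
    """One step of the distribution dynamics: pi -> pi P."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

def l1(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

# An illustrative 3-state transition matrix (rows sum to one).
P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]

pi1 = [1.0, 0.0, 0.0]
pi2 = [0.0, 0.0, 1.0]

delta = dobrushin_delta(P)
lhs = l1(apply_P(pi1, P), apply_P(pi2, P))
rhs = (1 - delta) * l1(pi1, pi2)
assert lhs <= rhs + 1e-12  # the contraction inequality holds
```

For this particular matrix \(\delta(P) = 0.7\), so a single application of \(P\) shrinks the \(\ell_1\) distance between any two distributions by at least a factor of \(0.3\).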

Dobrushin’s ergodic theorem thus also guarantees that the limit is unique.


It should also be noted that Dobrushin’s theorem tells us how fast the sequence of probability distributions \(\{\pi_0 P^n\}\) converges to the invariant distribution for an arbitrary \(\pi_0\): by (3.10), the \(\ell_1\) distance shrinks by at least a factor of \(1 - \delta(P)\) per step.
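This geometric rate can be observed directly: the distance \(\|\pi_0 P^n - \mu\|_1\) stays below \((1 - \delta(P))^n \|\pi_0 - \mu\|_1\). A minimal sketch, in which the transition matrix is an arbitrary illustration and \(\mu\) is approximated by power iteration:

```python
# Geometric convergence of pi0 P^n to the invariant distribution mu:
# (3.10) gives ||pi0 P^n - mu||_1 <= (1 - delta(P))^n ||pi0 - mu||_1.

def step(pi, P):
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]
n = len(P)

# Dobrushin coefficient: minimal overlap between any two rows of P.
delta = min(sum(min(P[i][j], P[k][j]) for j in range(n))
            for i in range(n) for k in range(n))

# Approximate mu by iterating P from the uniform distribution.
mu = [1.0 / n] * n
for _ in range(200):
    mu = step(mu, P)

pi = [1.0, 0.0, 0.0]
bound = sum(abs(a - b) for a, b in zip(pi, mu))  # ||pi0 - mu||_1
for m in range(12):
    pi = step(pi, P)
    bound *= (1 - delta)
    dist = sum(abs(a - b) for a, b in zip(pi, mu))
    assert dist <= bound + 1e-9  # the geometric bound from (3.10)
```

After only a dozen iterations the distance is already far below \(10^{-4}\), consistent with the \((1-\delta(P))^n\) envelope.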

Ergodic Theorem for Countable State Space Chains<br />

For a Markov chain which has a unique invariant distribution \(\mu(i)\), we have that, almost surely,

\[
\lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} f(x_t) = \sum_i f(i)\mu(i)
\qquad \forall \text{ bounded } f: \mathcal{X} \to \mathbb{R}.
\]

This is called the ergodic theorem, due to Birkhoff, and it is a very powerful result: in essence, this property is what makes the connection between stochastic control and Markov chains over a long time horizon. In particular, for a stationary control policy leading to a unique invariant distribution with bounded costs, it follows that, almost surely,

\[
\lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} c(x_t, u_t) = \sum_{x,u} c(x,u)\,\mu(x,u),
\]

for every bounded real-valued \(c\). The ergodic theorem is what makes a dynamic optimization problem equivalent to a static optimization problem under mild technical conditions. This sets up the core ideas in the convex analytic approach and the linear programming approach that will be discussed later.
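The ergodic theorem can also be observed in simulation: the time average of \(f\) along a single long trajectory approaches the spatial average \(\sum_i f(i)\mu(i)\). A small sketch, in which the chain, the test function, and the run length are all illustrative; the agreement is approximate, with Monte Carlo error of order \(1/\sqrt{T}\):

```python
# Time average along one trajectory vs. spatial average under mu
# (Birkhoff's ergodic theorem for a finite-state chain).
import random

random.seed(0)

P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]

def step_state(x):
    """Sample the next state given the current state x."""
    return random.choices(range(3), weights=P[x])[0]

# Approximate the invariant distribution mu by power iteration.
mu = [1.0 / 3] * 3
for _ in range(200):
    mu = [sum(mu[i] * P[i][j] for i in range(3)) for j in range(3)]

f = [1.0, 4.0, 9.0]                         # an arbitrary test function f(i)
space_avg = sum(f[i] * mu[i] for i in range(3))

T = 200_000
x, total = 0, 0.0
for _ in range(T):
    x = step_state(x)
    total += f[x]
time_avg = total / T

assert abs(time_avg - space_avg) < 0.05     # time average ~ spatial average
```

Increasing \(T\) tightens the match; this is precisely the dynamic-to-static equivalence exploited by the convex analytic and linear programming approaches.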
