07.08.2013 Views

beamer - Vrije Universiteit Amsterdam


Stochastic Operations Research<br />

Lecture 4: Discrete-time Markov Chains (Part II)<br />

(Chapter 3)<br />

A.A.N. Ridder<br />

Department EOR<br />

Vrije Universiteit Amsterdam<br />

Homepage: http://personal.vu.nl/a.a.n.ridder/sor/default.htm<br />

21 November 2012<br />

© Ad Ridder (VU) SOR – Fall 2012


Topics<br />

1. §3.3 Equilibrium Probabilities<br />

2. §3.3.3 Markov Reward Theorem<br />

3. §3.4.1 Computation by an Iterative Method<br />

4. §3.5 Theoretical Considerations<br />



Example<br />

\[
P = \begin{pmatrix}
0.6 & 0.4 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.3 & 0.2 & 0 & 0.5 & 0 & 0 & 0 \\
0 & 0.1 & 0.1 & 0.4 & 0 & 0.2 & 0.2 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0.5 \\
0 & 0 & 0 & 0 & 0 & 0.3 & 0.7 & 0 & 0
\end{pmatrix}
\]

Transient states $T = \{1, 2, 3\}$; irreducible recurrent set $R_1 = \{4, 5\}$ with period 2; irreducible aperiodic recurrent set $R_2 = \{6, 7, 8, 9\}$ (see the state-transition diagram on the next slide).
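The classification stated for this example can be checked mechanically. The following is a minimal sketch, not part of the lecture: it assumes the matrix above, finds communicating classes by graph reachability, calls a class recurrent when it is closed (valid for a finite chain), and estimates the period as the gcd of the return times found within a fixed horizon. The helper names `reachable_from` and `period` are hypothetical.

```python
from math import gcd

# Transition matrix of the 9-state example (slide states 1..9 are indices 0..8).
P = [
    [0.6, 0.4, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0.3, 0.2, 0, 0.5, 0, 0, 0],
    [0, 0.1, 0.1, 0.4, 0, 0.2, 0.2, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0.5, 0.5],
    [0, 0, 0, 0, 0, 0.3, 0.7, 0, 0],
]
n = len(P)

def reachable_from(i):
    """States reachable from i in zero or more steps (depth-first search)."""
    seen, stack = {i}, [i]
    while stack:
        u = stack.pop()
        for v in range(n):
            if P[u][v] > 0 and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

reach = [reachable_from(i) for i in range(n)]

# Communicating classes: i and j share a class if each can reach the other.
classes = []
for i in range(n):
    cls = frozenset(j for j in reach[i] if i in reach[j])
    if cls not in classes:
        classes.append(cls)

def period(i, horizon=3 * n):
    """gcd of all m <= horizon for which a return to i in m steps is possible."""
    d, frontier = 0, {i}
    for m in range(1, horizon + 1):
        frontier = {v for u in frontier for v in range(n) if P[u][v] > 0}
        if i in frontier:
            d = gcd(d, m)
    return d

transient_states, recurrent_classes = [], []
for cls in classes:
    labels = sorted(s + 1 for s in cls)        # back to slide numbering
    if all(reach[i] <= cls for i in cls):      # closed class: recurrent (finite chain)
        recurrent_classes.append(labels)
        print(labels, "recurrent, period", period(min(cls)))
    else:
        transient_states.extend(labels)
print("transient states:", sorted(transient_states))
```

Note that the transient set $T$ is a union of two communicating classes, $\{1\}$ and $\{2, 3\}$; the sketch merges them only when reporting.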


State-transition Diagram of Example<br />

[Figure: state-transition diagram of the example chain on states 1 to 9; each arc is labelled with the corresponding transition probability from P.]


Recap<br />

▶ State $j$ transient

\begin{align*}
&\Leftrightarrow\ f_{jj} = \mathbb{P}(\text{ever returning to } j \mid X_0 = j) < 1 \\
&\Leftrightarrow\ \sum_{n=1}^{\infty} p^{(n)}_{jj} = \mathbb{E}[\text{number of visits to } j \mid X_0 = j] < \infty \\
&\Rightarrow\ \sum_{n=1}^{\infty} p^{(n)}_{ij} = \mathbb{E}[\text{number of visits to } j \mid X_0 = i] < \infty \quad (\text{for all } i) \\
&\Rightarrow\ \lim_{n\to\infty} p^{(n)}_{ij} = 0 \quad (\text{for all } i)
\end{align*}

▶ State $j$ recurrent

\begin{align*}
&\Leftrightarrow\ f_{jj} = \mathbb{P}(\text{ever returning to } j \mid X_0 = j) = 1 \\
&\Leftrightarrow\ \sum_{n=1}^{\infty} p^{(n)}_{jj} = \mathbb{E}[\text{number of visits to } j \mid X_0 = j] = \infty
\end{align*}


Recurrent States<br />

Theorem 3.5.3(a)

Suppose that $R \subset I$ is an irreducible recurrent set of states; then $f_{ij} = 1$ for all $i, j \in R$.

Proof:

1. According to Lemma 3.5.2, $i$ and $j$ communicate; i.e., $\{m : p^{(m)}_{ji} > 0\} \neq \emptyset$;

2. Let $r = \min\{m : p^{(m)}_{ji} > 0\}$. Then

\[
0 = 1 - f_{jj} = \mathbb{P}\Big(\bigcap_{n=1}^{\infty} \{X_n \neq j\} \,\Big|\, X_0 = j\Big)
\geq p^{(r)}_{ji}\, \mathbb{P}\Big(\bigcap_{n=1}^{\infty} \{X_n \neq j\} \,\Big|\, X_0 = i\Big) = p^{(r)}_{ji}\,(1 - f_{ij}).
\]

Since $p^{(r)}_{ji} > 0$, this forces $f_{ij} = 1$.


Proper First-Passage Times<br />

Corollary

$f_{ij} = 1$ if either

(i). $i, j \in R$, the same recurrent irreducible set;

(ii). $i \in T$ transient, $j \in R$ a recurrent irreducible set, and $f_{iR} = 1$.

Theorem 3.5.7(a)

Suppose that the unichain condition holds (see lecture 3) with $|T| < \infty$; then $f_{ij} = 1$ for all $i \in I$ and $j \in R$.

Proof for $i \in T$:

\begin{align*}
1 - f_{iR} &= \mathbb{P}(\text{chain never reaches } R \mid X_0 = i) \\
&= \mathbb{P}\Big(\bigcap_{k=1}^{\infty} \{X_k \in T\} \,\Big|\, X_0 = i\Big)
 = \lim_{n\to\infty} \mathbb{P}\Big(\bigcap_{k=1}^{n} \{X_k \in T\} \,\Big|\, X_0 = i\Big) \\
&= \lim_{n\to\infty} \mathbb{P}(X_n \in T \mid X_0 = i)
 = \lim_{n\to\infty} \sum_{j \in T} p^{(n)}_{ij}
 \overset{\text{finite sum}}{=} \sum_{j \in T} \lim_{n\to\infty} p^{(n)}_{ij} = 0.
\end{align*}


Compute Powers of P<br />

Example of slide 3:

\[
P^{128} = \begin{pmatrix}
0 & 0 & 0 & 0.184 & 0.161 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.126 & 0.219 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.064 & 0.419 & 0.039 & 0.091 & 0.259 & 0.129 \\
0 & 0 & 0 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 0.000 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250
\end{pmatrix}
\]


Compute Powers of P<br />

\[
P^{129} = \begin{pmatrix}
0 & 0 & 0 & 0.161 & 0.184 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.219 & 0.126 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.419 & 0.064 & 0.039 & 0.091 & 0.259 & 0.129 \\
0 & 0 & 0 & 0.000 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250
\end{pmatrix}
\]


Compute Powers of P<br />

\[
P^{130} = \begin{pmatrix}
0 & 0 & 0 & 0.184 & 0.161 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.126 & 0.219 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.064 & 0.419 & 0.039 & 0.091 & 0.259 & 0.129 \\
0 & 0 & 0 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 0.000 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250
\end{pmatrix}
\]
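The powers shown on these slides can be reproduced in a few lines. This is a sketch under the assumption that the example matrix of slide 3 is as listed there; `mat_mul` and `mat_pow` are hypothetical helper names, and plain lists are used so that no external library is needed.

```python
# Transition matrix of the 9-state example (slide states 1..9 are indices 0..8).
P = [
    [0.6, 0.4, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0.3, 0.2, 0, 0.5, 0, 0, 0],
    [0, 0.1, 0.1, 0.4, 0, 0.2, 0.2, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0.5, 0.5],
    [0, 0, 0, 0, 0, 0.3, 0.7, 0, 0],
]

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(A, m):
    """A**m for an integer m >= 0, by repeated squaring."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    base = A
    while m > 0:
        if m & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        m >>= 1
    return result

P128 = mat_pow(P, 128)
P129 = mat_mul(P128, P)

# Row of state 1: columns 4 and 5 swap between even and odd powers (period 2),
# while the columns of the aperiodic class {6, 7, 8, 9} have settled.
print([round(v, 3) for v in P128[0]])
print([round(v, 3) for v in P129[0]])
```

Comparing the two printed rows shows why $p^{(n)}_{ij}$ itself has no limit for $j \in R_1$, even though its time average does.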


Observations<br />

We might conclude

1. $\lim_{n\to\infty} p^{(n)}_{ij} = 0$ for transient $j$ (and all $i$);

2. $\lim_{n\to\infty} p^{(n)}_{ij} = \pi_j$ for $i, j$ in the same aperiodic irreducible recurrent subset $R$;

3. $\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p^{(k)}_{ij} = \pi_j$ for $i, j$ in the same irreducible recurrent subset $R$;

4. Where $\sum_{j \in R} \pi_j = 1$;

5. $\lim_{n\to\infty} p^{(n)}_{ij} = f_{iR}\,\pi_j$ for $j \in R$ an aperiodic irreducible recurrent subset, and $i \notin R$ (for instance transient);

6. $\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p^{(k)}_{ij} = f_{iR}\,\pi_j$ for $j \in R$ an irreducible recurrent subset, and $i \notin R$ (for instance transient);

Property 1 is known from the previous lecture; proofs (partly) of 2-6 follow on the next slides.
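Observation 5 can be checked numerically on the example. The sketch below is an illustration, not the lecture's method: it computes the absorption probabilities $f_{iR_2}$ for the transient states by iterating the first-step equations $f_i = \sum_{j \in R_2} p_{ij} + \sum_{k \in T} p_{ik} f_k$, and multiplies them by the limiting probabilities on $R_2$ read off from the rows of $P^{128}$. The variable names are hypothetical.

```python
# Transition matrix of the 9-state example (slide states 1..9 are indices 0..8).
P = [
    [0.6, 0.4, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0.3, 0.2, 0, 0.5, 0, 0, 0],
    [0, 0.1, 0.1, 0.4, 0, 0.2, 0.2, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0.5, 0.5],
    [0, 0, 0, 0, 0, 0.3, 0.7, 0, 0],
]
T = [0, 1, 2]          # transient states 1, 2, 3
R2 = [5, 6, 7, 8]      # aperiodic recurrent class {6, 7, 8, 9}

# One-step probability of jumping from a transient state straight into R2.
direct = [sum(P[i][j] for j in R2) for i in T]

# Fixed-point iteration of f = direct + Q f, where Q is P restricted to T.
# It converges because the chain leaves T with probability 1.
f = [0.0] * len(T)
for _ in range(500):
    f = [direct[a] + sum(P[i][k] * f[b] for b, k in enumerate(T))
         for a, i in enumerate(T)]

# Limiting probabilities on R2, taken from rows 6 to 9 of P^128 on the slides.
pi_R2 = [0.075, 0.175, 0.5, 0.25]

# Observation 5: lim p_ij^(n) = f_iR * pi_j for i transient and j in R2.
limits = [[f_i * pj for pj in pi_R2] for f_i in f]
for i, row in zip(T, limits):
    print("from state", i + 1, [round(v, 3) for v in row])
```

The printed rows can be compared with the last four columns of the first three rows of $P^{128}$.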



Empirical State Averages<br />

Recall the mean return time $\mu_{jj} = \mathbb{E}[\tau_{jj}] = \mathbb{E}[\inf\{n \geq 1 : X_n = j\} \mid X_0 = j]$.

Equation (3.3.4)

For recurrent states $j \in I$

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbf{1}\{X_k = j \mid X_0 = j\} = \frac{1}{\mu_{jj}} \quad \text{(with probability 1).}
\]

Proof: let $0 = S_0 < S_1 < S_2 < \cdots$ be the consecutive returns to $j$. Due to the Markov property, the interreturn periods $T_n = S_n - S_{n-1}$ are IID as $\tau_{jj}$. Due to recurrence, $\tau_{jj}$ is a proper random variable. Apply the SLLN (or Lemma 2.2.2):

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbf{1}\{X_k = j \mid X_0 = j\}
= \lim_{n\to\infty} \frac{n}{S_n}
= \lim_{n\to\infty} \frac{n}{\sum_{k=1}^{n} T_k}
= \frac{1}{\mu_{jj}}.
\]

This also holds for null-recurrent states, for which $\mu_{jj} = \infty$!
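Equation (3.3.4) lends itself to a quick simulation. The following sketch is an illustration only: it assumes the recurrent class $R_2 = \{6, 7, 8, 9\}$ of the earlier example, for which a return to state 8 takes 1 step (probability 0.5) or 3 steps (probability 0.5), so that $\mu_{88} = 2$. The function name `simulate_returns`, the seed, and the run length are arbitrary choices.

```python
import random

# Transition probabilities on the recurrent class R2 = {6, 7, 8, 9} of the example.
P = {
    6: {8: 1.0},
    7: {8: 1.0},
    8: {8: 0.5, 9: 0.5},
    9: {6: 0.3, 7: 0.7},
}

def simulate_returns(j, n_steps, seed=0):
    """Run the chain n_steps from X0 = j; return (visit fraction, mean return time)."""
    rng = random.Random(seed)
    state, visits, last_return = j, 0, 0
    for k in range(1, n_steps + 1):
        targets = list(P[state])
        weights = [P[state][t] for t in targets]
        state = rng.choices(targets, weights=weights)[0]
        if state == j:
            visits += 1
            last_return = k      # S_visits, the time of the latest return to j
    return visits / n_steps, last_return / visits

freq, mean_return = simulate_returns(8, 200_000)
print(f"fraction of time in state 8: {freq:.4f}")
print(f"estimated mean return time : {mean_return:.4f}")
```

The visit fraction should be close to $1/\mu_{88} = 0.5$, in line with the value 0.500 in the limiting rows of $P^{128}$.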



Mean Return Times and Probabilities<br />

Definition

For all states $j \in I$:

\[
\pi_j = \frac{1}{\mu_{jj}}
\]

Note that

▶ For transient and null-recurrent states $j$: $\pi_j = 0$;

▶ For positive recurrent states $j$:

\[
\mu_{jj} = 1 + \sum_{i \neq j} p_{ji}\, \mathbb{E}[\inf\{n \geq 1 : X_n = j\} \mid X_0 = i] \geq 1 \ \Rightarrow\ 0 < \pi_j \leq 1
\]

▶ We will see that for irreducible positive recurrent sets $R$, $(\pi_j)_{j \in R}$ forms a probability mass function; i.e., $\sum_{j \in R} \pi_j = 1$.


Probabilistic Averages I<br />

Theorem 3.3.1 (first part)

For all states $j \in I$

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p^{(k)}_{jj} = \pi_j
\]

Proof for recurrent $j$: apply bounded convergence (see book p. 439)

\begin{align*}
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p^{(k)}_{jj}
&= \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbb{P}(X_k = j \mid X_0 = j) \\
&= \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbb{E}\big[\mathbf{1}\{X_k = j \mid X_0 = j\}\big]
 = \lim_{n\to\infty} \mathbb{E}\Big[\frac{1}{n} \sum_{k=1}^{n} \mathbf{1}\{X_k = j \mid X_0 = j\}\Big] \\
&\overset{!}{=} \mathbb{E}\Big[\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbf{1}\{X_k = j \mid X_0 = j\}\Big]
 = \frac{1}{\mu_{jj}} = \pi_j,
\end{align*}

where the last step uses Equation (3.3.4).

Proof for transient $j$: $\lim_{n\to\infty} p^{(n)}_{jj} = 0 \ \Rightarrow\ \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p^{(k)}_{jj} = 0$.


Probabilistic Averages II<br />

Theorem 3.3.1 (second part)

For all states $i, j \in I$

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p^{(k)}_{ij} = f_{ij}\,\pi_j
\]

Proof: apply (3.2.12):

\begin{align*}
\frac{1}{n} \sum_{k=1}^{n} p^{(k)}_{ij}
&= \frac{1}{n} \sum_{k=1}^{n} \sum_{\ell=1}^{k} f^{(\ell)}_{ij}\, p^{(k-\ell)}_{jj}
 = \frac{1}{n} \sum_{\ell=1}^{n} f^{(\ell)}_{ij} \sum_{k=\ell}^{n} p^{(k-\ell)}_{jj} \\
&= \sum_{\ell=1}^{n} f^{(\ell)}_{ij} \Big(\frac{n-\ell}{n}\Big) \Big(\frac{1}{n-\ell} \sum_{k=0}^{n-\ell} p^{(k)}_{jj}\Big)
\end{align*}

Take $n \to \infty$.
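The difference between $p^{(n)}_{ij}$ and its time average is easy to see on the periodic class $R_1 = \{4, 5\}$ of the example. The sketch below is an illustration under the assumption that the example matrix of slide 3 is used; it propagates the row vector of state 1 and accumulates the running average. With $f_{1R_1} = 0.30/0.87$ and $\pi_4 = \pi_5 = 1/2$, the averages for columns 4 and 5 should approach $0.15/0.87 \approx 0.172$. The name `cesaro_row` is hypothetical.

```python
# Transition matrix of the 9-state example (slide states 1..9 are indices 0..8).
P = [
    [0.6, 0.4, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0.3, 0.2, 0, 0.5, 0, 0, 0],
    [0, 0.1, 0.1, 0.4, 0, 0.2, 0.2, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0.5, 0.5],
    [0, 0, 0, 0, 0, 0.3, 0.7, 0, 0],
]

def cesaro_row(P, i, n_terms):
    """Return ((1/n) * sum_{k=1..n} of row i of P^k, row i of P^n) for n = n_terms."""
    size = len(P)
    row = [float(k == i) for k in range(size)]       # row i of P^0
    total = [0.0] * size
    for _ in range(n_terms):
        # One step: row <- row * P.
        row = [sum(row[a] * P[a][b] for a in range(size)) for b in range(size)]
        total = [t + r for t, r in zip(total, row)]
    return [t / n_terms for t in total], row

avg, last = cesaro_row(P, 0, 20_000)
print("p^(n) for j = 4, 5 :", round(last[3], 3), round(last[4], 3))
print("average for j = 4, 5:", round(avg[3], 3), round(avg[4], 3))
```

The last power still shows the even/odd imbalance between states 4 and 5, while the averages agree with each other.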



Probabilistic Averages III<br />

Corollary

Suppose that $R$ is an irreducible set of recurrent states. For all states $i, j \in R$

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p^{(k)}_{ij} = \pi_j
\]

(i.e., independent of the initial state provided that the initial state is in $R$).

Proof: $f_{ij} = 1$ for all $i, j \in R$ (see slide 7). Apply the previous slide.


Probabilistic Averages IV<br />

Corollary

Suppose that the Markov chain satisfies the unichain condition with a finite set of transient states ($|T| < \infty$). Then for all states $i, j \in I$

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p^{(k)}_{ij} = \pi_j
\]

(i.e., independent of the initial state).

Proof: $\pi_j = 0$ for transient $j$; and $f_{ij} = 1$ for recurrent $j$ (see slide 7). Then apply slide 15.


Finite Recurrent Sets<br />

Corollary

Suppose that the Markov chain satisfies the unichain condition with a finite set of transient states ($|T| < \infty$) and with a finite irreducible set of recurrent states ($|R| < \infty$). Then

\[
\sum_{j \in I} \pi_j = \sum_{j \in R} \pi_j = 1
\]

Proof: for any $i \in I$ and all $n$ we have $\sum_{j \in I} p^{(n)}_{ij} = 1$. Thus

\begin{align*}
\sum_{j \in I} \pi_j
&= \sum_{j \in I} \Big(\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p^{(k)}_{ij}\Big)
 \overset{\text{finite set}}{=} \lim_{n\to\infty} \frac{1}{n} \sum_{j \in I} \sum_{k=1}^{n} p^{(k)}_{ij} \\
&= \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \sum_{j \in I} p^{(k)}_{ij} = 1
\end{align*}

Use $\pi_j = 0$ for transient $j$.

Note: all states in $R$ are positive recurrent (see also slide 20); and $(\pi_j)_{j \in R}$ forms a probability mass function on $R$.


Infinite Recurrent Sets<br />

In the situation of the previous slide but with $|R| = \infty$. Without proof:

▶ When $R$ is null-recurrent: $\pi_j = 0$ for all $j \in I$;

▶ When $R$ is positive recurrent: $\sum_{j \in I} \pi_j = \sum_{j \in R} \pi_j = 1$; i.e., $(\pi_j)_{j \in R}$ forms a probability mass function.


Finite Irreducible Sets<br />

Corollary

Finite irreducible sets consist of positive recurrent states only.

Proof: Let $C$ be a finite irreducible set. That is,

▶ For any $i \in C$ it holds that $\sum_{j \in C} p_{ij} = 1$; thus, also for all $n$, $\sum_{j \in C} p^{(n)}_{ij} = 1$;

▶ All states in $C$ communicate;

▶ All states have the same classification;

Suppose the states are transient or null-recurrent; then all $\pi_j = 0$. For $i \in C$ this gives a contradiction:

\begin{align*}
0 = \sum_{j \in C} \pi_j
&= \sum_{j \in C} \Big(\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p^{(k)}_{ij}\Big)
 \overset{\text{finite set}}{=} \lim_{n\to\infty} \frac{1}{n} \sum_{j \in C} \sum_{k=1}^{n} p^{(k)}_{ij} \\
&= \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \sum_{j \in C} p^{(k)}_{ij} = 1
\end{align*}


Infinite Transient or Null-recurrent Sets<br />

▶ Infinite irreducible sets can be transient or null-recurrent.

▶ Example of a Random Walk.

▶ $I = \{0, 1, \ldots\}$, $p \in (0, 1)$, $q = 1 - p$.

\[
P = \begin{pmatrix}
0 & 1 & 0 & 0 & \cdots \\
q & 0 & p & 0 & \cdots \\
0 & q & 0 & p & \cdots \\
0 & 0 & q & 0 & \ddots \\
\vdots & \vdots & \vdots & \ddots & \ddots
\end{pmatrix}
\]

▶ If $p > q$ the chain is transient; if $q = p = 0.5$ the chain is null-recurrent; if $p < q$ the chain is positive recurrent.
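The positive recurrent case $p < q$ can be made concrete by solving the balance equations by hand. The derivation below is a sketch that does not appear on the slide; it assumes the reflecting random walk with the matrix $P$ given above and looks for a summable solution of $\pi = \pi P$.

```latex
\begin{align*}
&\text{Balance equations:}\quad
\pi_0 = q\,\pi_1, \qquad
\pi_1 = \pi_0 + q\,\pi_2, \qquad
\pi_j = p\,\pi_{j-1} + q\,\pi_{j+1} \quad (j \geq 2). \\[4pt]
&\text{Solving recursively:}\quad
\pi_1 = \frac{\pi_0}{q}, \qquad
\pi_{j} = \frac{p}{q}\,\pi_{j-1} \quad (j \geq 2)
\quad\Longrightarrow\quad
\pi_j = \frac{\pi_0}{q} \Big(\frac{p}{q}\Big)^{j-1} \quad (j \geq 1). \\[4pt]
&\text{Normalisation (geometric series, finite only if } p < q\text{):}\quad
1 = \sum_{j \geq 0} \pi_j
  = \pi_0 \Big(1 + \frac{1}{q} \cdot \frac{1}{1 - p/q}\Big)
  = \pi_0 \Big(1 + \frac{1}{q - p}\Big)
  = \pi_0\, \frac{2q}{q - p}. \\[4pt]
&\text{Hence}\quad
\pi_0 = \frac{q - p}{2q}, \qquad
\pi_j = \frac{q - p}{2q^2} \Big(\frac{p}{q}\Big)^{j-1} \quad (j \geq 1).
\end{align*}
```

For example, $p = 0.25$ and $q = 0.75$ give $\pi_0 = 1/3$ and $\pi_1 = 4/9$. For $p \geq q$ the series diverges, so no equilibrium distribution exists, which matches the transient and null-recurrent cases.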



Equilibrium<br />

Suppose that the Markov chain satisfies the unichain condition with a finite set of transient states ($|T| < \infty$). For all states $j \in I$ (and arbitrary $r \in I$):

\begin{align*}
(\pi P)_j = \sum_{i \in I} \pi_i\, p_{ij}
&= \sum_{i \in I} \Big(\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p^{(k)}_{ri}\Big) p_{ij} \\
&\overset{!}{=} \lim_{n\to\infty} \sum_{i \in I} \frac{1}{n} \sum_{k=1}^{n} p^{(k)}_{ri}\, p_{ij}
 = \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \sum_{i \in I} p^{(k)}_{ri}\, p_{ij} \\
&= \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p^{(k+1)}_{rj}
 = \lim_{n\to\infty} \frac{1}{n} \sum_{k=2}^{n+1} p^{(k)}_{rj} \\
&= \lim_{n\to\infty} \frac{1}{n} \Big(\sum_{k=1}^{n+1} p^{(k)}_{rj} - p_{rj}\Big) \\
&= \lim_{n\to\infty} \Big(\frac{n+1}{n} \cdot \frac{1}{n+1} \sum_{k=1}^{n+1} p^{(k)}_{rj} - \frac{1}{n}\, p_{rj}\Big) = \pi_j
\end{align*}


Equilibrium Distribution<br />

Definition 3.3.2

A probability distribution $(\pi_j)_{j \in I}$ is an equilibrium distribution for the Markov chain if

\[
\pi_j = \sum_{i \in I} \pi_i\, p_{ij} \quad (j \in I)
\]

Corollary

Let $\pi$ be an equilibrium distribution for the Markov chain $\{X_n, n = 0, 1, \ldots\}$ and suppose that the chain starts in equilibrium; then it remains in equilibrium; i.e.,

\[
\mathbb{P}(X_0 = j) = \pi_j \ \ \forall j \in I \ \Rightarrow\ \mathbb{P}(X_n = j) = \pi_j \ \ \forall j \in I,\ \forall n = 1, 2, \ldots
\]


Existence and Uniqueness of Equilibrium Distribution<br />

Theorem 3.3.2 & Theorem 3.5.9

Assume the unichain condition with a finite transient set and a positive recurrent set (cf. Assumption 3.3.1); then the probabilistic long-run averages $(\pi_j)$ (see slide 15) form the unique equilibrium distribution.

Proof for finite recurrent sets:

▶ Existence follows from slides 22 and 18;

▶ Uniqueness: suppose that $(x_j)_j$ satisfies $x_j = \sum_{i \in I} x_i\, p_{ij}$. See page 128 (book) for concluding that $x_j = c\,\pi_j$; i.e., $\pi$ is the only distribution.

Infinite recurrent sets: more involved.
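The role of the unichain assumption in the uniqueness statement can be illustrated with the 9-state example from the start of the lecture, which has two recurrent classes and therefore falls outside the theorem. The sketch below is an illustration, not part of the lecture: it checks the defining equations $\pi = \pi P$ for one distribution on each class and for a mixture of the two, and also checks that a chain started in equilibrium stays there. The names `times_P` and `is_equilibrium` and the mixing weights are arbitrary choices.

```python
# Transition matrix of the 9-state example (two recurrent classes, so not unichain).
P = [
    [0.6, 0.4, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0.3, 0.2, 0, 0.5, 0, 0, 0],
    [0, 0.1, 0.1, 0.4, 0, 0.2, 0.2, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0.5, 0.5],
    [0, 0, 0, 0, 0, 0.3, 0.7, 0, 0],
]

def times_P(dist):
    """Distribution after one step: the row vector dist multiplied by P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def is_equilibrium(dist, tol=1e-12):
    """True if dist sums to 1 and satisfies dist = dist * P up to tol."""
    stepped = times_P(dist)
    return (abs(sum(dist) - 1.0) < tol
            and all(abs(a - b) < tol for a, b in zip(dist, stepped)))

pi_R1 = [0, 0, 0, 0.5, 0.5, 0, 0, 0, 0]             # supported on {4, 5}
pi_R2 = [0, 0, 0, 0, 0, 0.075, 0.175, 0.5, 0.25]    # supported on {6, 7, 8, 9}
mixture = [0.3 * a + 0.7 * b for a, b in zip(pi_R1, pi_R2)]
uniform = [1 / 9] * 9

for name, dist in [("pi_R1", pi_R1), ("pi_R2", pi_R2),
                   ("mixture", mixture), ("uniform", uniform)]:
    print(name, is_equilibrium(dist))

# Starting in equilibrium, the distribution of X_n stays the same for every n.
dist = mixture
for _ in range(50):
    dist = times_P(dist)
stays_put = all(abs(a - b) < 1e-9 for a, b in zip(dist, mixture))
print("mixture unchanged after 50 steps:", stays_put)
```

Every mixture of the two class distributions is an equilibrium distribution here, so uniqueness fails once there is more than one recurrent class.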



Limiting Probabilities<br />

Equation (3.5.11)

A. For transient states $j$ and any initial state $i \in I$

\[
\lim_{n\to\infty} p^{(n)}_{ij} = 0
\]

B. For recurrent aperiodic states $j$ and any initial state $i \in I$

\[
\lim_{n\to\infty} p^{(n)}_{ij} = f_{ij}\,\pi_j
\]

Proof: see lecture 3 for A. Part B is more advanced and outside the scope of this course.


Unichain Case<br />

Corollary

Suppose that the Markov chain satisfies the unichain condition with a finite set of transient states ($|T| < \infty$) and with an irreducible set $R$ of aperiodic recurrent states. Then for all states $i, j \in I$

\[
\lim_{n\to\infty} p^{(n)}_{ij} = \pi_j
\]

(i.e., independent of the initial state).

Proof: $\pi_j = 0$ for $j \in T$; and $f_{ij} = 1$ for $j \in R$ (see slide 7). Then apply the previous slide.

Note: in case of positive recurrence $\pi$ is a probability distribution ($\sum_j \pi_j = 1$), see slide 24.


Summary<br />

Suppose Assumption 3.3.1 (equivalently, unichain with finite transient set $T$ and positive recurrent set $R$). Suppose aperiodicity of the recurrent states. Then

1. There is a unique equilibrium distribution $\pi = (\pi_j)_{j \in I}$:

\[
\pi = \pi P
\]

2. $\pi_j = 0$ for $j \in T$ and $\pi_j > 0$ for $j \in R$;

3. $\pi$ is the limiting distribution:

\[
\lim_{n\to\infty} p^{(n)}_{ij} = \pi_j
\]

4. $\pi$ is the long-run average distribution:

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p^{(k)}_{ij} = \pi_j
\]


§3.3.3 Markov Chains with Rewards<br />

▶ Let $f : I \to \mathbb{R}$ be a reward or cost function;

▶ $\sum_{k=1}^{n} f(X_k)$ is the total reward up to time $n$;

▶ $\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} f(X_k)$ is the long-run average reward per unit of time;

▶ We wish to have an ergodic (or Markov-reward) property

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} f(X_k) = \sum_{j \in I} \pi_j\, f(j) \quad \text{(w.p. 1)}
\]


Finite Unichain Case<br />

Ergodic Theorem Finite Case

Suppose that the Markov chain satisfies the unichain condition with a finite set of transient states ($|T| < \infty$) and with a finite irreducible set of recurrent states ($|R| < \infty$). Then

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} f(X_k) = \sum_{j \in I} \pi_j\, f(j) \quad \text{(w.p. 1)}
\]

Proof: let $r$ be an arbitrary initial state

\[
\frac{1}{n} \sum_{k=1}^{n} f(X_k)
= \frac{1}{n} \sum_{k=1}^{n} \sum_{j \in I} \mathbf{1}\{X_k = j \mid X_0 = r\}\, f(j)
= \sum_{j \in I} \Big(\frac{1}{n} \sum_{k=1}^{n} \mathbf{1}\{X_k = j \mid X_0 = r\}\Big) f(j).
\]

Take $n \to \infty$; interchanging the limit and the finite sum is allowed; apply the empirical state average property (slide 12).
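A short simulation makes the ergodic property tangible. The sketch below is an illustration under stated assumptions: it uses the irreducible class $R_2 = \{6, 7, 8, 9\}$ of the earlier example as a chain on its own, with equilibrium probabilities $(0.075, 0.175, 0.5, 0.25)$ read off from $P^{128}$, and a hypothetical reward function $f(j) = j$. The seed and run length are arbitrary.

```python
import random

# Transition probabilities on the recurrent class R2 = {6, 7, 8, 9} of the example.
P = {
    6: {8: 1.0},
    7: {8: 1.0},
    8: {8: 0.5, 9: 0.5},
    9: {6: 0.3, 7: 0.7},
}
pi = {6: 0.075, 7: 0.175, 8: 0.5, 9: 0.25}     # equilibrium distribution on R2
reward = {6: 6.0, 7: 7.0, 8: 8.0, 9: 9.0}      # hypothetical reward f(j) = j

def average_reward(start, n_steps, seed=1):
    """Time average (1/n) * sum_{k=1..n} f(X_k) along one simulated path."""
    rng = random.Random(seed)
    state, total = start, 0.0
    for _ in range(n_steps):
        targets = list(P[state])
        weights = [P[state][t] for t in targets]
        state = rng.choices(targets, weights=weights)[0]
        total += reward[state]
    return total / n_steps

expected = sum(pi[j] * reward[j] for j in pi)  # right-hand side of the theorem
simulated = average_reward(6, 200_000)
print(f"sum_j pi_j f(j)      = {expected:.4f}")
print(f"simulated time average = {simulated:.4f}")
```

Changing the start state or the seed should move the simulated average only slightly, which is the content of the "with probability 1" statement.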



Full Case<br />

Ergodic Theorem 3.3.3 and 3.5.11

Assume

(i). Unichain with finite transient set;

(ii). $\sum_{j \in I} |f(j)|\,\pi_j < \infty$.

Then

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} f(X_k) = \sum_{j \in I} \pi_j\, f(j) \quad \text{(w.p. 1)}
\]

Proof: see book.


Expected Reward<br />

▶ See Remark 3.3.1 in book;

▶ The result of the previous slide also holds for the expected cost:

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbb{E}[f(X_k)] = \sum_{j \in I} \pi_j\, f(j)
\]

▶ Proof: apply dominated convergence.


§3.4.1 Computation of Equilibrium Probabilities<br />

▶ Given $P$ finite, irreducible, thus positive recurrent;

▶ Problem: compute the equilibrium distribution $\pi$;

▶ Linear system

\[
\begin{cases}
\sum_{i \in I} \pi_i\, p_{ij} = \pi_j & (j \in I) \\
\sum_{j \in I} \pi_j = 1
\end{cases}
\]

▶ In many applications we deal with large state spaces but sparse matrices $P$; it is then efficient to use an iterative method;

▶ Jacobi or Gauss-Seidel;


Rewrite<br />

▶ Rewrite to classic form $Ax = b$;

\[
\pi^T P = \pi^T \ \Leftrightarrow\ (I - P^T)\,\pi = 0
\]

▶ Delete a row (why?!) and add $\sum_{j \in I} \pi_j = 1$.

▶ Example

\[
P = \begin{pmatrix}
0.4 & 0.4 & 0.0 & 0.2 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
0.0 & 0.5 & 0.3 & 0.0 & 0.2 & 0.0 & 0.0 & 0.0 & 0.0 \\
0.5 & 0.0 & 0.4 & 0.0 & 0.0 & 0.1 & 0.0 & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0 & 0.4 & 0.3 & 0.0 & 0.3 & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0 & 0.0 & 0.2 & 0.0 & 0.5 & 0.3 & 0.0 \\
0.0 & 0.0 & 0.0 & 0.5 & 0.0 & 0.1 & 0.0 & 0.0 & 0.4 \\
0.6 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.2 & 0.2 & 0.0 \\
0.0 & 0.5 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.4 & 0.1 \\
0.0 & 0.0 & 0.7 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.3
\end{pmatrix}
\]


Example (cont’d)<br />

\[
A = \begin{pmatrix}
 0.6 &  0.0 & -0.5 &  0.0 &  0.0 &  0.0 & -0.6 &  0.0 &  0.0 \\
-0.4 &  0.5 &  0.0 &  0.0 &  0.0 &  0.0 &  0.0 & -0.5 &  0.0 \\
 0.0 & -0.3 &  0.6 &  0.0 &  0.0 &  0.0 &  0.0 &  0.0 & -0.7 \\
-0.2 &  0.0 &  0.0 &  0.6 &  0.0 & -0.5 &  0.0 &  0.0 &  0.0 \\
 0.0 & -0.2 &  0.0 & -0.3 &  0.8 &  0.0 &  0.0 &  0.0 &  0.0 \\
 0.0 &  0.0 & -0.1 &  0.0 &  0.0 &  0.9 &  0.0 &  0.0 &  0.0 \\
 0.0 &  0.0 &  0.0 & -0.3 & -0.5 &  0.0 &  0.8 &  0.0 &  0.0 \\
 0.0 &  0.0 &  0.0 &  0.0 & -0.3 &  0.0 & -0.2 &  0.6 &  0.0 \\
 1.0 &  1.0 &  1.0 &  1.0 &  1.0 &  1.0 &  1.0 &  1.0 &  1.0
\end{pmatrix},
\qquad
b = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}
\]

The first eight rows of $A$ are the first eight rows of $I - P^T$; the last row is the normalisation $\sum_{j \in I} \pi_j = 1$.
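For a chain of this size the system $Ax = b$ can simply be solved directly. The sketch below is an illustration, not the lecture's method: it builds $A$ and $b$ from the example matrix $P$ by replacing the last balance equation with the normalisation, and solves by Gaussian elimination with partial pivoting. One equation may be dropped because the balance equations sum to $0 = 0$, so any one of them is implied by the others. The function name `solve` is hypothetical.

```python
# Example transition matrix (9 states, irreducible).
P = [
    [0.4, 0.4, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.3, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.4, 0.0, 0.0, 0.1, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.4, 0.3, 0.0, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.2, 0.0, 0.5, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.5, 0.0, 0.1, 0.0, 0.0, 0.4],
    [0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 0.2, 0.0],
    [0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.4, 0.1],
    [0.0, 0.0, 0.7, 0.0, 0.0, 0.0, 0.0, 0.0, 0.3],
]
n = len(P)

# Rows 1..n-1 of I - P^T, then the normalisation row of ones.
A = [[(1.0 if i == j else 0.0) - P[j][i] for j in range(n)] for i in range(n - 1)]
A.append([1.0] * n)
b = [0.0] * (n - 1) + [1.0]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    size = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]      # augmented matrix
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, size):
            factor = M[r][col] / M[col][col]
            for c in range(col, size + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * size
    for i in reversed(range(size)):                     # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, size))
        x[i] = (M[i][size] - s) / M[i][i]
    return x

pi = solve(A, b)
# Residual of the full set of balance equations pi = pi P, including the dropped one.
residual = max(abs(sum(pi[i] * P[i][j] for i in range(n)) - pi[j]) for j in range(n))
print([round(v, 4) for v in pi])
print("max residual:", residual)
```

The residual check includes the balance equation that was removed from $A$, which confirms that it carried no extra information.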



Gauss-Seidel Method<br />

▶ Construct a sequence of vectors $x^{(0)}, x^{(1)}, \ldots$ by $x^{(0)}_i = 1/|I|$, and for $k = 0, 1, \ldots$:

\[
x^{(k+1)}_i = \Big(b_i - \sum_{j < i} a_{ij}\, x^{(k+1)}_j - \sum_{j > i} a_{ij}\, x^{(k)}_j\Big) \Big/ a_{ii}.
\]
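The update rule can be sketched in code as follows. Note the assumptions: this is a variant, not necessarily the scheme used in the lecture or the book. The sweep itself is the Gauss-Seidel formula above, but it is applied to the homogeneous system $(I - P^T)x = 0$ with $b = 0$, and the iterate is rescaled to sum to 1 after every sweep instead of replacing a row by the normalisation. The function name `gauss_seidel_sweep`, the tolerance, and the sweep limit are arbitrary choices.

```python
# Example transition matrix (9 states, irreducible).
P = [
    [0.4, 0.4, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.3, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.4, 0.0, 0.0, 0.1, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.4, 0.3, 0.0, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.2, 0.0, 0.5, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.5, 0.0, 0.1, 0.0, 0.0, 0.4],
    [0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 0.2, 0.0],
    [0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.4, 0.1],
    [0.0, 0.0, 0.7, 0.0, 0.0, 0.0, 0.0, 0.0, 0.3],
]
n = len(P)

def gauss_seidel_sweep(A, b, x):
    """One in-place sweep: x_i <- (b_i - sum_{j != i} a_ij x_j) / a_ii.

    Entries with j < i have already been updated in this sweep,
    entries with j > i still hold the values of the previous sweep.
    """
    for i in range(len(A)):
        s = sum(A[i][j] * x[j] for j in range(len(A)) if j != i)
        x[i] = (b[i] - s) / A[i][i]
    return x

# Homogeneous balance equations (I - P^T) x = 0; every diagonal entry 1 - p_ii is positive.
A = [[(1.0 if i == j else 0.0) - P[j][i] for j in range(n)] for i in range(n)]
b = [0.0] * n

x = [1.0 / n] * n                       # x^(0)_i = 1/|I|
sweeps_used = 0
for sweep in range(1, 1001):
    old = x[:]
    gauss_seidel_sweep(A, b, x)
    total = sum(x)
    x = [v / total for v in x]          # rescale so the iterate stays a distribution
    sweeps_used = sweep
    if max(abs(u - v) for u, v in zip(x, old)) < 1e-12:
        break

pi = x
residual = max(abs(sum(pi[i] * P[i][j] for i in range(n)) - pi[j]) for j in range(n))
print("sweeps:", sweeps_used)
print([round(v, 4) for v in pi])
```

Each sweep touches only the nonzero entries of a row in a sparse implementation, which is the reason iterative methods are attractive for large state spaces. Convergence depends on the chain and on the ordering of the states, so the residual of $\pi = \pi P$ should be checked rather than taken for granted.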


Exercises<br />

Chapter 3 (pp 134 - 138):<br />

3.10, 3.11, 3.12, 3.14, 3.16, 3.17<br />
