
beamer - Vrije Universiteit Amsterdam


Stochastic Operations Research

Lecture 4: Discrete-time Markov Chains (Part II)

(Chapter 3)

A.A.N. Ridder

Department EOR

Vrije Universiteit Amsterdam

Homepage: http://personal.vu.nl/a.a.n.ridder/sor/default.htm

21 November 2012

© Ad Ridder (VU) SOR – Fall 2012 1 / 36


Topics

1. §3.3 Equilibrium Probabilities

2. §3.3.3 Markov Reward Theorem

3. §3.4.1 Computation by an Iterative Method

4. §3.5 Theoretical Considerations

© Ad Ridder (VU) SOR – Fall 2012 2 / 36


Example

\[
P = \begin{pmatrix}
0.6 & 0.4 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.3 & 0.2 & 0 & 0.5 & 0 & 0 & 0 \\
0 & 0.1 & 0.1 & 0.4 & 0 & 0.2 & 0.2 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0.5 \\
0 & 0 & 0 & 0 & 0 & 0.3 & 0.7 & 0 & 0
\end{pmatrix}
\]

Transient states $T = \{1, 2, 3\}$; irreducible recurrent set $R_1 = \{4, 5\}$ with period 2; irreducible aperiodic recurrent set $R_2 = \{6, 7, 8, 9\}$ (see the picture on the next slide).

© Ad Ridder (VU) SOR – Fall 2012 3 / 36
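The classification stated on this slide can be checked mechanically. The sketch below is not part of the lecture; it assumes the standard fact that in a finite chain a state is recurrent exactly when every state reachable from it leads back to it. The helper names `reachable` and `classify` are hypothetical, and the period of each class is not computed.

```python
# Transition matrix of the example (states 1..9 stored at indices 0..8).
P = [
    [0.6, 0.4, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0.3, 0.2, 0, 0.5, 0, 0, 0],
    [0, 0.1, 0.1, 0.4, 0, 0.2, 0.2, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0.5, 0.5],
    [0, 0, 0, 0, 0, 0.3, 0.7, 0, 0],
]


def reachable(P, start):
    """Return the set of states reachable from `start` in zero or more steps."""
    seen, stack = {start}, [start]
    while stack:
        i = stack.pop()
        for j, pij in enumerate(P[i]):
            if pij > 0 and j not in seen:
                seen.add(j)
                stack.append(j)
    return seen


def classify(P):
    """Split a finite chain into transient states and closed recurrent classes."""
    n = len(P)
    reach = [reachable(P, i) for i in range(n)]
    transient, classes = [], []
    for i in range(n):
        # Finite chain: i is recurrent iff every state reachable from i leads back to i.
        if all(i in reach[j] for j in reach[i]):
            cls = sorted(reach[i])
            if cls not in classes:
                classes.append(cls)
        else:
            transient.append(i)
    return transient, classes


transient, classes = classify(P)
# Report with the 1-based state labels used on the slide.
print("T =", [i + 1 for i in transient])
print("recurrent classes =", [[i + 1 for i in c] for c in classes])
```

The reachability search is a plain depth-first traversal of the transition graph, so the result depends only on which entries of P are positive, not on their values.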


State-transition Diagram of Example

[Figure: state-transition diagram of the example chain on states 1 to 9, with each arc labelled by the corresponding transition probability of P.]

© Ad Ridder (VU) SOR – Fall 2012 4 / 36


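Recap

▶ State $j$ transient

\[
\begin{aligned}
&\Leftrightarrow\ f_{jj} = \mathbb{P}(\text{ever returning to } j \mid X_0 = j) < 1 \\
&\Leftrightarrow\ \sum_{n=1}^{\infty} p_{jj}^{(n)} = \mathbb{E}[\text{number of visits to } j \mid X_0 = j] < \infty \\
&\Rightarrow\ \sum_{n=1}^{\infty} p_{ij}^{(n)} = \mathbb{E}[\text{number of visits to } j \mid X_0 = i] < \infty \quad (\text{for all } i) \\
&\Rightarrow\ \lim_{n\to\infty} p_{ij}^{(n)} = 0 \quad (\text{for all } i)
\end{aligned}
\]

▶ State $j$ recurrent

\[
\begin{aligned}
&\Leftrightarrow\ f_{jj} = \mathbb{P}(\text{ever returning to } j \mid X_0 = j) = 1 \\
&\Leftrightarrow\ \sum_{n=1}^{\infty} p_{jj}^{(n)} = \mathbb{E}[\text{number of visits to } j \mid X_0 = j] = \infty
\end{aligned}
\]

© Ad Ridder (VU) SOR – Fall 2012 5 / 36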


Recurrent States

Theorem 3.5.3(a)

Suppose that $R \subset I$ is an irreducible recurrent set of states; then $f_{ij} = 1$ for all $i, j \in R$.

Proof:

1. According to Lemma 3.5.2, $i$ and $j$ communicate; i.e., $\{m : p_{ji}^{(m)} > 0\} \neq \emptyset$.

2. Let $r = \min\{m : p_{ji}^{(m)} > 0\}$. Then

\[
0 = 1 - f_{jj} = \mathbb{P}\Big(\bigcap_{n=1}^{\infty} \{X_n \neq j\} \,\Big|\, X_0 = j\Big)
\geq p_{ji}^{(r)}\, \mathbb{P}\Big(\bigcap_{n=1}^{\infty} \{X_n \neq j\} \,\Big|\, X_0 = i\Big)
= p_{ji}^{(r)} (1 - f_{ij}).
\]

Since $p_{ji}^{(r)} > 0$, it follows that $f_{ij} = 1$.

© Ad Ridder (VU) SOR – Fall 2012 6 / 36


Proper First-Passage Times

Corollary

$f_{ij} = 1$ if either

(i). $i, j \in R$, the same recurrent irreducible set;

(ii). $i \in T$ transient, $j \in R$ a recurrent irreducible set, and $f_{iR} = 1$.

Theorem 3.5.7(a)

Suppose that the unichain condition holds (see lecture 3) with $|T| < \infty$; then $f_{ij} = 1$ for all $i \in I$ and $j \in R$.

Proof for $i \in T$:

\[
\begin{aligned}
1 - f_{iR} &= \mathbb{P}(\text{chain never reaches } R \mid X_0 = i) \\
&= \mathbb{P}\Big(\bigcap_{k=1}^{\infty} \{X_k \in T\} \,\Big|\, X_0 = i\Big)
 = \lim_{n\to\infty} \mathbb{P}\Big(\bigcap_{k=1}^{n} \{X_k \in T\} \,\Big|\, X_0 = i\Big) \\
&= \lim_{n\to\infty} \mathbb{P}(X_n \in T \mid X_0 = i)
 = \lim_{n\to\infty} \sum_{j \in T} p_{ij}^{(n)}
 \overset{\text{finite sum}}{=} \sum_{j \in T} \lim_{n\to\infty} p_{ij}^{(n)} = 0.
\end{aligned}
\]

© Ad Ridder (VU) SOR – Fall 2012 7 / 36


Compute Powers of P

Example of slide 3:

\[
P^{128} = \begin{pmatrix}
0 & 0 & 0 & 0.184 & 0.161 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.126 & 0.219 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.064 & 0.419 & 0.039 & 0.091 & 0.259 & 0.129 \\
0 & 0 & 0 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 0.000 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250
\end{pmatrix}
\]

© Ad Ridder (VU) SOR – Fall 2012 8 / 36


Compute Powers of P

\[
P^{129} = \begin{pmatrix}
0 & 0 & 0 & 0.161 & 0.184 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.219 & 0.126 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.419 & 0.064 & 0.039 & 0.091 & 0.259 & 0.129 \\
0 & 0 & 0 & 0.000 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250
\end{pmatrix}
\]

© Ad Ridder (VU) SOR – Fall 2012 9 / 36


Compute Powers of P

\[
P^{130} = \begin{pmatrix}
0 & 0 & 0 & 0.184 & 0.161 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.126 & 0.219 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.064 & 0.419 & 0.039 & 0.091 & 0.259 & 0.129 \\
0 & 0 & 0 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 0.000 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250
\end{pmatrix}
\]

© Ad Ridder (VU) SOR – Fall 2012 10 / 36
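The powers on these three slides can be reproduced numerically. The following is a minimal sketch, assuming NumPy is available; it uses `numpy.linalg.matrix_power` on the matrix P of slide 3 and prints the results rounded to three decimals.

```python
import numpy as np

# Transition matrix of the example on slide 3 (states 1..9 at indices 0..8).
P = np.array([
    [0.6, 0.4, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0.3, 0.2, 0, 0.5, 0, 0, 0],
    [0, 0.1, 0.1, 0.4, 0, 0.2, 0.2, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0.5, 0.5],
    [0, 0, 0, 0, 0, 0.3, 0.7, 0, 0],
])

P128 = np.linalg.matrix_power(P, 128)
P129 = np.linalg.matrix_power(P, 129)

np.set_printoptions(precision=3, suppress=True)
print(P128)
print(P129)
```

Columns 4 and 5 swap between consecutive powers because the class {4, 5} has period 2, while the columns of the aperiodic class {6, 7, 8, 9} have settled.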


Observations

We might conclude

1. $\lim_{n\to\infty} p_{ij}^{(n)} = 0$ for transient $j$ (and all $i$);

2. $\lim_{n\to\infty} p_{ij}^{(n)} = \pi_j$ for $i, j$ in the same aperiodic irreducible recurrent subset $R$;

3. $\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)} = \pi_j$ for $i, j$ in the same irreducible recurrent subset $R$;

4. where $\sum_{j \in R} \pi_j = 1$;

5. $\lim_{n\to\infty} p_{ij}^{(n)} = f_{iR}\,\pi_j$ for $j \in R$, an aperiodic irreducible recurrent subset, and $i \notin R$ (for instance transient);

6. $\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)} = f_{iR}\,\pi_j$ for $j \in R$, an irreducible recurrent subset, and $i \notin R$ (for instance transient).

Property 1 is known from the previous lecture; proofs (partly) of 2-6 are on the following slides.

© Ad Ridder (VU) SOR – Fall 2012 11 / 36


Empirical State Averages

Recall the mean return time $\mu_{jj} = \mathbb{E}[\tau_{jj}] = \mathbb{E}[\inf\{n \geq 1 : X_n = j\} \mid X_0 = j]$.

Equation (3.3.4)

For recurrent states $j \in I$

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbf{1}\{X_k = j \mid X_0 = j\} = \frac{1}{\mu_{jj}}
\quad \text{(with probability 1).}
\]

Proof: let $0 = S_0 < S_1 < S_2 < \cdots$ be the consecutive returns to $j$. Due to the Markov property, the interreturn periods $T_n = S_n - S_{n-1}$ are IID, distributed as $\tau_{jj}$. Due to recurrence, $\tau_{jj}$ is a proper random variable. Apply the SLLN (or Lemma 2.2.2):

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbf{1}\{X_k = j \mid X_0 = j\}
= \lim_{n\to\infty} \frac{n}{S_n}
= \lim_{n\to\infty} \frac{n}{\sum_{k=1}^{n} T_k}
= \frac{1}{\mu_{jj}}.
\]

This also holds for null-recurrent states, for which $\mu_{jj} = \infty$!

© Ad Ridder (VU) SOR – Fall 2012 12 / 36
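Equation (3.3.4) can be illustrated by simulation. The sketch below is an illustration only, not part of the lecture: it runs the recurrent class $R_2 = \{6, 7, 8, 9\}$ of the slide 3 example from state 8 and compares the fraction of time spent in state 8 with $1/\mu_{88}$. The seed, the run length and the helper name `simulate_visits` are arbitrary choices.

```python
import random

# Recurrent class R2 = {6, 7, 8, 9} of the example: state -> [(next state, probability), ...]
STEP = {
    6: [(8, 1.0)],
    7: [(8, 1.0)],
    8: [(8, 0.5), (9, 0.5)],
    9: [(6, 0.3), (7, 0.7)],
}


def simulate_visits(start, n_steps, rng):
    """Run the chain for n_steps transitions and count the returns to `start`."""
    state, visits = start, 0
    for _ in range(n_steps):
        targets, weights = zip(*STEP[state])
        state = rng.choices(targets, weights=weights)[0]
        if state == start:
            visits += 1
    return visits


rng = random.Random(2012)      # fixed seed so the run is reproducible
n = 200_000
visits = simulate_visits(8, n, rng)
fraction = visits / n          # empirical average (1/n) * sum of indicators
mean_return = n / visits       # estimate of mu_88 as n / (number of returns)
print(fraction, mean_return)
```

For this class the limit is $\pi_8 = 0.5$ (see the powers of P on slides 8-10), so the estimated mean return time should be close to 2.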


Mean Return Times and Probabilities

Definition

For all states $j \in I$:

\[
\pi_j = \frac{1}{\mu_{jj}}
\]

Note that

▶ For transient and null-recurrent states $j$: $\pi_j = 0$;

▶ For positive recurrent states $j$:

\[
\mu_{jj} = 1 + \sum_{i \neq j} p_{ji}\, \mathbb{E}[\inf\{n \geq 1 : X_n = j\} \mid X_0 = i] \geq 1
\ \Rightarrow\ 0 < \pi_j \leq 1
\]

▶ We will see that for irreducible positive recurrent sets $R$, $(\pi_j)_{j \in R}$ forms a probability mass function; i.e., $\sum_{j \in R} \pi_j = 1$.

© Ad Ridder (VU) SOR – Fall 2012 13 / 36


Probabilistic Averages I

Theorem 3.3.1 (first part)

For all states $j \in I$

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{jj}^{(k)} = \pi_j
\]

Proof for recurrent $j$: apply bounded convergence at the step marked (!) (see book p. 439):

\[
\begin{aligned}
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{jj}^{(k)}
&= \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbb{P}(X_k = j \mid X_0 = j)
 = \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbb{E}\big[\mathbf{1}\{X_k = j \mid X_0 = j\}\big] \\
&= \lim_{n\to\infty} \mathbb{E}\Big[\frac{1}{n} \sum_{k=1}^{n} \mathbf{1}\{X_k = j \mid X_0 = j\}\Big]
 \overset{!}{=} \mathbb{E}\Big[\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbf{1}\{X_k = j \mid X_0 = j\}\Big]
 = \frac{1}{\mu_{jj}} = \pi_j,
\end{aligned}
\]

where the last step uses Equation (3.3.4).

Proof for transient $j$: $\lim_{n\to\infty} p_{jj}^{(n)} = 0 \ \Rightarrow\ \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{jj}^{(k)} = 0$.

© Ad Ridder (VU) SOR – Fall 2012 14 / 36


Probabilistic Averages II

Theorem 3.3.1 (second part)

For all states $i, j \in I$

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)} = f_{ij}\,\pi_j
\]

Proof: apply (3.2.12):

\[
\begin{aligned}
\frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)}
&= \frac{1}{n} \sum_{k=1}^{n} \sum_{\ell=1}^{k} f_{ij}^{(\ell)}\, p_{jj}^{(k-\ell)}
 = \frac{1}{n} \sum_{\ell=1}^{n} f_{ij}^{(\ell)} \sum_{k=\ell}^{n} p_{jj}^{(k-\ell)} \\
&= \sum_{\ell=1}^{n} f_{ij}^{(\ell)} \Big(\frac{n-\ell}{n}\Big) \Big(\frac{1}{n-\ell} \sum_{k=0}^{n-\ell} p_{jj}^{(k)}\Big).
\end{aligned}
\]

Take $n \to \infty$.

© Ad Ridder (VU) SOR – Fall 2012 15 / 36
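The theorem can be checked numerically on the slide 3 example, where the powers $P^n$ themselves do not converge in columns 4 and 5 (period 2) but the averages do. The sketch below assumes NumPy; the function name `cesaro_average` and the horizon of 5000 steps are arbitrary choices. The reference value $f_{1,R_1} = 10/29 \approx 0.3448$ is obtained by solving the absorption equations by hand and is not stated on the slides.

```python
import numpy as np

# Transition matrix of the example on slide 3 (states 1..9 at indices 0..8).
P = np.array([
    [0.6, 0.4, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0.3, 0.2, 0, 0.5, 0, 0, 0],
    [0, 0.1, 0.1, 0.4, 0, 0.2, 0.2, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0.5, 0.5],
    [0, 0, 0, 0, 0, 0.3, 0.7, 0, 0],
])


def cesaro_average(P, n):
    """Return (1/n) * sum_{k=1}^{n} P^k."""
    total = np.zeros_like(P)
    power = np.eye(len(P))
    for _ in range(n):
        power = power @ P      # power now holds P^k
        total += power
    return total / n


avg = cesaro_average(P, 5000)
f_1_R1 = 10 / 29               # probability that the chain started in state 1 ends up in {4, 5}

np.set_printoptions(precision=3, suppress=True)
print(avg[0])                              # long-run averages from state 1
print(f_1_R1 * 0.5, (1 - f_1_R1) * 0.5)    # f * pi_4 and f * pi_8
```

Inside the periodic class the average is $\pi_4 = \pi_5 = 0.5$, so from state 1 the entries for states 4 and 5 both approach $f_{1,R_1} \cdot 0.5$.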


Probabilistic Averages III

Corollary

Suppose that $R$ is an irreducible set of recurrent states. For all states $i, j \in R$

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)} = \pi_j
\]

(i.e., independent of the initial state provided that the initial state is in $R$).

Proof: $f_{ij} = 1$ for all $i, j \in R$ (see slide 7). Apply the previous slide.

© Ad Ridder (VU) SOR – Fall 2012 16 / 36


Probabilistic Averages IV

Corollary

Suppose that the Markov chain satisfies the unichain condition with a finite set of transient states ($|T| < \infty$). Then for all states $i, j \in I$

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)} = \pi_j
\]

(i.e., independent of the initial state).

Proof: $\pi_j = 0$ for transient $j$; and $f_{ij} = 1$ for recurrent $j$ (see slide 7). Then apply slide 15.

© Ad Ridder (VU) SOR – Fall 2012 17 / 36


Finite Recurrent Sets

Corollary

Suppose that the Markov chain satisfies the unichain condition with a finite set of transient states ($|T| < \infty$) and with a finite irreducible set of recurrent states ($|R| < \infty$). Then

\[
\sum_{j \in I} \pi_j = \sum_{j \in R} \pi_j = 1
\]

Proof: for any $i \in I$ and all $n$ we have $\sum_{j \in I} p_{ij}^{(n)} = 1$. Thus

\[
\sum_{j \in I} \pi_j
= \sum_{j \in I} \Big(\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)}\Big)
\overset{\text{finite set}}{=} \lim_{n\to\infty} \frac{1}{n} \sum_{j \in I} \sum_{k=1}^{n} p_{ij}^{(k)}
= \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \sum_{j \in I} p_{ij}^{(k)} = 1.
\]

Use $\pi_j = 0$ for transient $j$.

Note: all states in $R$ are positive recurrent (see also slide 20); and $(\pi_j)_{j \in R}$ forms a probability mass function on $R$.

© Ad Ridder (VU) SOR – Fall 2012 18 / 36


Infinite Recurrent Sets

In the situation of the previous slide but with $|R| = \infty$. Without proof:

▶ When $R$ is null-recurrent: $\pi_j = 0$ for all $j \in I$;

▶ When $R$ is positive recurrent: $\sum_{j \in I} \pi_j = \sum_{j \in R} \pi_j = 1$; i.e., $(\pi_j)_{j \in R}$ forms a probability mass function.

© Ad Ridder (VU) SOR – Fall 2012 19 / 36


Finite Irreducible Sets

Corollary

Finite irreducible sets consist of positive recurrent states only.

Proof: Let $C$ be a finite irreducible set. That is,

▶ For any $i \in C$ it holds that $\sum_{j \in C} p_{ij} = 1$; thus, also for all $n$, $\sum_{j \in C} p_{ij}^{(n)} = 1$;

▶ All states in $C$ communicate;

▶ All states have the same classification.

Suppose the states are transient or null-recurrent; then all $\pi_j = 0$. This gives a contradiction:

\[
0 = \sum_{j \in C} \pi_j
= \sum_{j \in C} \Big(\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)}\Big)
\overset{\text{finite set}}{=} \lim_{n\to\infty} \frac{1}{n} \sum_{j \in C} \sum_{k=1}^{n} p_{ij}^{(k)}
= \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \sum_{j \in C} p_{ij}^{(k)} = 1.
\]

© Ad Ridder (VU) SOR – Fall 2012 20 / 36


Infinite Transient or Null-recurrent Sets

▶ Infinite irreducible sets can be transient or null-recurrent.

▶ Example of a Random Walk.

▶ $I = \{0, 1, \ldots\}$, $p \in (0, 1)$, $q = 1 - p$.

\[
P = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 & \cdots \\
q & 0 & p & 0 & 0 & \cdots \\
0 & q & 0 & p & 0 & \cdots \\
0 & 0 & q & 0 & p & \cdots \\
\vdots & \vdots & \vdots & \ddots & \ddots & \ddots
\end{pmatrix}
\]

▶ If $p > q$ the chain is transient; if $q = p = 0.5$ the chain is null-recurrent; if $p < q$ the chain is positive recurrent.

© Ad Ridder (VU) SOR – Fall 2012 21 / 36
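The positive recurrent case $p < q$ can be made concrete by solving the equilibrium equations $\pi_j = \sum_{i} \pi_i p_{ij}$ (Definition 3.3.2) for this random walk. The derivation below is a worked sketch that is not on the slide; it assumes the reflecting step $p_{01} = 1$ shown in the matrix.

```latex
% Balance across the cut between states j and j+1:
\pi_0 \cdot 1 = \pi_1\, q, \qquad \pi_j\, p = \pi_{j+1}\, q \quad (j \geq 1)
% hence
\pi_j = \frac{\pi_0}{q} \Big(\frac{p}{q}\Big)^{j-1} \quad (j \geq 1).
% Normalisation, using the geometric series (finite only if p < q):
1 = \sum_{j \geq 0} \pi_j
  = \pi_0 \Big( 1 + \frac{1}{q} \cdot \frac{1}{1 - p/q} \Big)
  = \pi_0 \Big( 1 + \frac{1}{q - p} \Big)
  = \pi_0\, \frac{2q}{q - p},
% so that
\pi_0 = \frac{q - p}{2q}.
```

For $p \geq q$ the geometric series diverges, so no normalisable solution exists, in line with the transient and null-recurrent cases above.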


Equilibrium

Suppose that the Markov chain satisfies the unichain condition with a finite set of transient states ($|T| < \infty$). For all states $j \in I$ (and arbitrary $r \in I$):

\[
\begin{aligned}
(\pi P)_j = \sum_{i \in I} \pi_i p_{ij}
&= \sum_{i \in I} \Big(\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ri}^{(k)}\Big) p_{ij}
 \overset{!}{=} \lim_{n\to\infty} \sum_{i \in I} \frac{1}{n} \sum_{k=1}^{n} p_{ri}^{(k)} p_{ij} \\
&= \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \sum_{i \in I} p_{ri}^{(k)} p_{ij}
 = \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{rj}^{(k+1)}
 = \lim_{n\to\infty} \frac{1}{n} \sum_{k=2}^{n+1} p_{rj}^{(k)} \\
&= \lim_{n\to\infty} \frac{1}{n} \Big(\sum_{k=1}^{n+1} p_{rj}^{(k)} - p_{rj}\Big)
 = \lim_{n\to\infty} \Big(\frac{n+1}{n} \cdot \frac{1}{n+1} \sum_{k=1}^{n+1} p_{rj}^{(k)} - \frac{1}{n}\, p_{rj}\Big)
 = \pi_j.
\end{aligned}
\]

© Ad Ridder (VU) SOR – Fall 2012 22 / 36


Equilibrium Distribution

Definition 3.3.2

A probability distribution $(\pi_j)_{j \in I}$ is an equilibrium distribution for the Markov chain if

\[
\pi_j = \sum_{i \in I} \pi_i p_{ij} \quad (j \in I)
\]

Corollary

Let $\pi$ be an equilibrium distribution for the Markov chain $\{X_n, n = 0, 1, \ldots\}$ and suppose that the chain starts in equilibrium; then it remains in equilibrium; i.e.,

\[
\mathbb{P}(X_0 = j) = \pi_j \ \ \forall j \in I
\ \Rightarrow\
\mathbb{P}(X_n = j) = \pi_j \ \ \forall j \in I,\ \forall n = 1, 2, \ldots
\]

© Ad Ridder (VU) SOR – Fall 2012 23 / 36
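As a small check of the definition and the corollary, the sketch below takes the closed class $R_2 = \{6, 7, 8, 9\}$ of the slide 3 example together with the row $(0.075, 0.175, 0.5, 0.25)$ read off from $P^{128}$, verifies $\pi = \pi P$, and then pushes the distribution forward a few steps. It assumes NumPy; restricting the chain to $R_2$ is a simplification made here for illustration.

```python
import numpy as np

# Transition matrix restricted to the closed class R2 = {6, 7, 8, 9}.
P_R2 = np.array([
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.3, 0.7, 0.0, 0.0],
])
pi = np.array([0.075, 0.175, 0.5, 0.25])   # candidate equilibrium distribution

# Equilibrium equations: pi_j = sum_i pi_i * p_ij for every j.
is_equilibrium = np.allclose(pi @ P_R2, pi)

# Start in equilibrium and compute the distribution of X_1, ..., X_5.
dist = pi.copy()
for n in range(1, 6):
    dist = dist @ P_R2
    print(n, dist)
print(is_equilibrium)
```

Each printed row equals `pi`, which is the statement of the corollary for the first five time steps.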


Existence and Uniqueness of Equilibrium Distribution

Theorem 3.3.2 & Theorem 3.5.9

Assume the unichain condition with a finite transient set and a positive recurrent set (cf. Assumption 3.3.1); then the probabilistic long-run averages $(\pi_j)$ (see slide 15) form the unique equilibrium distribution.

Proof for finite recurrent sets:

▶ Existence follows from slides 22 and 18;

▶ Uniqueness: suppose that $(x_j)_j$ satisfies $x_j = \sum_{i \in I} x_i p_{ij}$. See page 128 (book) for concluding that $x_j = c\,\pi_j$; i.e., $\pi$ is the only solution that is a probability distribution.

Infinite recurrent sets: more involved.

© Ad Ridder (VU) SOR – Fall 2012 24 / 36


Limiting Probabilities

Equation (3.5.11)

A. For transient states $j$ and any initial state $i \in I$

\[
\lim_{n\to\infty} p_{ij}^{(n)} = 0
\]

B. For recurrent aperiodic states $j$ and any initial state $i \in I$

\[
\lim_{n\to\infty} p_{ij}^{(n)} = f_{ij}\,\pi_j
\]

Proof: see lecture 3 for A. Part B is more advanced and outside the scope of this course.

© Ad Ridder (VU) SOR – Fall 2012 25 / 36


Unichain Case

Corollary

Suppose that the Markov chain satisfies the unichain condition with a finite set of transient states ($|T| < \infty$) and with an irreducible set $R$ of aperiodic recurrent states. Then for all states $i, j \in I$

\[
\lim_{n\to\infty} p_{ij}^{(n)} = \pi_j
\]

(i.e., independent of the initial state).

Proof: $\pi_j = 0$ for $j \in T$; and $f_{ij} = 1$ for $j \in R$ (see slide 7). Then apply the previous slide.

Note: in case of positive recurrence $\pi$ is a probability distribution ($\sum_j \pi_j = 1$), see slide 24.

© Ad Ridder (VU) SOR – Fall 2012 26 / 36


Summary

Suppose Assumption 3.3.1 (equivalently, unichain with finite transient set $T$ and positive recurrent set $R$). Suppose aperiodicity of the recurrent states. Then

1. There is a unique equilibrium distribution $\pi = (\pi_j)_{j \in I}$:

\[
\pi = \pi P
\]

2. $\pi_j = 0$ for $j \in T$ and $\pi_j > 0$ for $j \in R$;

3. $\pi$ is the limiting distribution:

\[
\lim_{n\to\infty} p_{ij}^{(n)} = \pi_j
\]

4. $\pi$ is the long-run average distribution:

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)} = \pi_j
\]

© Ad Ridder (VU) SOR – Fall 2012 27 / 36


§3.3.3 Markov Chains with Rewards

▶ Let $f : I \to \mathbb{R}$ be a reward or cost function;

▶ $\sum_{k=1}^{n} f(X_k)$ is the total reward up to time $n$;

▶ $\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} f(X_k)$ is the long-run average reward per unit of time;

▶ We wish to have an ergodic (or Markov-reward) property

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} f(X_k) = \sum_{j \in I} \pi_j f(j) \quad \text{(w.p. 1)}
\]

© Ad Ridder (VU) SOR – Fall 2012 28 / 36


Finite Unichain Case

Ergodic Theorem Finite Case

Suppose that the Markov chain satisfies the unichain condition with a finite set of transient states ($|T| < \infty$) and with a finite irreducible set of recurrent states ($|R| < \infty$). Then

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} f(X_k) = \sum_{j \in I} \pi_j f(j) \quad \text{(w.p. 1)}
\]

Proof: let $r$ be an arbitrary initial state. Then

\[
\frac{1}{n} \sum_{k=1}^{n} f(X_k)
= \frac{1}{n} \sum_{k=1}^{n} \sum_{j \in I} \mathbf{1}\{X_k = j \mid X_0 = r\}\, f(j)
= \sum_{j \in I} \Big(\frac{1}{n} \sum_{k=1}^{n} \mathbf{1}\{X_k = j \mid X_0 = r\}\Big) f(j).
\]

Take $n \to \infty$; interchanging the limit and the finite sum is allowed; apply the empirical state average property (slide 12).

© Ad Ridder (VU) SOR – Fall 2012 29 / 36
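The ergodic property can be illustrated by comparing a simulated long-run average reward with $\sum_j \pi_j f(j)$. The sketch below is an illustration only: it uses the recurrent class $R_2 = \{6, 7, 8, 9\}$ of the slide 3 example with its equilibrium probabilities, and a reward function `REWARD` that is purely hypothetical (it does not come from the lecture). The seed and run length are arbitrary.

```python
import random

# Recurrent class R2 = {6, 7, 8, 9}: state -> [(next state, probability), ...]
STEP = {
    6: [(8, 1.0)],
    7: [(8, 1.0)],
    8: [(8, 0.5), (9, 0.5)],
    9: [(6, 0.3), (7, 0.7)],
}
PI = {6: 0.075, 7: 0.175, 8: 0.5, 9: 0.25}     # equilibrium probabilities on R2
REWARD = {6: 1.0, 7: 2.0, 8: 0.0, 9: 5.0}      # hypothetical reward function f


def average_reward(start, n_steps, rng):
    """Simulate n_steps transitions and return (1/n) * sum_{k=1}^{n} f(X_k)."""
    state, total = start, 0.0
    for _ in range(n_steps):
        targets, weights = zip(*STEP[state])
        state = rng.choices(targets, weights=weights)[0]
        total += REWARD[state]
    return total / n_steps


rng = random.Random(36)                        # fixed seed for reproducibility
simulated = average_reward(6, 200_000, rng)
predicted = sum(PI[j] * REWARD[j] for j in PI)
print(simulated, predicted)
```

The theorem states that the simulated value converges to the predicted one with probability 1; a single finite run only gets close to it.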


Full Case

Ergodic Theorem 3.3.3 and 3.5.11

Assume

(i). Unichain with finite transient set;

(ii). $\sum_{j \in I} |f(j)|\,\pi_j < \infty$.

Then

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} f(X_k) = \sum_{j \in I} \pi_j f(j) \quad \text{(w.p. 1)}
\]

Proof: see book.

© Ad Ridder (VU) SOR – Fall 2012 30 / 36


Expected Reward

▶ See Remark 3.3.1 in the book;

▶ The result of the previous slide also holds for the expected cost:

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbb{E}[f(X_k)] = \sum_{j \in I} \pi_j f(j)
\]

▶ Proof: apply dominated convergence.

© Ad Ridder (VU) SOR – Fall 2012 31 / 36


§3.4.1 Computation of Equilibrium Probabilities

▶ Given: $P$ finite and irreducible, thus positive recurrent;

▶ Problem: compute the equilibrium distribution $\pi$;

▶ Linear system

\[
\begin{cases}
\sum_{i \in I} \pi_i p_{ij} = \pi_j & (j \in I) \\
\sum_{j \in I} \pi_j = 1
\end{cases}
\]

▶ In many applications we deal with large state spaces but sparse matrices $P$; it is then efficient to use an iterative method;

▶ Jacobi or Gauss-Seidel.

© Ad Ridder (VU) SOR – Fall 2012 32 / 36


Rewrite

▶ Rewrite to the classic form $Ax = b$:

\[
\pi^{T} P = \pi^{T} \ \Leftrightarrow\ (I - P^{T})\,\pi = 0
\]

▶ Delete a row (why?!) and add $\sum_{j \in I} \pi_j = 1$.

▶ Example

\[
P = \begin{pmatrix}
0.4 & 0.4 & 0.0 & 0.2 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
0.0 & 0.5 & 0.3 & 0.0 & 0.2 & 0.0 & 0.0 & 0.0 & 0.0 \\
0.5 & 0.0 & 0.4 & 0.0 & 0.0 & 0.1 & 0.0 & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0 & 0.4 & 0.3 & 0.0 & 0.3 & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0 & 0.0 & 0.2 & 0.0 & 0.5 & 0.3 & 0.0 \\
0.0 & 0.0 & 0.0 & 0.5 & 0.0 & 0.1 & 0.0 & 0.0 & 0.4 \\
0.6 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.2 & 0.2 & 0.0 \\
0.0 & 0.5 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.4 & 0.1 \\
0.0 & 0.0 & 0.7 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.3
\end{pmatrix}
\]

© Ad Ridder (VU) SOR – Fall 2012 33 / 36


Example (cont'd)

\[
A = \begin{pmatrix}
0.6 & 0.0 & -0.5 & 0.0 & 0.0 & 0.0 & -0.6 & 0.0 & 0.0 \\
-0.4 & 0.5 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & -0.5 & 0.0 \\
0.0 & -0.3 & 0.6 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & -0.7 \\
-0.2 & 0.0 & 0.0 & 0.6 & 0.0 & -0.5 & 0.0 & 0.0 & 0.0 \\
0.0 & -0.2 & 0.0 & -0.3 & 0.8 & 0.0 & 0.0 & 0.0 & 0.0 \\
0.0 & 0.0 & -0.1 & 0.0 & 0.0 & 0.9 & 0.0 & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0 & -0.3 & -0.5 & 0.0 & 0.8 & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0 & 0.0 & -0.3 & 0.0 & -0.2 & 0.6 & 0.0 \\
1.0 & 1.0 & 1.0 & 1.0 & 1.0 & 1.0 & 1.0 & 1.0 & 1.0
\end{pmatrix},
\qquad
b = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}
\]

The first eight rows of $A$ are the first eight rows of $I - P^{T}$; the last row is the normalisation $\sum_{j \in I} \pi_j = 1$.

© Ad Ridder (VU) SOR – Fall 2012 34 / 36
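For a chain of this size the system $A\pi = b$ can also be solved directly. The sketch below assumes NumPy and builds $A$ from the matrix $P$ of slide 33 exactly as described: take $I - P^{T}$, replace the last row by ones, and solve with `numpy.linalg.solve`. The direct solve is only a reference point here; the lecture's interest is in iterative methods for large sparse $P$.

```python
import numpy as np

# Transition matrix of the example on slide 33.
P = np.array([
    [0.4, 0.4, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.3, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.4, 0.0, 0.0, 0.1, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.4, 0.3, 0.0, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.2, 0.0, 0.5, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.5, 0.0, 0.1, 0.0, 0.0, 0.4],
    [0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 0.2, 0.0],
    [0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.4, 0.1],
    [0.0, 0.0, 0.7, 0.0, 0.0, 0.0, 0.0, 0.0, 0.3],
])

n = len(P)
A = np.eye(n) - P.T        # equilibrium equations (I - P^T) pi = 0
A[-1, :] = 1.0             # replace the last equation by sum_j pi_j = 1
b = np.zeros(n)
b[-1] = 1.0

pi = np.linalg.solve(A, b)

np.set_printoptions(precision=4, suppress=True)
print(pi)
print(np.allclose(pi @ P, pi), pi.sum())
```

One equation can be dropped because the columns of $I - P^{T}$ sum to zero, so its rows are linearly dependent; the normalisation row restores a unique solution for an irreducible chain.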


Gauss-Seidel Method

▶ Construct a sequence of vectors $x^{(0)}, x^{(1)}, \ldots$ by $x_i^{(0)} = 1/|I|$, and for $k = 0, 1, 2, \ldots$:

\[
x_i^{(k+1)} = \Big(b_i - \sum_{j < i} a_{ij}\, x_j^{(k+1)} - \sum_{j > i} a_{ij}\, x_j^{(k)}\Big) \Big/ a_{ii}.
\]

© Ad Ridder (VU) SOR – Fall 2012 35 / 36
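The Gauss-Seidel update can be sketched in a few lines. The code below is a generic illustration for a system $Ax = b$, not the lecture's implementation: the function name `gauss_seidel`, the stopping rule and the small test system are assumptions made here. The test system is a hypothetical strictly diagonally dominant one, for which the iteration is known to converge; convergence is not guaranteed for every matrix $A$.

```python
def gauss_seidel(A, b, x0, sweeps=100, tol=1e-12):
    """Iterate x_i <- (b_i - sum_{j != i} a_ij x_j) / a_ii, updating x in place.

    Because x is overwritten during a sweep, entries with j < i already hold
    the new iterate and entries with j > i still hold the previous one.
    """
    n = len(b)
    x = list(x0)
    for _ in range(sweeps):
        largest_change = 0.0
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            new_xi = (b[i] - s) / A[i][i]
            largest_change = max(largest_change, abs(new_xi - x[i]))
            x[i] = new_xi
        if largest_change < tol:   # stop once a full sweep no longer changes x
            break
    return x


# Hypothetical test system with exact solution (1, 1, 1).
A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
b = [5.0, 6.0, 5.0]
x0 = [1.0 / len(b)] * len(b)       # uniform start, as on the slide
x = gauss_seidel(A, b, x0)
print(x)
```

Updating `x` in place is what distinguishes Gauss-Seidel from Jacobi, which would compute every new component from the previous iterate only.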


Exercises

Chapter 3 (pp. 134-138):

3.10, 3.11, 3.12, 3.14, 3.16, 3.17

© Ad Ridder (VU) SOR – Fall 2012 36 / 36
