beamer - Vrije Universiteit Amsterdam

Stochastic Operations Research 

Lecture 4: Discrete-time Markov Chains (Part II) 

(Chapter 3) 

A.A.N. Ridder 

Department EOR 

Vrije Universiteit Amsterdam 

Homepage: http://personal.vu.nl/a.a.n.ridder/sor/default.htm 

21 November 2012 

© Ad Ridder (VU), SOR, Fall 2012

Topics 

1. §3.3 Equilibrium Probabilities 

2. §3.3.3 Markov Reward Theorem 

3. §3.4.1 Computation by an Iterative Method 

4. §3.5 Theoretical Considerations 


Example

\[
P = \begin{pmatrix}
0.6 & 0.4 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.3 & 0.2 & 0 & 0.5 & 0 & 0 & 0 \\
0 & 0.1 & 0.1 & 0.4 & 0 & 0.2 & 0.2 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0.5 \\
0 & 0 & 0 & 0 & 0 & 0.3 & 0.7 & 0 & 0
\end{pmatrix}
\]

Transient states $T = \{1, 2, 3\}$; irreducible recurrent set $R_1 = \{4, 5\}$ with period 2; irreducible aperiodic recurrent set $R_2 = \{6, 7, 8, 9\}$ (see picture next page).

State-transition Diagram of Example 

[Figure: state-transition diagram of the example chain; the nodes are the states 1 to 9 and the arcs are labelled with the nonzero transition probabilities of P.]

Recap 

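▶ State $j$ transient

\[
\begin{aligned}
&\Leftrightarrow\ f_{jj} = \mathbb{P}(\text{ever returning to } j \mid X_0 = j) < 1 \\
&\Leftrightarrow\ \sum_{n=1}^{\infty} p_{jj}^{(n)} = \mathbb{E}[\text{number of visits to } j \mid X_0 = j] < \infty \\
&\Rightarrow\ \sum_{n=1}^{\infty} p_{ij}^{(n)} = \mathbb{E}[\text{number of visits to } j \mid X_0 = i] < \infty \quad \text{(for all } i\text{)} \\
&\Rightarrow\ \lim_{n\to\infty} p_{ij}^{(n)} = 0 \quad \text{(for all } i\text{)}
\end{aligned}
\]

▶ State $j$ recurrent

\[
\begin{aligned}
&\Leftrightarrow\ f_{jj} = \mathbb{P}(\text{ever returning to } j \mid X_0 = j) = 1 \\
&\Leftrightarrow\ \sum_{n=1}^{\infty} p_{jj}^{(n)} = \mathbb{E}[\text{number of visits to } j \mid X_0 = j] = \infty
\end{aligned}
\]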


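Recurrent States

Theorem 3.5.3(a)

Suppose that $R \subset I$ is an irreducible recurrent set of states; then $f_{ij} = 1$ for all $i, j \in R$.

Proof:

1. According to Lemma 3.5.2 it holds that $i$ and $j$ communicate; i.e., $\{m : p_{ji}^{(m)} > 0\} \neq \emptyset$;

2. Let $r = \min\{m : p_{ji}^{(m)} > 0\}$. Then

\[
0 = 1 - f_{jj} = \mathbb{P}\Big( \bigcap_{n=1}^{\infty} \{X_n \neq j\} \,\Big|\, X_0 = j \Big)
\geq p_{ji}^{(r)}\, \mathbb{P}\Big( \bigcap_{n=1}^{\infty} \{X_n \neq j\} \,\Big|\, X_0 = i \Big)
= p_{ji}^{(r)} (1 - f_{ij}).
\]

Since $p_{ji}^{(r)} > 0$, this forces $f_{ij} = 1$.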

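Proper First-Passage Times

Corollary

$f_{ij} = 1$ if either

(i). $i, j \in R$, the same recurrent irreducible set;

(ii). $i \in T$ transient, $j \in R$ recurrent irreducible set, and $f_{iR} = 1$.

Theorem 3.5.7(a)

Suppose that the unichain condition holds (see lecture 3) with $|T| < \infty$; then $f_{ij} = 1$ for all $i \in I$ and $j \in R$.

Proof for $i \in T$:

\[
\begin{aligned}
1 - f_{iR} &= \mathbb{P}(\text{chain never reaches } R \mid X_0 = i) \\
&= \mathbb{P}\Big( \bigcap_{k=1}^{\infty} \{X_k \in T\} \,\Big|\, X_0 = i \Big)
 = \lim_{n\to\infty} \mathbb{P}\Big( \bigcap_{k=1}^{n} \{X_k \in T\} \,\Big|\, X_0 = i \Big) \\
&= \lim_{n\to\infty} \mathbb{P}(X_n \in T \mid X_0 = i)
 = \lim_{n\to\infty} \sum_{j \in T} p_{ij}^{(n)}
 \overset{\text{finite sum}}{=} \sum_{j \in T} \lim_{n\to\infty} p_{ij}^{(n)} = 0.
\end{aligned}
\]

Hence $f_{iR} = 1$, and part (ii) of the corollary gives $f_{ij} = 1$.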


Compute Powers of P 

Example slide 3:

\[
P^{128} = \begin{pmatrix}
0 & 0 & 0 & 0.184 & 0.161 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.126 & 0.219 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.064 & 0.419 & 0.039 & 0.091 & 0.259 & 0.129 \\
0 & 0 & 0 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 0.000 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250
\end{pmatrix}
\]

\[
P^{129} = \begin{pmatrix}
0 & 0 & 0 & 0.161 & 0.184 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.219 & 0.126 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.419 & 0.064 & 0.039 & 0.091 & 0.259 & 0.129 \\
0 & 0 & 0 & 0.000 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250
\end{pmatrix}
\]

\[
P^{130} = \begin{pmatrix}
0 & 0 & 0 & 0.184 & 0.161 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.126 & 0.219 & 0.049 & 0.115 & 0.328 & 0.164 \\
0 & 0 & 0 & 0.064 & 0.419 & 0.039 & 0.091 & 0.259 & 0.129 \\
0 & 0 & 0 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 0.000 & 1.000 & 0.000 & 0.000 & 0.000 & 0.000 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250 \\
0 & 0 & 0 & 0.000 & 0.000 & 0.075 & 0.175 & 0.500 & 0.250
\end{pmatrix}
\]

Observations 

We might conclude

1. $\lim_{n\to\infty} p_{ij}^{(n)} = 0$ for transient $j$ (and all $i$);

2. $\lim_{n\to\infty} p_{ij}^{(n)} = \pi_j$ for $i, j$ in the same aperiodic irreducible recurrent subset $R$;

3. $\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)} = \pi_j$ for $i, j$ in the same irreducible recurrent subset $R$;

4. Where $\sum_{j \in R} \pi_j = 1$;

5. $\lim_{n\to\infty} p_{ij}^{(n)} = f_{iR}\,\pi_j$ for $j \in R$ an aperiodic irreducible recurrent subset, and $i \notin R$ (for instance transient);

6. $\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)} = f_{iR}\,\pi_j$ for $j \in R$ an irreducible recurrent subset, and $i \notin R$ (for instance transient).

Property 1 is known from the previous lecture; proofs (partly) of 2-6 are on the following slides.
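The matrix powers behind these observations can be reproduced numerically. The following is a minimal sketch in plain Python, assuming the transition matrix of the nine-state example; the helper names `mat_mul` and `mat_pow` are illustrative and not part of the lecture.

```python
# Transition matrix of the nine-state example (states 1..9 are indices 0..8).
P = [
    [0.6, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.3, 0.2, 0.0, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.1, 0.1, 0.4, 0.0, 0.2, 0.2, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.3, 0.7, 0.0, 0.0],
]


def mat_mul(A, B):
    """Plain matrix product of two square lists-of-lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(M, n):
    """M to the power n by repeated squaring."""
    size = len(M)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    base = [row[:] for row in M]
    while n > 0:
        if n & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        n >>= 1
    return result


P128 = mat_pow(P, 128)
P129 = mat_mul(P128, P)

for name, M in (("P^128", P128), ("P^129", P129)):
    print(name)
    for row in M:
        print(" ".join(f"{v:.3f}" for v in row))
```

The columns of the periodic set {4, 5} keep alternating between the two powers, while the columns of the aperiodic set {6, 7, 8, 9} have settled.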


Empirical State Averages 

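Recall the mean return time $\mu_{jj} = \mathbb{E}[\tau_{jj}] = \mathbb{E}[\inf\{n \geq 1 : X_n = j\} \mid X_0 = j]$.

Equation (3.3.4)

For recurrent states $j \in I$

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbf{1}\{X_k = j \mid X_0 = j\} = \frac{1}{\mu_{jj}} \quad \text{(with probability 1).}
\]

Proof: let $0 = S_0 < S_1 < S_2 < \cdots$ be the consecutive returns to $j$. Due to the Markov property, the inter-return periods $T_n = S_n - S_{n-1}$ are IID as $\tau_{jj}$. Due to recurrence, $\tau_{jj}$ is a proper random variable. Apply SLLN (or Lemma 2.2.2):

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbf{1}\{X_k = j \mid X_0 = j\}
= \lim_{n\to\infty} \frac{n}{S_n}
= \lim_{n\to\infty} \frac{n}{\sum_{k=1}^{n} T_k}
= \frac{1}{\mu_{jj}}.
\]

Holds also for null-recurrent states, for which $\mu_{jj} = \infty$!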


Mean Return Times and Probabilities

Definition

For all states $j \in I$:

\[
\pi_j = \frac{1}{\mu_{jj}}
\]

Note that

▶ For transient and null-recurrent states $j$: $\pi_j = 0$;

▶ For positive recurrent states $j$:

\[
\mu_{jj} = 1 + \sum_{i \neq j} p_{ji}\, \mathbb{E}[\inf\{n : X_n = j\} \mid X_0 = i] \geq 1 \ \Rightarrow\ 0 < \pi_j \leq 1
\]

▶ We will see that for irreducible positive recurrent sets $R$, $(\pi_j)_{j \in R}$ forms a probability mass function; i.e., $\sum_{j \in R} \pi_j = 1$.
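As a quick check of the definition, here is a worked example using the recurrent set $R_2 = \{6, 7, 8, 9\}$ of the nine-state example. Starting from state 8, the chain returns in one step with probability 0.5; otherwise it moves to state 9, from there to state 6 or 7, and from either of those back to 8, so the first passage from 9 to 8 takes exactly two steps. Hence

\[
\mu_{88} = 1 + p_{89}\, \mathbb{E}[\inf\{n \geq 1 : X_n = 8\} \mid X_0 = 9] = 1 + 0.5 \cdot 2 = 2,
\qquad
\pi_8 = \frac{1}{\mu_{88}} = 0.5,
\]

which agrees with the entry 0.500 in the column of state 8 of $P^{128}$.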


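Probabilistic Averages I

Theorem 3.3.1 (first part)

For all states $j \in I$

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{jj}^{(k)} = \pi_j
\]

Proof for recurrent $j$: apply bounded convergence (see book p. 439)

\[
\begin{aligned}
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{jj}^{(k)}
&= \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbb{P}(X_k = j \mid X_0 = j)
 = \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbb{E}\big[\mathbf{1}\{X_k = j \mid X_0 = j\}\big] \\
&= \lim_{n\to\infty} \mathbb{E}\Big[ \frac{1}{n} \sum_{k=1}^{n} \mathbf{1}\{X_k = j \mid X_0 = j\} \Big]
 \overset{!}{=} \mathbb{E}\Big[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbf{1}\{X_k = j \mid X_0 = j\} \Big]
 = \frac{1}{\mu_{jj}} = \pi_j.
\end{aligned}
\]

Proof for transient $j$: $\lim_{n\to\infty} p_{jj}^{(n)} = 0 \ \Rightarrow\ \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{jj}^{(k)} = 0$.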


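Probabilistic Averages II

Theorem 3.3.1 (second part)

For all states $i, j \in I$

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)} = f_{ij}\,\pi_j
\]

Proof: apply (3.2.12):

\[
\begin{aligned}
\frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)}
&= \frac{1}{n} \sum_{k=1}^{n} \sum_{\ell=1}^{k} f_{ij}^{(\ell)}\, p_{jj}^{(k-\ell)}
 = \frac{1}{n} \sum_{\ell=1}^{n} f_{ij}^{(\ell)} \sum_{k=\ell}^{n} p_{jj}^{(k-\ell)} \\
&= \sum_{\ell=1}^{n} f_{ij}^{(\ell)} \Big( \frac{n-\ell}{n} \Big) \Big( \frac{1}{n-\ell} \sum_{k=0}^{n-\ell} p_{jj}^{(k)} \Big).
\end{aligned}
\]

Take $n \to \infty$.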


Probabilistic Averages III

Corollary

Suppose that $R$ is an irreducible set of recurrent states. For all states $i, j \in R$

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)} = \pi_j
\]

(i.e., independent of the initial state provided that the initial state is in $R$).

Proof: $f_{ij} = 1$ for all $i, j \in R$ (see slide 7). Apply previous slide.
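The corollary does not need aperiodicity, which a small computation makes concrete. The sketch below uses the two-state flip chain with P = [[0, 1], [1, 0]], which is how the periodic set {4, 5} of the example behaves; the function name `cesaro_average_return` is illustrative. The n-step return probability alternates between 0 and 1 and has no limit, yet its running average tends to 1/2, the reciprocal of the mean return time 2.

```python
def cesaro_average_return(n):
    """Average of p_00^(k), k = 1..n, for the flip chain P = [[0, 1], [1, 0]]."""
    dist = [1.0, 0.0]               # distribution of X_0: start in state 0
    total = 0.0
    for _ in range(n):
        dist = [dist[1], dist[0]]   # one step of the flip chain swaps the two probabilities
        total += dist[0]            # after k steps dist[0] equals p_00^(k)
    return total / n


print(cesaro_average_return(1000))  # 0.5
```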


Probabilistic Averages IV

Corollary

Suppose that the Markov chain satisfies the unichain condition with a finite set of transient states ($|T| < \infty$). Then for all states $i, j \in I$

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)} = \pi_j
\]

(i.e., independent of the initial state).

Proof: $\pi_j = 0$ for transient $j$; and $f_{ij} = 1$ for recurrent $j$ (see slide 7). Then apply slide 15.


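Finite Recurrent Sets

Corollary

Suppose that the Markov chain satisfies the unichain condition with a finite set of transient states ($|T| < \infty$) and with a finite irreducible set of recurrent states ($|R| < \infty$). Then

\[
\sum_{j \in I} \pi_j = \sum_{j \in R} \pi_j = 1
\]

Proof: for any $i \in I$ and all $n$ we have $\sum_{j \in I} p_{ij}^{(n)} = 1$. Thus

\[
\sum_{j \in I} \pi_j
= \sum_{j \in I} \Big( \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)} \Big)
\overset{\text{finite set}}{=} \lim_{n\to\infty} \frac{1}{n} \sum_{j \in I} \sum_{k=1}^{n} p_{ij}^{(k)}
= \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \sum_{j \in I} p_{ij}^{(k)}
= 1
\]

Use $\pi_j = 0$ for transient $j$.

Note: all states in $R$ are positive recurrent (see also slide 20); and $(\pi_j)_{j \in R}$ forms a probability mass function on $R$.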

Infinite Recurrent Sets 

In the situation of the previous slide but with $|R| = \infty$. Without proof:

▶ When $R$ is null-recurrent: $\pi_j = 0$ for all $j \in I$;

▶ When $R$ is positive recurrent: $\sum_{j \in I} \pi_j = \sum_{j \in R} \pi_j = 1$; i.e., $(\pi_j)_{j \in R}$ forms a probability mass function.


Finite Irreducible Sets

Corollary

Finite irreducible sets consist of positive recurrent states only.

Proof: Let $C$ be a finite irreducible set. That is,

▶ For any $i \in C$ it holds that $\sum_{j \in C} p_{ij} = 1$; thus, also for all $n$, $\sum_{j \in C} p_{ij}^{(n)} = 1$;

▶ All states in $C$ communicate;

▶ All states have the same classification.

Suppose they are transient or null-recurrent; then all $\pi_j = 0$. For any $i \in C$ this gives a contradiction:

\[
0 = \sum_{j \in C} \pi_j
= \sum_{j \in C} \Big( \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)} \Big)
\overset{\text{finite set}}{=} \lim_{n\to\infty} \frac{1}{n} \sum_{j \in C} \sum_{k=1}^{n} p_{ij}^{(k)}
= \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \sum_{j \in C} p_{ij}^{(k)}
= 1
\]

Infinite Transient or Null-recurrent Sets

▶ Infinite irreducible sets can be transient or null-recurrent.

▶ Example of a Random Walk.

▶ $I = \{0, 1, \ldots\}$, $p \in (0, 1)$, $q = 1 - p$.

\[
P = \begin{pmatrix}
0 & 1 & 0 & 0 & \cdots \\
q & 0 & p & 0 & \cdots \\
0 & q & 0 & p & \cdots \\
0 & 0 & q & 0 & \ddots \\
\vdots & \vdots & \vdots & \ddots & \ddots
\end{pmatrix}
\]

▶ If $p > q$ the chain is transient; if $q = p = 0.5$ the chain is null-recurrent; if $p < q$ the chain is positive recurrent.


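Equilibrium

Suppose that the Markov chain satisfies the unichain condition with a finite set of transient states ($|T| < \infty$). For all states $j \in I$ (and arbitrary $r \in I$):

\[
\begin{aligned}
(\pi P)_j = \sum_{i \in I} \pi_i p_{ij}
&= \sum_{i \in I} \Big( \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ri}^{(k)} \Big) p_{ij} \\
&\overset{!}{=} \lim_{n\to\infty} \frac{1}{n} \sum_{i \in I} \sum_{k=1}^{n} p_{ri}^{(k)} p_{ij}
 = \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \sum_{i \in I} p_{ri}^{(k)} p_{ij} \\
&= \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{rj}^{(k+1)}
 = \lim_{n\to\infty} \frac{1}{n} \sum_{k=2}^{n+1} p_{rj}^{(k)} \\
&= \lim_{n\to\infty} \frac{1}{n} \Big( \sum_{k=1}^{n+1} p_{rj}^{(k)} - p_{rj} \Big) \\
&= \lim_{n\to\infty} \Big( \frac{n+1}{n} \cdot \frac{1}{n+1} \sum_{k=1}^{n+1} p_{rj}^{(k)} - \frac{1}{n}\, p_{rj} \Big)
 = \pi_j
\end{aligned}
\]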


Equilibrium Distribution

Definition 3.3.2

A probability distribution $(\pi_j)_{j \in I}$ is an equilibrium distribution for the Markov chain if

\[
\pi_j = \sum_{i \in I} \pi_i p_{ij} \quad (j \in I)
\]

Corollary

Let $\pi$ be an equilibrium distribution for the Markov chain $\{X_n, n = 0, 1, \ldots\}$ and suppose that the chain starts in equilibrium; then it remains in equilibrium; i.e.,

\[
\mathbb{P}(X_0 = j) = \pi_j \ \ \forall j \in I \ \Rightarrow\ \mathbb{P}(X_n = j) = \pi_j \ \ \forall j \in I \ \ \forall n = 1, 2, \ldots
\]
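The invariance stated in the corollary is easy to check numerically. The sketch below assumes the sub-chain on the closed set {6, 7, 8, 9} of the nine-state example together with the limiting row (0.075, 0.175, 0.5, 0.25) read off from the matrix powers; the names `P_R2` and `step` are illustrative.

```python
# Sub-chain on R2 = {6, 7, 8, 9} of the example (indices 0..3).
P_R2 = [
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.3, 0.7, 0.0, 0.0],
]
pi = [0.075, 0.175, 0.5, 0.25]


def step(dist, P):
    """Distribution of X_{n+1} given the distribution of X_n (row vector times P)."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


# Start in equilibrium and propagate ten steps; the distribution should not move.
dist = pi[:]
for _ in range(10):
    dist = step(dist, P_R2)

print(dist)
```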


Existence and Uniqueness of Equilibrium Distribution

Theorem 3.3.2 & Theorem 3.5.9

Assume the unichain condition with a finite transient set and a positive recurrent set (cf. Assumption 3.3.1); then the probabilistic long-run averages $(\pi_j)$ (see slide 15) form the unique equilibrium distribution.

Proof: for finite recurrent sets:

▶ Existence follows from slides 22 and 18;

▶ Uniqueness: suppose that $(x_j)_j$ satisfies $x_j = \sum_{i \in I} x_i p_{ij}$. See page 128 (book) for concluding that $x_j = c\,\pi_j$; i.e., $\pi$ is the only distribution.

Infinite recurrent sets: more involved.


Limiting Probabilities

Equation (3.5.11)

A. For transient states $j$ and any initial state $i \in I$

\[
\lim_{n\to\infty} p_{ij}^{(n)} = 0
\]

B. For recurrent aperiodic states $j$ and any initial state $i \in I$

\[
\lim_{n\to\infty} p_{ij}^{(n)} = f_{ij}\,\pi_j
\]

Proof: see lecture 3 for A. Part B is more advanced and outside the scope of this course.


Unichain Case

Corollary

Suppose that the Markov chain satisfies the unichain condition with a finite set of transient states ($|T| < \infty$) and with an irreducible set $R$ of aperiodic recurrent states. Then for all states $i, j \in I$

\[
\lim_{n\to\infty} p_{ij}^{(n)} = \pi_j
\]

(i.e., independent of the initial state).

Proof: $\pi_j = 0$ for $j \in T$; and $f_{ij} = 1$ for $j \in R$ (see slide 7). Then apply previous slide.

Note: in case of positive recurrence $\pi$ is a probability distribution ($\sum_j \pi_j = 1$), see slide 24.


Summary 

Suppose Assumption 3.3.1 (equivalently, unichain with finite transient set $T$ and positive recurrent set $R$). Suppose aperiodicity of the recurrent states. Then

1. There is a unique equilibrium distribution $\pi = (\pi_j)_{j \in I}$:

\[
\pi = \pi P
\]

2. $\pi_j = 0$ for $j \in T$ and $\pi_j > 0$ for $j \in R$;

3. $\pi$ is the limiting distribution:

\[
\lim_{n\to\infty} p_{ij}^{(n)} = \pi_j
\]

4. $\pi$ is the long-run average distribution:

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)} = \pi_j
\]


§3.3.3 Markov Chains with Rewards 

▶ Let $f : I \to \mathbb{R}$ be a reward or cost function;

▶ $\sum_{k=1}^{n} f(X_k)$ is the total reward up to time $n$;

▶ $\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} f(X_k)$ is the long-run average reward per unit of time;

▶ We wish to have an ergodic (or Markov-reward) property

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} f(X_k) = \sum_{j \in I} \pi_j f(j) \quad \text{(w.p. 1)}
\]

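Finite Unichain Case

Ergodic Theorem Finite Case

Suppose that the Markov chain satisfies the unichain condition with a finite set of transient states ($|T| < \infty$) and with a finite irreducible set of recurrent states ($|R| < \infty$). Then

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} f(X_k) = \sum_{j \in I} \pi_j f(j) \quad \text{(w.p. 1)}
\]

Proof: let $r$ be an arbitrary initial state

\[
\frac{1}{n} \sum_{k=1}^{n} f(X_k)
= \frac{1}{n} \sum_{k=1}^{n} \sum_{j \in I} \mathbf{1}\{X_k = j \mid X_0 = r\}\, f(j)
= \sum_{j \in I} \Big( \frac{1}{n} \sum_{k=1}^{n} \mathbf{1}\{X_k = j \mid X_0 = r\} \Big) f(j).
\]

Take $n \to \infty$; interchanging the limit and the finite sum is allowed; apply the empirical state average property (slide 12).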
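The sample-path statement can be illustrated by simulation. The following is a sketch only: it assumes the closed sub-chain on {6, 7, 8, 9} from the nine-state example, a hypothetical reward vector `f`, and a fixed seed; none of these names come from the lecture. One long trajectory is generated and its time-averaged reward is compared with the weighted sum of rewards under the equilibrium probabilities.

```python
import random

# Sub-chain on R2 = {6, 7, 8, 9} of the example (indices 0..3).
P_R2 = [
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.3, 0.7, 0.0, 0.0],
]
pi = [0.075, 0.175, 0.5, 0.25]   # equilibrium probabilities of the sub-chain
f = [10.0, 0.0, 2.0, 4.0]        # hypothetical rewards per state


def simulated_average_reward(P, f, start, n, seed=0):
    """Time average of f(X_1), ..., f(X_n) along one simulated trajectory."""
    rng = random.Random(seed)
    states = range(len(P))
    state, total = start, 0.0
    for _ in range(n):
        # Draw the next state from row `state` of P.
        state = rng.choices(states, weights=P[state])[0]
        total += f[state]
    return total / n


print(simulated_average_reward(P_R2, f, start=0, n=100_000))
print(sum(p * r for p, r in zip(pi, f)))
```

The two printed numbers should be close; the first is random and depends on the seed, the second is the deterministic right-hand side of the theorem.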


Full Case

Ergodic Theorem 3.3.3 and 3.5.11

Assume

(i). Unichain with finite transient set;

(ii). $\sum_{j \in I} |f(j)|\,\pi_j < \infty$.

Then

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} f(X_k) = \sum_{j \in I} \pi_j f(j) \quad \text{(w.p. 1)}
\]

Proof: see book.


Expected Reward

▶ See Remark 3.3.1 in book;

▶ The result of the previous slide holds also for the expected cost:

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \mathbb{E}[f(X_k)] = \sum_{j \in I} \pi_j f(j)
\]

▶ Proof: apply dominated convergence.
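The expected-reward version involves no randomness and can be computed exactly by propagating the state distribution. The sketch below assumes the closed sub-chain on {6, 7, 8, 9} of the nine-state example and a hypothetical reward vector `f`; the function name is illustrative.

```python
# Sub-chain on R2 = {6, 7, 8, 9} of the example (indices 0..3).
P_R2 = [
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.3, 0.7, 0.0, 0.0],
]
pi = [0.075, 0.175, 0.5, 0.25]   # equilibrium probabilities of the sub-chain
f = [10.0, 0.0, 2.0, 4.0]        # hypothetical rewards per state


def average_expected_reward(P, f, start, n):
    """(1/n) * sum over k = 1..n of E[f(X_k) | X_0 = start]."""
    size = len(P)
    dist = [1.0 if i == start else 0.0 for i in range(size)]
    total = 0.0
    for _ in range(n):
        # dist becomes row `start` of the next power of P.
        dist = [sum(dist[i] * P[i][j] for i in range(size)) for j in range(size)]
        total += sum(d * r for d, r in zip(dist, f))
    return total / n


print(average_expected_reward(P_R2, f, start=0, n=20_000))
print(sum(p * r for p, r in zip(pi, f)))
```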

§3.4.1 Computation of Equilibrium Probabilities 

▶ Given $P$ finite, irreducible, thus positive recurrent;

▶ Problem: compute the equilibrium distribution $\pi$;

▶ Linear system

\[
\begin{cases}
\sum_{i \in I} \pi_i p_{ij} = \pi_j & (j \in I) \\
\sum_{j \in I} \pi_j = 1
\end{cases}
\]

▶ In many applications we deal with large state spaces but sparse matrices $P$; it is then efficient to use an iterative method;

▶ Jacobi or Gauss-Seidel.


Rewrite

▶ Rewrite to classic form $Ax = b$:

\[
\pi^T P = \pi^T \ \Leftrightarrow\ (I - P^T)\,\pi = 0
\]

▶ Delete a row (why?!) and add $\sum_{j \in I} \pi_j = 1$.

▶ Example

\[
P = \begin{pmatrix}
0.4 & 0.4 & 0.0 & 0.2 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
0.0 & 0.5 & 0.3 & 0.0 & 0.2 & 0.0 & 0.0 & 0.0 & 0.0 \\
0.5 & 0.0 & 0.4 & 0.0 & 0.0 & 0.1 & 0.0 & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0 & 0.4 & 0.3 & 0.0 & 0.3 & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0 & 0.0 & 0.2 & 0.0 & 0.5 & 0.3 & 0.0 \\
0.0 & 0.0 & 0.0 & 0.5 & 0.0 & 0.1 & 0.0 & 0.0 & 0.4 \\
0.6 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.2 & 0.2 & 0.0 \\
0.0 & 0.5 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.4 & 0.1 \\
0.0 & 0.0 & 0.7 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.3
\end{pmatrix}
\]


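Example (cont’d)

With the last row of $I - P^T$ replaced by the normalization equation:

\[
A = \begin{pmatrix}
0.6 & 0.0 & -0.5 & 0.0 & 0.0 & 0.0 & -0.6 & 0.0 & 0.0 \\
-0.4 & 0.5 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & -0.5 & 0.0 \\
0.0 & -0.3 & 0.6 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & -0.7 \\
-0.2 & 0.0 & 0.0 & 0.6 & 0.0 & -0.5 & 0.0 & 0.0 & 0.0 \\
0.0 & -0.2 & 0.0 & -0.3 & 0.8 & 0.0 & 0.0 & 0.0 & 0.0 \\
0.0 & 0.0 & -0.1 & 0.0 & 0.0 & 0.9 & 0.0 & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0 & -0.3 & -0.5 & 0.0 & 0.8 & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0 & 0.0 & -0.3 & 0.0 & -0.2 & 0.6 & 0.0 \\
1.0 & 1.0 & 1.0 & 1.0 & 1.0 & 1.0 & 1.0 & 1.0 & 1.0
\end{pmatrix},
\qquad
b = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}
\]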
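The construction of $A$ and $b$ from $P$ can be sketched in a few lines. The code below is an illustration under stated assumptions: it builds the system for the nine-state example and, instead of an iterative method, solves it by plain Gaussian elimination with partial pivoting, which is adequate for a matrix this small. The helper names `build_system` and `solve` are not from the lecture.

```python
# Transition matrix of the nine-state example in this section.
P = [
    [0.4, 0.4, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.3, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.4, 0.0, 0.0, 0.1, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.4, 0.3, 0.0, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.2, 0.0, 0.5, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.5, 0.0, 0.1, 0.0, 0.0, 0.4],
    [0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 0.2, 0.0],
    [0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.4, 0.1],
    [0.0, 0.0, 0.7, 0.0, 0.0, 0.0, 0.0, 0.0, 0.3],
]


def build_system(P):
    """Rows 0..n-2: balance equations of (I - P^T) x = 0; last row: sum of x equals 1."""
    n = len(P)
    A = [[(1.0 if i == j else 0.0) - P[j][i] for j in range(n)] for i in range(n)]
    A[n - 1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    return A, b


def solve(A, b):
    """Gaussian elimination with partial pivoting on the augmented matrix."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


A, b = build_system(P)
pi = solve(A, b)
print([round(v, 4) for v in pi])
```

A useful sanity check on any computed solution is that it sums to one and satisfies every balance equation, including the one that was deleted.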


Gauss-Seidel Method 

▶ Construct a sequence of vectors $x^{(0)}, x^{(1)}, \ldots$ by $x_i^{(0)} = 1/|I|$, and for $k = 0, 1, \ldots$:

\[
x_i^{(k+1)} = \Big( b_i - \sum_{j < i} a_{ij}\, x_j^{(k+1)} - \sum_{j > i} a_{ij}\, x_j^{(k)} \Big) \Big/ a_{ii}.
\]
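A Gauss-Seidel sweep updates the components one at a time and always uses the newest available values of the other components. The sketch below is a minimal illustration on a hypothetical two-state chain, chosen because the iteration can be checked by hand; the function name `gauss_seidel` and the fixed number of sweeps are assumptions. Convergence is not guaranteed for every system of this form or every ordering of the equations, so in practice a stopping test on the change between sweeps would be added.

```python
def gauss_seidel(A, b, sweeps=100):
    """Run a fixed number of Gauss-Seidel sweeps starting from the uniform vector."""
    n = len(A)
    x = [1.0 / n] * n                      # x_i^(0) = 1/|I|
    for _ in range(sweeps):
        for i in range(n):
            # x[j] for j < i already holds the new iterate, for j > i the old one.
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x


# Hypothetical two-state chain; its equilibrium distribution is (1/3, 2/3).
P = [[0.5, 0.5],
     [0.25, 0.75]]

# First balance equation of (I - P^T) x = 0, then the normalization row.
A = [[1.0 - P[0][0], -P[1][0]],
     [1.0, 1.0]]
b = [0.0, 1.0]

pi = gauss_seidel(A, b)
print(pi)
```

For this chain each sweep halves the error, since the update reduces to x2 ← 1 − 0.5·x2.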

Exercises 

Chapter 3 (pp 134 - 138): 

3.10, 3.11, 3.12, 3.14, 3.16, 3.17
