Lecture Notes - Department of Mathematics and Statistics - Queen's ...
CHAPTER 3. CLASSIFICATION OF MARKOV CHAINS
Now,
\begin{align*}
\|\pi P - \pi' P\|_1 &= \|\psi P - \psi' P\|_1 \\
&= \sum_j \Bigl| \sum_i \psi(i) P(i,j) - \sum_k \psi'(k) P(k,j) \Bigr| \\
&= \frac{1}{\|\psi'\|_1} \sum_j \Bigl| \sum_k \sum_i \psi(i)\,\psi'(k)\,P(i,j) - \sum_k \sum_i \psi(i)\,\psi'(k)\,P(k,j) \Bigr| \tag{3.5} \\
&\leq \frac{1}{\|\psi'\|_1} \sum_j \sum_k \sum_i \psi(i)\,\psi'(k)\,\bigl|P(i,j) - P(k,j)\bigr| \tag{3.6} \\
&= \frac{1}{\|\psi'\|_1} \sum_k \sum_i \psi(i)\,\psi'(k) \sum_j \bigl|P(i,j) - P(k,j)\bigr| \\
&= \frac{1}{\|\psi'\|_1} \sum_k \sum_i |\psi(i)|\,|\psi'(k)| \sum_j \bigl\{ P(i,j) + P(k,j) - 2\min\bigl(P(i,j), P(k,j)\bigr) \bigr\} \tag{3.7} \\
&\leq \frac{1}{\|\psi'\|_1} \sum_k \sum_i |\psi(i)|\,|\psi'(k)|\,\bigl(2 - 2\delta(P)\bigr) \tag{3.8} \\
&= \|\psi'\|_1 \bigl(2 - 2\delta(P)\bigr) \tag{3.9} \\
&= \|\pi - \pi'\|_1 \bigl(1 - \delta(P)\bigr). \tag{3.10}
\end{align*}
In the above, (3.5) follows from multiplying each term by $\sum_k \psi'(k)/\|\psi'\|_1 = 1$ (that is, adding terms to the summation), (3.6) from taking the absolute value inside the sums (the triangle inequality), (3.7) from the identity $|a - b| = a + b - 2\min(a, b)$ for nonnegative $a, b$, (3.8) from the definition of $\delta(P)$, and finally (3.9) follows from the $\ell_1$ norms of $\psi, \psi'$.
As such, the map $\pi \mapsto \pi P$ is a contraction whenever $\delta(P) > 0$. In essence, one shows that the sequence $\{\pi_0 P^m\}$ is Cauchy in the $\ell_1$ norm; since every Cauchy sequence in a Banach space has a limit, the sequence converges, and the limit is the invariant distribution. ⊓⊔
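The contraction inequality (3.10) can be checked numerically. The sketch below (the transition matrix and the two distributions are illustrative assumptions, not taken from the notes) computes the Dobrushin coefficient $\delta(P) = \min_{i,k} \sum_j \min(P(i,j), P(k,j))$ and verifies that $\|\pi P - \pi' P\|_1 \leq (1 - \delta(P))\,\|\pi - \pi'\|_1$:

```python
import numpy as np

def dobrushin_delta(P):
    """Dobrushin ergodic coefficient:
    delta(P) = min over row pairs (i, k) of sum_j min(P[i,j], P[k,j])."""
    n = P.shape[0]
    return min(np.minimum(P[i], P[k]).sum()
               for i in range(n) for k in range(n))

# Illustrative two-state chain (assumed numbers, not from the notes).
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
delta = dobrushin_delta(P)                 # here delta(P) = 0.7

# Two arbitrary probability distributions.
pi  = np.array([0.9, 0.1])
pi2 = np.array([0.2, 0.8])

lhs = np.abs(pi @ P - pi2 @ P).sum()       # ||pi P - pi' P||_1
rhs = (1 - delta) * np.abs(pi - pi2).sum() # (1 - delta(P)) ||pi - pi'||_1
assert lhs <= rhs + 1e-12
```

For this two-state example the bound happens to hold with equality; in general (3.10) may be conservative.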
Dobrushin’s ergodic theorem also guarantees that this limit, the invariant distribution, is unique.
It should also be noted that Dobrushin’s theorem tells us how fast the sequence of probability distributions $\{\pi_0 P^n\}$ converges to the invariant distribution $\pi$ for an arbitrary initial distribution $\pi_0$: iterating the contraction in (3.10) gives $\|\pi_0 P^n - \pi\|_1 \leq (1 - \delta(P))^n \|\pi_0 - \pi\|_1$.
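This geometric rate is easy to observe numerically. A minimal sketch, using an assumed two-state chain (the numbers are illustrative, not from the notes), iterates $\pi_0 P^m$ and checks the bound $\|\pi_0 P^m - \pi\|_1 \leq (1 - \delta(P))^m \|\pi_0 - \pi\|_1$ at every step:

```python
import numpy as np

# Illustrative two-state chain (assumed numbers, not from the notes).
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])

# Dobrushin coefficient: delta(P) = min_{i,k} sum_j min(P[i,j], P[k,j]).
n = P.shape[0]
delta = min(np.minimum(P[i], P[k]).sum()
            for i in range(n) for k in range(n))

# For a 2x2 chain the invariant distribution is proportional to
# (P[1,0], P[0,1]); one can check that mu @ P equals mu.
mu = np.array([P[1, 0], P[0, 1]])
mu = mu / mu.sum()

pi0 = np.array([1.0, 0.0])   # arbitrary initial distribution
pi = pi0.copy()
for m in range(1, 20):
    pi = pi @ P
    gap = np.abs(pi - mu).sum()                       # ||pi_0 P^m - mu||_1
    bound = (1 - delta) ** m * np.abs(pi0 - mu).sum() # geometric envelope
    assert gap <= bound + 1e-12
```

After a handful of iterations the gap is already negligible, reflecting the $(1 - \delta(P))^m$ decay.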
Ergodic Theorem for Countable State Space Chains<br />
For a Markov chain with a unique invariant distribution $\mu$, we have, almost surely,
\[
\lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} f(x_t) = \sum_i f(i)\,\mu(i)
\]
for every bounded $f : \mathsf{X} \to \mathbb{R}$.
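The statement can be illustrated by simulation. In the sketch below the chain, the function $f$, and the horizon are all assumptions made for illustration; the time average of $f$ along one sample path is compared with the spatial average $\sum_i f(i)\mu(i)$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-state chain (assumed numbers, not from the notes);
# its invariant distribution is proportional to (P[1,0], P[0,1]).
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
mu = np.array([P[1, 0], P[0, 1]])
mu = mu / mu.sum()

f = np.array([1.0, 5.0])       # an arbitrary bounded f: X -> R

# Simulate x_1, ..., x_T and form the time average (1/T) sum_t f(x_t).
T, x, total = 200_000, 0, 0.0
for _ in range(T):
    x = 1 if rng.random() < P[x, 1] else 0   # one step of the chain
    total += f[x]
time_avg = total / T
space_avg = float(f @ mu)      # sum_i f(i) mu(i)

# Birkhoff: the two agree almost surely as T -> infinity;
# here they agree up to Monte Carlo error of order 1/sqrt(T).
assert abs(time_avg - space_avg) < 0.05
```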
This is called the ergodic theorem, due to Birkhoff, and it is a very powerful result: in essence, this property is what connects stochastic control with the long-run behavior of Markov chains. In particular, for a stationary control policy that leads to a unique invariant distribution and has bounded costs, it follows that, almost surely,
\[
\lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} c(x_t, u_t) = \sum_{x,u} c(x,u)\,\mu(x,u)
\]
for every real-valued bounded cost function $c$, where $\mu(x,u)$ is the invariant occupation measure on state-action pairs. The ergodic theorem is what makes a dynamic optimization problem equivalent to a static optimization problem, under mild technical conditions. This sets up the core ideas of the convex analytic approach and the linear programming approach, which will be discussed later.
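The dynamic/static equivalence can be sketched numerically for a controlled chain. Everything below (states, actions, costs, and the policy $g$) is a hypothetical example: a fixed stationary policy induces a Markov chain, and the long-run average cost along a sample path matches the static expectation of $c$ under the occupation measure:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical controlled chain: two states, two actions (all numbers
# are illustrative assumptions). The stationary policy g picks u = g[x].
g = np.array([0, 1])
c = np.array([[1.0, 2.0],        # cost c(x, u)
              [3.0, 0.5]])
P_g = np.array([[0.7, 0.3],      # P(. | x = 0, u = g[0])
                [0.4, 0.6]])     # P(. | x = 1, u = g[1])

# Invariant distribution of the induced chain; the occupation measure
# mu(x, u) puts mass mu(x) on the pair (x, g[x]) and zero elsewhere.
mu = np.array([P_g[1, 0], P_g[0, 1]])
mu = mu / mu.sum()
static_cost = sum(c[x, g[x]] * mu[x] for x in range(2))  # sum c(x,u) mu(x,u)

# Dynamic side: long-run time average of the incurred costs.
T, x, total = 200_000, 0, 0.0
for _ in range(T):
    total += c[x, g[x]]                       # pay c(x_t, u_t)
    x = 1 if rng.random() < P_g[x, 1] else 0  # transition under the policy
time_avg = total / T
assert abs(time_avg - static_cost) < 0.05
```

The static value on the right is exactly the kind of linear functional of the occupation measure that the convex analytic and linear programming formulations optimize over.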