Student Notes To Accompany MS4214: STATISTICAL INFERENCE
2.8 Optimality Properties of the MLE
Suppose that an experiment consists of measuring random variables $X_1, X_2, \ldots, X_n$ which are iid with probability distribution depending on a parameter $\theta$. Let $\hat\theta$ be the MLE of $\theta$. Define
$$W_1 = \sqrt{E[I(\theta)]}\,(\hat\theta - \theta),$$
$$W_2 = \sqrt{I(\theta)}\,(\hat\theta - \theta),$$
$$W_3 = \sqrt{E[I(\hat\theta)]}\,(\hat\theta - \theta),$$
$$W_4 = \sqrt{I(\hat\theta)}\,(\hat\theta - \theta).$$
Then $W_1$, $W_2$, $W_3$ and $W_4$ are all random variables and, as $n \to \infty$, the probabilistic behaviour of each is well approximated by that of a $N(0,1)$ random variable. In particular, since $E[W_1] \approx 0$, we have $E[\hat\theta] \approx \theta$, and so $\hat\theta$ is approximately unbiased. Also, $\operatorname{Var}[W_1] \approx 1$ implies that $\operatorname{Var}[\hat\theta] \approx (E[I(\theta)])^{-1}$, and so $\hat\theta$ is approximately efficient.
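The approximate standard normality of these pivots can be checked by simulation. The sketch below (not from the notes; the Bernoulli model, sample sizes and seed are illustrative choices) repeatedly draws $n$ Bernoulli($\theta$) trials, for which $\hat\theta = \bar x$, the expected information is $E[I(\theta)] = n/(\theta(1-\theta))$, and the observed information is $I(\theta) = s/\theta^2 + (n-s)/(1-\theta)^2$ for $s$ observed successes. It then standardises $\hat\theta$ as in $W_1$ and $W_4$:

```python
import numpy as np

# Illustrative simulation (assumed setup, not from the notes):
# Bernoulli(theta) trials, MLE theta_hat = sample proportion.
rng = np.random.default_rng(0)
theta, n, reps = 0.3, 200, 5000

w1 = np.empty(reps)  # W1: expected information at the true theta
w4 = np.empty(reps)  # W4: observed information evaluated at the MLE

for r in range(reps):
    x = rng.binomial(1, theta, size=n)
    s = x.sum()
    theta_hat = s / n                           # MLE of theta
    exp_info = n / (theta * (1 - theta))        # E[I(theta)]
    obs_info = s / theta_hat**2 + (n - s) / (1 - theta_hat)**2  # I(theta_hat)
    w1[r] = np.sqrt(exp_info) * (theta_hat - theta)
    w4[r] = np.sqrt(obs_info) * (theta_hat - theta)

# Both standardised quantities should have mean near 0 and sd near 1.
print(w1.mean(), w1.std())
print(w4.mean(), w4.std())
```

With $n = 200$ both sample means come out near 0 and both sample standard deviations near 1, consistent with the $N(0,1)$ approximation above.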
Let the data $X$ have probability distribution $g(X; \theta)$, where $\theta = (\theta_1, \theta_2, \ldots, \theta_m)$ is a vector of $m$ unknown parameters. Let $I(\theta)$ be the $m \times m$ information matrix as defined above, and let $E[I(\theta)]$ be the $m \times m$ matrix obtained by replacing the elements of $I(\theta)$ by their expected values. Let $\hat\theta$ be the MLE of $\theta$, and let $\mathrm{CRLB}_r$ be the $r$th diagonal element of $[E[I(\theta)]]^{-1}$. For $r = 1, 2, \ldots, m$, define
$$W_{1r} = (\hat\theta_r - \theta_r)/\sqrt{\mathrm{CRLB}_r}.$$
Then, as $n \to \infty$, $W_{1r}$ behaves like a standard normal random variable.
Suppose we define $W_{2r}$ by replacing $\mathrm{CRLB}_r$ by the $r$th diagonal element of the matrix $[I(\theta)]^{-1}$, $W_{3r}$ by replacing $\mathrm{CRLB}_r$ by the $r$th diagonal element of the matrix $[E[I(\hat\theta)]]^{-1}$, and $W_{4r}$ by replacing $\mathrm{CRLB}_r$ by the $r$th diagonal element of the matrix $[I(\hat\theta)]^{-1}$. Then it can be shown that, as $n \to \infty$, $W_{2r}$, $W_{3r}$ and $W_{4r}$ all behave like standard normal random variables.
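A concrete two-parameter case can illustrate this. For $X_i$ iid $N(\mu, \sigma^2)$ with $\theta = (\mu, \sigma^2)$, the MLEs are $\hat\mu = \bar x$ and $\hat\sigma^2 = n^{-1}\sum (x_i - \bar x)^2$, and $E[I(\theta)] = \mathrm{diag}(n/\sigma^2,\; n/(2\sigma^4))$, giving $\mathrm{CRLB}_1 = \sigma^2/n$ and $\mathrm{CRLB}_2 = 2\sigma^4/n$. The simulation below (an illustrative sketch; the parameter values and seed are arbitrary choices, not from the notes) forms $W_{11}$ and $W_{12}$:

```python
import numpy as np

# Illustrative simulation (assumed setup): N(mu, sigma^2) samples,
# standardised component-wise by the diagonal of [E[I(theta)]]^{-1}.
rng = np.random.default_rng(1)
mu, sigma2, n, reps = 5.0, 4.0, 300, 4000

w_mu = np.empty(reps)   # W_11: standardised mu_hat
w_s2 = np.empty(reps)   # W_12: standardised sigma^2_hat

for r in range(reps):
    x = rng.normal(mu, np.sqrt(sigma2), size=n)
    mu_hat = x.mean()
    s2_hat = ((x - mu_hat) ** 2).mean()   # MLE of sigma^2 (divisor n, not n-1)
    w_mu[r] = (mu_hat - mu) / np.sqrt(sigma2 / n)             # sqrt(CRLB_1)
    w_s2[r] = (s2_hat - sigma2) / np.sqrt(2 * sigma2**2 / n)  # sqrt(CRLB_2)

# Each component should be approximately N(0, 1) for large n.
print(w_mu.mean(), w_mu.std())
print(w_s2.mean(), w_s2.std())
```

Note that $\hat\sigma^2$ carries a small finite-sample bias of order $1/n$ (it uses divisor $n$), so $W_{12}$ is centred near, but not exactly at, zero for moderate $n$.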
2.9 Data Reduction<br />
Definition 2.11 (Sufficiency). Consider a statistic $T = t(X)$ that summarises the data so that no information about $\theta$ is lost. Then we call $t(X)$ a sufficient statistic. $\square$
Example 2.12. $T = t(X) = \bar X$ is sufficient for $\mu$ when $X_i \sim$ iid $N(\mu, \sigma^2)$. $\square$
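One way to see why $\bar X$ carries all the information about $\mu$ (taking $\sigma^2$ as known) is to split the sum of squares in the likelihood about $\bar x$; the following is a sketch of the standard factorisation argument, not a step taken in the notes themselves:

```latex
f(x; \mu)
  = (2\pi\sigma^2)^{-n/2}
    \exp\!\Big(-\tfrac{1}{2\sigma^2}\textstyle\sum_{i=1}^{n}(x_i-\mu)^2\Big).
% Using  sum (x_i - mu)^2 = sum (x_i - xbar)^2 + n (xbar - mu)^2 :
f(x; \mu)
  = \underbrace{\exp\!\Big(-\tfrac{n(\bar x-\mu)^2}{2\sigma^2}\Big)}_{%
      \text{depends on the data only through } \bar x}
    \times
    \underbrace{(2\pi\sigma^2)^{-n/2}
      \exp\!\Big(-\tfrac{1}{2\sigma^2}\textstyle\sum_{i=1}^{n}(x_i-\bar x)^2\Big)}_{%
      \text{free of } \mu}.
```

Since the only factor involving $\mu$ depends on the data through $\bar x$ alone, no information about $\mu$ is lost by reducing the sample to $\bar X$.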
To better understand the motivation behind the concept of sufficiency, consider three independent Binomial trials where $\theta = P(X = 1)$.