Student Notes To Accompany MS4214: STATISTICAL INFERENCE

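Definition 2.8 (Score function). For the (possibly vector valued) observation $X = x$ to be informative about $\theta$, the density must vary with $\theta$. If $f(x|\theta)$ is smooth and differentiable, this change is quantified to first order by the score function

\[ S(\theta) = \frac{\partial}{\partial\theta} \ln f(x|\theta) \equiv \frac{f'(x|\theta)}{f(x|\theta)}. \]

Under suitable regularity conditions (differentiation wrt $\theta$ and integration wrt $x$ can be interchanged), we have

\[ E\{S(\theta)\} = \int \frac{f'(x|\theta)}{f(x|\theta)}\, f(x|\theta)\,dx = \int f'(x|\theta)\,dx = \frac{\partial}{\partial\theta}\left\{\int f(x|\theta)\,dx\right\} = \frac{\partial}{\partial\theta}\, 1 = 0. \]

Thus the score function has expectation zero.

True frequentism evaluates the properties of estimators based on their "long-run" behaviour. The value of $x$ will vary from sample to sample so we have treated the score function as a random variable and looked at its average across all possible samples.

Lemma 2.7 (Fisher information). The variance of $S(\theta)$ is the expected Fisher information about $\theta$

\[ I(\theta) = E\{S(\theta)^2\} \equiv E\left[\left\{\frac{\partial}{\partial\theta} \ln f(x|\theta)\right\}^2\right]. \]

Proof. Using the chain rule

\[ \frac{\partial^2}{\partial\theta^2} \ln f = \frac{\partial}{\partial\theta}\left\{\frac{1}{f}\frac{\partial f}{\partial\theta}\right\} = -\frac{1}{f^2}\left(\frac{\partial f}{\partial\theta}\right)^2 + \frac{1}{f}\frac{\partial^2 f}{\partial\theta^2} = -\left(\frac{\partial \ln f}{\partial\theta}\right)^2 + \frac{1}{f}\frac{\partial^2 f}{\partial\theta^2}. \]

If integration and differentiation can be interchanged

\[ E\left\{\frac{1}{f}\frac{\partial^2 f}{\partial\theta^2}\right\} = \int_X \frac{\partial^2 f}{\partial\theta^2}\,dx = \frac{\partial^2}{\partial\theta^2}\int_X f\,dx = \frac{\partial^2}{\partial\theta^2}\, 1 = 0, \]

thus

\[ -E\left\{\frac{\partial^2}{\partial\theta^2} \ln f(x|\theta)\right\} = E\left[\left\{\frac{\partial}{\partial\theta} \ln f(x|\theta)\right\}^2\right] = I(\theta). \tag{2.1} \]

Variance measures lack of knowledge. Reasonable that the reciprocal of the variance should be defined as the amount of information carried by the (possibly vector valued) observation $x$ about $\theta$.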
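The two identities above (the score has expectation zero, and its variance equals minus the expected second derivative of the log density) can be checked numerically. The sketch below is only an illustration under assumptions that are not part of the notes: it uses a single Bernoulli($\theta$) observation, so the integrals become sums over $x \in \{0, 1\}$, and the value $\theta = 0.3$ and all function names are hypothetical choices.

```python
# Illustrative check of E{S(theta)} = 0 and identity (2.1) for one
# Bernoulli(theta) observation. Assumption: a discrete model, so the
# integrals in the notes are replaced by sums over x in {0, 1}.

def bernoulli_pmf(x, theta):
    """f(x|theta) = theta^x (1 - theta)^(1 - x) for x in {0, 1}."""
    return theta if x == 1 else 1.0 - theta

def score(x, theta):
    """d/dtheta ln f(x|theta)."""
    return x / theta - (1 - x) / (1.0 - theta)

def second_derivative(x, theta):
    """d^2/dtheta^2 ln f(x|theta)."""
    return -x / theta**2 - (1 - x) / (1.0 - theta) ** 2

def expectation(g, theta):
    """E{g(X, theta)}: sum of g weighted by the pmf over both outcomes."""
    return sum(g(x, theta) * bernoulli_pmf(x, theta) for x in (0, 1))

theta = 0.3  # hypothetical parameter value

mean_score = expectation(score, theta)
var_score = expectation(lambda x, t: score(x, t) ** 2, theta)
neg_exp_second = -expectation(second_derivative, theta)
closed_form = 1.0 / (theta * (1.0 - theta))  # known I(theta) for Bernoulli

print("E{S}            =", mean_score)
print("E{S^2}          =", var_score)
print("-E{d2 ln f}     =", neg_exp_second)
print("1/(theta(1-th)) =", closed_form)
```

Because $E\{S(\theta)\} = 0$, the second moment $E\{S(\theta)^2\}$ computed here is also the variance of the score; up to floating-point rounding the last three printed values should coincide.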
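Theorem 2.8 (Cramér Rao lower bound). Let $\hat\theta$ be an unbiased estimator of $\theta$. Then

\[ \mathrm{Var}(\hat\theta) \geq \{I(\theta)\}^{-1}. \]

Proof. Unbiasedness, $E(\hat\theta) = \theta$, implies

\[ \int \hat\theta(x) f(x|\theta)\,dx = \theta. \]

Assume we can differentiate wrt $\theta$ under the integral, then

\[ \int \frac{\partial}{\partial\theta}\left\{\hat\theta(x) f(x|\theta)\right\}dx = 1. \]

The estimator $\hat\theta(x)$ can't depend on $\theta$, so

\[ \int \hat\theta(x)\, \frac{\partial}{\partial\theta}\{f(x|\theta)\}\,dx = 1. \]

For any pdf $f$,

\[ \frac{\partial f}{\partial\theta} = f\, \frac{\partial}{\partial\theta}(\ln f), \]

so that now

\[ \int \hat\theta(x)\, f\, \frac{\partial}{\partial\theta}(\ln f)\,dx = 1. \]

Thus

\[ E\left\{\hat\theta(x)\, \frac{\partial}{\partial\theta}(\ln f)\right\} = 1. \]

Define random variables

\[ U = \hat\theta(x), \quad \text{and} \quad S = \frac{\partial}{\partial\theta}(\ln f). \]

Then $E(US) = 1$. We already know that the score function has expectation zero, $E(S) = 0$. Consequently

\[ \mathrm{Cov}(U, S) = E(US) - E(U)E(S) = E(US) = 1. \]

The squared correlation satisfies

\[ \{\mathrm{Corr}(U, S)\}^2 = \frac{\{\mathrm{Cov}(U, S)\}^2}{\mathrm{Var}(U)\,\mathrm{Var}(S)} \leq 1. \]

Setting $\mathrm{Cov}(U, S) = 1$ we get

\[ \mathrm{Var}(U)\,\mathrm{Var}(S) \geq 1. \]

This implies

\[ \mathrm{Var}(\hat\theta) \geq \frac{1}{I(\theta)}, \]

which is our main result. We call $\{I(\theta)\}^{-1}$ the Cramér Rao lower bound (CRLB).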
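The steps of the proof can be traced numerically in a simple case. The sketch below is an illustration under assumptions not taken from the notes: the data are summarised by $X \sim \mathrm{Binomial}(n, \theta)$, the unbiased estimator is $\hat\theta = X/n$, and $n = 10$, $\theta = 0.3$ and the helper names are hypothetical. All expectations are computed exactly by summing over the binomial pmf rather than by simulation.

```python
# Illustrative walk through the CRLB proof for X ~ Binomial(n, theta)
# with the unbiased estimator theta_hat = X / n. The model, n, theta
# and all names are assumptions made for this example only.
from math import comb

def binomial_pmf(k, n, theta):
    """P(X = k) for X ~ Binomial(n, theta)."""
    return comb(n, k) * theta**k * (1.0 - theta) ** (n - k)

def expect(g, n, theta):
    """Exact E{g(X)}: sum of g(k) weighted by the binomial pmf."""
    return sum(g(k) * binomial_pmf(k, n, theta) for k in range(n + 1))

n, theta = 10, 0.3  # hypothetical sample size and parameter value

def U(k):
    """The estimator theta_hat(x) = x / n."""
    return k / n

def S(k):
    """The score d/dtheta ln f(x|theta) of the binomial likelihood."""
    return k / theta - (n - k) / (1.0 - theta)

mean_U = expect(U, n, theta)                       # equals theta if unbiased
mean_S = expect(S, n, theta)                       # score has mean zero
cov_US = expect(lambda k: U(k) * S(k), n, theta) - mean_U * mean_S
var_U = expect(lambda k: (U(k) - mean_U) ** 2, n, theta)
var_S = expect(lambda k: S(k) ** 2, n, theta) - mean_S**2  # I(theta)
crlb = 1.0 / var_S

print("E(U)      =", mean_U)
print("E(S)      =", mean_S)
print("Cov(U, S) =", cov_US)
print("Var(U)    =", var_U)
print("CRLB      =", crlb)
```

In this particular example the estimator attains the bound: $\mathrm{Var}(X/n) = \theta(1-\theta)/n$ and $I(\theta) = n/\{\theta(1-\theta)\}$, so the last two printed values should agree up to floating-point rounding. For other unbiased estimators the variance may be strictly larger than the CRLB.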