Student Notes To Accompany MS4214: STATISTICAL INFERENCE

Outline Solutions

1. The cdf is
\[ F(t) = P(T \le t) = \int_0^t \lambda e^{-\lambda u}\,du = \lambda\left[-\frac{1}{\lambda}e^{-\lambda u}\right]_0^t = 1 - e^{-\lambda t}. \]
Next, $P(a \le T \le b) = F(b) - F(a) = e^{-\lambda a} - e^{-\lambda b}$. Assume all settlements are independent. Then $P(\text{50 in first week}) = \{F(1)\}^{50} = (1 - e^{-\lambda})^{50}$, because $T \le 1$ for these 50 settlements. Likewise, $1 < T \le 2$ for the 35 in the second week, so $P(\text{35 in second week}) = \{F(2) - F(1)\}^{35} = (e^{-\lambda} - e^{-2\lambda})^{35}$. The remaining 15 have $T > 2$, which has probability $1 - P(T \le 2) = e^{-2\lambda}$, and thus $P(\text{15 after week two}) = (e^{-2\lambda})^{15}$. The likelihood function is therefore the product
\[ L(\lambda) = (1 - e^{-\lambda})^{50}\,(e^{-\lambda} - e^{-2\lambda})^{35}\,(e^{-2\lambda})^{15}. \]
Taking logarithms (always base $e$),
\[ \ln L(\lambda) = 50\ln(1 - e^{-\lambda}) + 35\ln\{e^{-\lambda}(1 - e^{-\lambda})\} + 15\ln(e^{-2\lambda}) = 85\ln(1 - e^{-\lambda}) - (35 + 30)\lambda = 85\ln(1 - e^{-\lambda}) - 65\lambda, \]
\[ \therefore\ \frac{d}{d\lambda}\ln L(\lambda) = \frac{85e^{-\lambda}}{1 - e^{-\lambda}} - 65 = \frac{85}{e^{\lambda} - 1} - 65. \]
Equating to zero, $85 = 65(e^{\lambda} - 1)$ or $e^{\lambda} = 150/65$, so that $\hat\lambda = \ln(150/65) = 0.836$. This is indeed a maximum; e.g.
\[ \frac{d^2}{d\lambda^2}\ln L(\lambda) = -\frac{85e^{\lambda}}{(e^{\lambda} - 1)^2} < 0 \quad \forall\,\lambda. \]
Next, $1 - e^{-0.836} = 0.5666$ and $e^{-0.836} - e^{-1.672} = 0.43344 - 0.18787 = 0.2456$. Hence out of 100 invoices, 56.66, 24.56 and 18.78 would be expected to be paid, on this model, in weeks 1, 2 and later. The actual numbers were 50, 35 and 15. The prediction for the second week is a long way from what happened, balanced by smaller discrepancies in the other two periods. This does not seem very satisfactory.

2. $n_A + n_B = n \Rightarrow n_B = n - n_A$. Suppose we observe the sequence A, B, A, B, A; then
\[ L_1(n_A) = \frac{n_A}{n} \times \frac{n - n_A}{n - 1} \times \frac{n_A - 1}{n - 2} \times \frac{n - n_A - 1}{n - 3} \times \frac{n_A - 2}{n - 4}. \]
Next, suppose we observe the sequence A, A, B, B, A; then
\[ L_2(n_A) = \frac{n_A}{n} \times \frac{n_A - 1}{n - 1} \times \frac{n - n_A}{n - 2} \times \frac{n - n_A - 1}{n - 3} \times \frac{n_A - 2}{n - 4}. \]
If it is known that 3 As and 2 Bs are drawn but the exact sequence is unknown, then
\[ L_3(n_A) = P(Y = y \mid n_A) = \binom{n_A}{y}\binom{n - n_A}{5 - y} \Big/ \binom{n}{5}, \quad \text{where } y = 3. \]
This third likelihood function expands to give
\[ L_3(n_A) = 10 \times \frac{n_A(n_A - 1)(n_A - 2)(n - n_A)(n - n_A - 1)}{n(n - 1)(n - 2)(n - 3)(n - 4)}. \]
Clearly $L_1(n_A) = L_2(n_A) = L_3(n_A) \div 10$.
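The relation between the three likelihoods in Solution 2 can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the notes: the function names and the population size `n = 20` are hypothetical choices made only for illustration.

```python
from fractions import Fraction
from math import comb


def sequence_likelihood(sequence, n, n_a):
    """Probability of drawing `sequence` (a string of 'A'/'B') without
    replacement from n items, n_a of which are of type A."""
    remaining_a, remaining_b = n_a, n - n_a
    prob = Fraction(1)
    for draw_index, label in enumerate(sequence):
        remaining_total = n - draw_index
        if label == "A":
            prob *= Fraction(remaining_a, remaining_total)
            remaining_a -= 1
        else:
            prob *= Fraction(remaining_b, remaining_total)
            remaining_b -= 1
    return prob


def count_likelihood(y, draws, n, n_a):
    """Hypergeometric probability of y As among `draws` draws."""
    return Fraction(comb(n_a, y) * comb(n - n_a, draws - y), comb(n, draws))


n = 20  # hypothetical population size, chosen only for this check
for n_a in range(3, n - 1):
    l1 = sequence_likelihood("ABABA", n, n_a)
    l2 = sequence_likelihood("AABBA", n, n_a)
    l3 = count_likelihood(3, 5, n, n_a)
    # The two ordered sequences agree, and the unordered count is 10 times larger.
    assert l1 == l2 == l3 / 10
```

Using `Fraction` rather than floating point makes the equality exact, so the factor of 10 (the number of orderings of 3 As and 2 Bs) is confirmed without rounding tolerance.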
The first two likelihood functions are identical. The third likelihood function is a constant times the other two, and as only ratios of likelihood functions are meaningful, $L_3(n_A)$ carries the same information about our preferences for the parameter $n_A$ as the other functions.

3. $L(\theta) = 4^{-n}(2 + \theta)^{a}(1 - \theta)^{b+c}\theta^{d}$, so
\[ \ell(\theta) = -n\ln 4 + a\ln(2 + \theta) + (b + c)\ln(1 - \theta) + d\ln\theta. \]
Differentiating, we get
\[ S(\theta) = \frac{a}{2 + \theta} - \frac{b + c}{1 - \theta} + \frac{d}{\theta}, \]
and setting $S(\theta) = 0$ leads to the quadratic equation
\[ n\theta^2 - \{a - 2b - 2c - d\}\theta - 2d = 0, \]
of which the positive root, $\hat\theta$, satisfies the condition of maximum likelihood. If $S(\theta)$ is differentiated again with respect to $\theta$, and expected values substituted for $a$, $b$, $c$ and $d$, we obtain
\[ \mathrm{Var}(\hat\theta) \approx \{E[I(\theta)]\}^{-1} = \{\mathcal{I}(\theta)\}^{-1} = \frac{2\theta(1 - \theta)(2 + \theta)}{(1 + 2\theta)\,n}. \]
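The quadratic in Solution 3 and the variance formula can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, and the counts `a, b, c, d = 125, 18, 20, 34` are assumed example values, not data taken from these notes.

```python
from math import sqrt


def linkage_mle(a, b, c, d):
    """Positive root of n*theta^2 - (a - 2b - 2c - d)*theta - 2d = 0."""
    n = a + b + c + d
    linear = a - 2 * b - 2 * c - d
    discriminant = linear ** 2 + 8 * n * d
    return (linear + sqrt(discriminant)) / (2 * n)


def score(theta, a, b, c, d):
    """Score function S(theta) from the log-likelihood in Solution 3."""
    return a / (2 + theta) - (b + c) / (1 - theta) + d / theta


def approx_variance(theta, n):
    """Inverse expected information: 2t(1-t)(2+t) / ((1+2t) n)."""
    return 2 * theta * (1 - theta) * (2 + theta) / ((1 + 2 * theta) * n)


# Assumed example counts (hypothetical, for illustration only).
a, b, c, d = 125, 18, 20, 34
theta_hat = linkage_mle(a, b, c, d)
print("theta_hat =", theta_hat)
print("score at theta_hat =", score(theta_hat, a, b, c, d))
print("approx variance =", approx_variance(theta_hat, a + b + c + d))
```

Evaluating the score at the root is a quick way to confirm that the quadratic was derived from $S(\theta) = 0$ without a sign error.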
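4.
\[ E(X^2) = \int_0^\infty x^2 f(x)\,dx = \sqrt{\frac{2}{\pi\theta}}\int_0^\infty x^2 e^{-x^2/2\theta}\,dx = \sqrt{\frac{2}{\pi\theta}}\int_0^\infty x\left(x e^{-x^2/2\theta}\right)dx \]
\[ = \sqrt{\frac{2}{\pi\theta}}\left\{\left[x\left(-\theta e^{-x^2/2\theta}\right)\right]_0^\infty + \int_0^\infty \theta e^{-x^2/2\theta}\,dx\right\} = \sqrt{\frac{2}{\pi\theta}}\,[0 - 0] + \theta\int_0^\infty f(x)\,dx = \theta. \]
The log-likelihood is
\[ \ell(\theta) = \ln\left\{\left(\frac{2}{\pi\theta}\right)^{n/2}\exp\left(-\frac{\sum X_i^2}{2\theta}\right)\right\} = \frac{n}{2}\ln\left(\frac{2}{\pi\theta}\right) - \frac{\sum X_i^2}{2\theta}, \]
so
\[ S(\theta) = -\frac{n}{2\theta} + \frac{1}{2\theta^2}\sum X_i^2. \]
Setting this equal to zero gives $\hat\theta = \frac{1}{n}\sum X_i^2$. It may be verified (by considering the second derivative) that this is indeed a maximum, and so is the MLE of $\theta$. Since $E(X^2) = \theta$, we immediately have $E(\hat\theta) = \theta$, i.e. $\hat\theta$ is unbiased for $\theta$. Next,
\[ I(\theta) = -\frac{n}{2\theta^2} + \frac{n\theta}{\theta^3} = \frac{n}{2\theta^2}, \]
so the Cramér-Rao lower bound is $2\theta^2/n$. Now, $\mathrm{Var}(\hat\theta) = 2\theta^2/n$ (using the hint), and so the variance of $\hat\theta$ attains the bound. $\phi = \sqrt{\theta}$, so the MLE of $\phi$ is $\hat\phi = \sqrt{\text{MLE of } \theta} = \sqrt{\sum X_i^2/n}$. Because $\phi$ is a non-linear transformation of $\theta$, and $\hat\theta$ is unbiased for $\theta$, $\hat\phi$ cannot be unbiased for $\phi$.

5. $L(\theta) = \theta^{\sum X_i}e^{-n\theta} \big/ \prod X_i!$, giving $\ell(\theta) = (\sum X_i)\ln\theta - n\theta - \ln(\prod X_i!)$ and
\[ S(\theta) = \frac{\sum X_i}{\theta} - n, \]
so the MLE of $\theta$ is $\hat\theta = \frac{1}{n}\sum X_i = \bar X$. Also $I(\theta) = \sum X_i/\theta^2 > 0\ \forall\,\theta$, so $\hat\theta$ is indeed a maximum. By the "invariance property", the MLE of $\lambda = e^{-\theta}$ is $\hat\lambda = e^{-\hat\theta} = e^{-\bar X}$.

The delta method gives that the variance of $g(\hat\theta)$ is approximated by $\left(\frac{dg}{d\theta}\right)^2\mathrm{Var}(\hat\theta)$, evaluated at the mean of the distribution, which here is simply $\theta$. So we need to obtain $\frac{\theta}{n}\left(\frac{dg}{d\theta}\right)^2$ with $g(\theta) = e^{-\theta}$. This immediately gives $\frac{dg}{d\theta} = -e^{-\theta}$, so the approximate variance is
\[ \frac{\theta}{n}\left(-e^{-\theta}\right)^2 = \frac{1}{n}\theta e^{-2\theta}. \]
The number of zero observations is binomially distributed with $p = e^{-\theta} = \lambda$, i.e. $\mathrm{Bin}(n, \lambda)$. Thus $\tilde\lambda$, the proportion of zeros, has expected value $\lambda$, i.e. it is unbiased. Also we have
\[ \mathrm{Var}(\tilde\lambda) = \frac{1}{n}\lambda(1 - \lambda) = \frac{1}{n}e^{-\theta}(1 - e^{-\theta}). \]
Using the approximate variance from part (ii), the efficiency of $\tilde\lambda$ relative to $\hat\lambda$ is given approximately by
\[ \frac{\theta e^{-2\theta}/n}{e^{-\theta}(1 - e^{-\theta})/n} = \frac{\theta}{e^{\theta} - 1}. \]
If $\theta$ is small, the efficiency is near (but less than) unity; as $\theta$ increases, the efficiency decreases; as $\theta$ becomes large, the efficiency tends to 0.

6.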
$E(\bar X) = E[(X_1 + X_2 + \cdots + X_n)/n] = [E(X_1) + E(X_2) + \cdots + E(X_n)]/n = n\mu/n = \mu$.

The distribution of $X_i - \mu$ is symmetric, implying $E[(X - \mu)_{(p+1)}] = E[(\mu - X)_{(p+1)}]$. Combining this with the identity $X_{(p+1)} - \mu = (X - \mu)_{(p+1)} = -\{(\mu - X)_{(p+1)}\}$ yields
\[ E[X_{(p+1)} - \mu] = E[(X - \mu)_{(p+1)}] = E[(\mu - X)_{(p+1)}] = -E[(X - \mu)_{(p+1)}], \]
so this expectation equals its own negative and is therefore 0, and hence $E(\tilde\mu) = E[X_{(p+1)}] = \mu$. Note that the argument applies to any distribution which is symmetric around $\mu$.

For any positive random variable $Y$ the identity $\mathrm{Var}(\sqrt{Y}) = E(Y) - [E(\sqrt{Y})]^2$ holds, so for $Y = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar X)^2 = \mathrm{SSD}/n$, clearly $\mathrm{Var}(\sqrt{Y}) > 0$, and it follows that
\[ E(\hat\sigma) = E\left(\sqrt{\mathrm{SSD}/n}\right) < \sqrt{E(\mathrm{SSD})/n} = \sigma\sqrt{(n - 1)/n} < \sigma, \]
hence $\hat\sigma$ is not unbiased. For $n = 3$, $Z = \mathrm{SSD}/\sigma^2$ follows a $\chi^2$-distribution with 2 degrees of freedom,
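The three claims in Solution 6 (unbiased sample mean, unbiased sample median under symmetry, and downward-biased $\hat\sigma$) can be illustrated by simulation. The sketch below is an assumption-laden illustration rather than part of the notes: it assumes normally distributed data, takes $\hat\sigma = \sqrt{\mathrm{SSD}/n}$, and uses hypothetical values for `mu`, `sigma`, the seed and the number of replications. The reference value $\sigma\sqrt{\pi/6}$ for $n = 3$ comes from the standard mean of the square root of a $\chi^2_2$ variable, not from the text above.

```python
import random
from math import pi, sqrt
from statistics import fmean, median


def simulate_estimators(n=3, mu=5.0, sigma=2.0, reps=20000, seed=1):
    """Average the sample mean, sample median and sigma-hat = sqrt(SSD/n)
    over `reps` simulated normal samples of size n (n odd for the median)."""
    rng = random.Random(seed)
    means, medians, sigma_hats = [], [], []
    for _ in range(reps):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(xs)
        ssd = sum((x - xbar) ** 2 for x in xs)
        means.append(xbar)
        medians.append(median(xs))
        sigma_hats.append(sqrt(ssd / n))
    return fmean(means), fmean(medians), fmean(sigma_hats)


avg_mean, avg_median, avg_sigma_hat = simulate_estimators()
print("average sample mean:  ", avg_mean)
print("average sample median:", avg_median)
print("average sigma-hat:    ", avg_sigma_hat)
# Reference value for n = 3 under normality (assumed standard result).
print("sigma * sqrt(pi / 6): ", 2.0 * sqrt(pi / 6))
```

With these settings the first two averages should sit close to `mu`, while the third should sit noticeably below `sigma`, in line with the inequality $E(\hat\sigma) < \sigma$.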