
Student Notes To Accompany MS4214: STATISTICAL INFERENCE


2.8 Optimality Properties of the MLE

Suppose that an experiment consists of measuring random variables x1, x2, . . . , xn which are iid with probability distribution depending on a parameter θ. Let θ̂ be the MLE of θ. Define

    W1 = √(E[I(θ)]) (θ̂ − θ)
    W2 = √(I(θ)) (θ̂ − θ)
    W3 = √(E[I(θ̂)]) (θ̂ − θ)
    W4 = √(I(θ̂)) (θ̂ − θ).

Then W1, W2, W3, and W4 are all random variables and, as n → ∞, the probabilistic behaviour of each is well approximated by that of a N(0, 1) random variable. Since E[W1] ≈ 0, we have E[θ̂] ≈ θ, and so θ̂ is approximately unbiased. Also, Var[W1] ≈ 1 implies that Var[θ̂] ≈ (E[I(θ)])⁻¹, and so θ̂ is approximately efficient.
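As an illustrative sketch (an assumed example, not from the notes): for iid Bernoulli(θ) trials the MLE is θ̂ = x̄ and E[I(θ)] = n/(θ(1 − θ)), so W1 = √(E[I(θ)]) (θ̂ − θ) can be simulated directly and compared against N(0, 1).

```python
import numpy as np

# Hypothetical illustration: Bernoulli(theta) sample, MLE theta_hat = x_bar,
# expected information E[I(theta)] = n / (theta * (1 - theta)).  Simulate many
# replications of W1 and check its mean and standard deviation against N(0, 1).
rng = np.random.default_rng(0)
theta, n, reps = 0.3, 200, 20_000

x = rng.binomial(1, theta, size=(reps, n))
theta_hat = x.mean(axis=1)                                # MLE per replication
w1 = np.sqrt(n / (theta * (1 - theta))) * (theta_hat - theta)

print(round(w1.mean(), 2), round(w1.std(), 2))            # near 0 and 1
```

With n = 200 the normal approximation is already close; the sample mean and standard deviation of W1 come out near 0 and 1.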

Let the data X have probability distribution g(X; θ), where θ = (θ1, θ2, . . . , θm) is a vector of m unknown parameters. Let I(θ) be the m × m information matrix as defined above and let E[I(θ)] be the m × m matrix obtained by replacing the elements of I(θ) by their expected values. Let θ̂ be the MLE of θ. Let CRLBr be the rth diagonal element of [E[I(θ)]]⁻¹. For r = 1, 2, . . . , m, define W1r = (θ̂r − θr)/√CRLBr. Then, as n → ∞, W1r behaves like a standard normal random variable.

Suppose we define W2r by replacing CRLBr by the rth diagonal element of the matrix [I(θ)]⁻¹, W3r by replacing CRLBr by the rth diagonal element of the matrix [E[I(θ̂)]]⁻¹, and W4r by replacing CRLBr by the rth diagonal element of the matrix [I(θ̂)]⁻¹. Then it can be shown that, as n → ∞, W2r, W3r, and W4r all behave like standard normal random variables.
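A sketch of the multiparameter case (an assumed example, not from the notes): for Xi iid N(µ, v) with v = σ², the expected information matrix is E[I(θ)] = n · diag(1/v, 1/(2v²)). Plugging the MLE into its inverse gives the diagonal elements used by the W3r/W4r-style statistics.

```python
import numpy as np

# Hypothetical two-parameter example: theta = (mu, v) for N(mu, v) data.
# E[I(theta)] = n * diag(1/v, 1/(2 v^2)); evaluate at the MLE and form
# W_r = (theta_hat_r - theta_r) / sqrt(r-th diagonal of the inverse).
rng = np.random.default_rng(1)
mu, v, n = 5.0, 4.0, 500

x = rng.normal(mu, np.sqrt(v), size=n)
mu_hat = x.mean()
v_hat = ((x - mu_hat) ** 2).mean()              # MLE of v = sigma^2 (divisor n)

info_hat = n * np.diag([1 / v_hat, 1 / (2 * v_hat**2)])   # E[I(.)] at the MLE
crlb = np.diag(np.linalg.inv(info_hat))         # diagonal of [E[I(theta_hat)]]^{-1}

w = (np.array([mu_hat, v_hat]) - np.array([mu, v])) / np.sqrt(crlb)
print(w)   # each entry behaves roughly like one standard normal draw
```

Note that the first diagonal element of the inverse reduces to v̂/n, the familiar variance of the sample mean with the MLE plugged in.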

2.9 Data Reduction<br />

Definition 2.11 (Sufficiency). Consider a statistic T = t(X) that summarises the data so that no information about θ is lost. Then we call t(X) a sufficient statistic. □

Example 2.12. T = t(X) = X̄ is sufficient for µ when Xi ∼ iid N(µ, σ²). □

To better understand the motivation behind the concept of sufficiency, consider three independent Bernoulli trials with success probability θ = P(X = 1).
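The point of the three-trial example can be checked numerically (a sketch under the stated setup): with T = x1 + x2 + x3, the conditional probability of any particular outcome x given T = t is 1/C(3, t), which does not involve θ at all — exactly what sufficiency requires.

```python
from itertools import product
from math import comb

# For three independent Bernoulli(theta) trials, verify that
# P(X = x | T = t) = 1 / C(3, t), free of theta, for every outcome x.
def conditional_prob(x, theta):
    t = sum(x)
    joint = theta**t * (1 - theta) ** (3 - t)              # P(X = x)
    p_t = comb(3, t) * theta**t * (1 - theta) ** (3 - t)   # P(T = t)
    return joint / p_t

for x in product([0, 1], repeat=3):
    p1 = conditional_prob(x, 0.2)
    p2 = conditional_prob(x, 0.7)
    assert abs(p1 - p2) < 1e-12                 # same answer for every theta
    assert abs(p1 - 1 / comb(3, sum(x))) < 1e-12
print("conditional distribution is free of theta")
```

Because the conditional distribution of the data given T carries no information about θ, nothing is lost by reducing the three observations to their sum.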

