ST3239: Survey Methodology - The Department of Statistics and ...


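Theorem 2.2.2 $E(\bar{y}) = \mu$, $\mathrm{Var}(\bar{y}) = \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right)$.

Proof. Note $\bar{y} = \frac{1}{n}(y_1 + \cdots + y_n)$. So
\[
E(\bar{y}) = \frac{1}{n}(Ey_1 + \cdots + Ey_n) = \frac{1}{n}(n\mu) = \mu.
\]
Now
\[
\begin{aligned}
\mathrm{Var}(\bar{y})
&= \frac{1}{n^2}\,\mathrm{Cov}\Big(\sum_{i=1}^{n} y_i,\ \sum_{j=1}^{n} y_j\Big)
 = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n}\mathrm{Cov}(y_i, y_j) \\
&= \frac{1}{n^2}\Big(\sum_{i\neq j}\mathrm{Cov}(y_i, y_j) + \sum_{i=j}\mathrm{Cov}(y_i, y_j)\Big) \\
&= \frac{1}{n^2}\Big(\sum_{i\neq j}\Big(-\frac{\sigma^2}{N-1}\Big) + \sum_{i=1}^{n}\mathrm{Var}(y_i)\Big) \\
&= \frac{1}{n^2}\Big(n(n-1)\Big(-\frac{\sigma^2}{N-1}\Big) + n\sigma^2\Big) \\
&= \frac{\sigma^2}{n}\Big((n-1)\Big(-\frac{1}{N-1}\Big) + 1\Big) \\
&= \frac{\sigma^2}{n}\Big(\frac{N-n}{N-1}\Big).
\end{aligned}
\]

Remark: From Theorem 2.2.2, we see that $\bar{y}$ is an unbiased estimator for $\mu$. Also, as $n$ gets large (but $n \le N$), $\mathrm{Var}(\bar{y})$ tends to 0. This implies that $\bar{y}$ will be a more accurate estimator for $\mu$ as $n$ gets larger (but less than $N$). In particular, when $n = N$, we have a census and $\mathrm{Var}(\bar{y}) = 0$.

Remark: In our previous statistics course, we usually sample $\{y_1, y_2, \ldots, y_n\}$ from the population with replacement. Therefore, $\{y_1, y_2, \ldots, y_n\}$ are independent and identically distributed (i.i.d.). And recall we have results like
\[
E_{iid}(\bar{y}) = \mu, \qquad \mathrm{Var}_{iid}(\bar{y}) = \frac{\sigma^2}{n}.
\]
Notice that $\mathrm{Var}_{iid}(\bar{y})$ is different from $\mathrm{Var}(\bar{y})$ in Theorem 2.2.2. In fact, for $n > 1$,
\[
\mathrm{Var}(\bar{y}) = \frac{\sigma^2}{n}\Big(\frac{N-n}{N-1}\Big) < \frac{\sigma^2}{n} = \mathrm{Var}_{iid}(\bar{y}).
\]
Thus, for the same sample size $n$, sampling without replacement produces a less variable estimator of $\mu$. Why?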
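Theorem 2.2.2 can be checked numerically on a small population by listing every possible sample. The sketch below is only an illustration: the population values, the sample size, and the helper name `srs_mean_moments` are assumptions that do not appear in the notes. It uses exact fractions so the comparison with the formula is not blurred by rounding, and it takes $\sigma^2$ with divisor $N$, which is the convention consistent with $\mathrm{Cov}(y_i, y_j) = -\sigma^2/(N-1)$ in the proof.

```python
from fractions import Fraction
from itertools import combinations


def srs_mean_moments(population, n):
    """Exact E(ybar) and Var(ybar) over all equally likely samples of size n
    drawn without replacement (hypothetical helper, not from the notes)."""
    means = [Fraction(sum(s), n) for s in combinations(population, n)]
    e = sum(means) / len(means)
    v = sum((m - e) ** 2 for m in means) / len(means)
    return e, v


# Hypothetical population values, chosen only for illustration.
population = [1, 2, 4, 7, 9]
N, n = len(population), 3

mu = Fraction(sum(population), N)
# Population variance with divisor N (assumed convention, see lead-in).
sigma2 = sum((u - mu) ** 2 for u in population) / N

e_ybar, var_ybar = srs_mean_moments(population, n)
var_formula = sigma2 / n * Fraction(N - n, N - 1)  # Theorem 2.2.2
var_iid = sigma2 / n                               # with-replacement result

print(e_ybar == mu)             # unbiasedness
print(var_ybar == var_formula)  # variance formula
print(var_ybar < var_iid)       # smaller than the i.i.d. variance when n > 1
```

All three comparisons print `True` for this population. Setting `n = N` leaves a single possible sample, so the enumerated variance is 0, which matches the census remark.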
Summary

1. How to draw a simple random sample? (purpose, method) Simple random sampling is the basic survey methodology.
2. After getting an s.r.s., how to describe the population, or how to analyze the data? Estimation of the population mean. (Sample mean.)

Estimation of $\sigma^2$ and $\mathrm{Var}(\bar{y})$

The population variance $\sigma^2$ is usually unknown. Now define
\[
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2 = \frac{1}{n-1}\Big(\sum_{i=1}^{n} y_i^2 - n\,\bar{y}^2\Big).
\]

Example. When a few data points are repeated in a data set, the results are often arrayed in a frequency table. For example, a quiz given to 25 students was graded on a 4-point scale 0, 1, 2, 3, with 3 being a perfect score. Here are the results:

Score (X)    Frequency (F)    Proportion (P)
3            16               0.64
2            4                0.16
1            2                0.08
0            3                0.12

(a) Calculate the average score by using frequencies.
(b) Calculate the average score by using proportions.
(c) Calculate the standard deviation.

Solution. If the above 25 students constitute a random sample, then
\[
s^2 = \frac{n}{n-1} \times 1.0976 = 1.1433.
\]
Here 1.0976 is the variance of the 25 scores computed with divisor $n = 25$ about their mean 2.32.

Let us look at some properties of $s^2$. Is it unbiased?

Theorem 2.2.3 $E(s^2) = \frac{N}{N-1}\,\sigma^2$.
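The arithmetic of the quiz example and the claim of Theorem 2.2.3 can both be reproduced in a few lines. This is a sketch under stated assumptions, not the worked solution from the notes. The first part uses only the table from the example. The second part checks $E(s^2) = \frac{N}{N-1}\sigma^2$ by enumerating every sample from a small hypothetical population; the population values and the helper name `expected_s2` are made up for illustration, and $\sigma^2$ is taken with divisor $N$.

```python
from fractions import Fraction
from itertools import combinations
from math import sqrt

# --- Quiz example: scores and frequencies from the table ---
scores = [3, 2, 1, 0]
freqs = [16, 4, 2, 3]
n = sum(freqs)  # 25 students

# (a) average score using frequencies
mean_freq = Fraction(sum(x * f for x, f in zip(scores, freqs)), n)

# (b) average score using proportions
props = [Fraction(f, n) for f in freqs]
mean_prop = sum(x * p for x, p in zip(scores, props))

# (c) variance with divisor n, its square root, and s^2 with divisor n - 1
var_n = sum(p * (x - mean_prop) ** 2 for x, p in zip(scores, props))
sd_n = sqrt(var_n)
s2 = var_n * Fraction(n, n - 1)

print(float(mean_freq), float(mean_prop))  # 2.32 2.32
print(float(var_n), round(float(s2), 4))   # 1.0976 1.1433


# --- Theorem 2.2.3 checked by enumeration ---
def expected_s2(population, m):
    """Exact E(s^2) over all equally likely samples of size m drawn
    without replacement (hypothetical helper, not from the notes)."""
    values = []
    for sample in combinations(population, m):
        ybar = Fraction(sum(sample), m)
        values.append(sum((y - ybar) ** 2 for y in sample) / (m - 1))
    return sum(values) / len(values)


population = [1, 2, 4, 7, 9]  # hypothetical values for illustration
N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((u - mu) ** 2 for u in population) / N  # divisor N (assumed)

print(expected_s2(population, 3) == Fraction(N, N - 1) * sigma2)  # True
```

Parts (a) and (b) necessarily agree, since each proportion is the frequency divided by 25. The enumeration shows that $s^2$ overshoots $\sigma^2$ on average by the factor $N/(N-1)$, so it is not exactly unbiased for $\sigma^2$ as defined with divisor $N$.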
