Variance Estimation for the General Regression Estimator

design variance, $E_M E_\pi\left\{\left[\hat{Y}_G - Y - E_M E_\pi(\hat{Y}_G - Y)\right]^2\right\}$. Rather we seek estimators that are useful for both $\mathrm{var}_M(\hat{Y}_G - Y)$ and $\mathrm{var}_\pi(\hat{Y}_G)$. The arguments given here are largely heuristic ones used to motivate the forms of the variance estimators. Additional, formal conditions such as those found in Royall and Cumberland (1978) or Yung and Rao (2000) are needed for model-based and design-based consistency and approximate unbiasedness.

First, consider estimation of the approximate model-variance given in (1.5). In the following development, we assume that, as $N$ and $n$ become large, (i) $N \max_i(\pi_i) = O(n)$ and (ii) $\mathbf{A}_{\pi s}/N$ converges to a matrix of constants, $\mathbf{A}_o$. A residual associated with sample unit $i$ is $r_i = Y_i - \hat{Y}_i$ where $\hat{Y}_i = \mathbf{x}_i'\hat{\mathbf{B}}$. The vector of predicted values for the sample units can be written as

$$\hat{\mathbf{Y}}_s = \mathbf{H}_s \mathbf{Y}_s \tag{2.1}$$

where $\mathbf{H}_s = \mathbf{X}_s \mathbf{A}_{\pi s}^{-1} \mathbf{X}_s' \mathbf{V}_{ss}^{-1} \boldsymbol{\Pi}_s^{-1}$. The predicted value for an individual unit is $\hat{Y}_i = \sum_{j \in s} h_{ij} Y_j$ where $h_{ij} = \mathbf{x}_i' \mathbf{A}_{\pi s}^{-1} \mathbf{x}_j / (v_j \pi_j)$ is the $(ij)$th element of $\mathbf{H}_s$. The matrix $\mathbf{H}_s$ is the analog to the usual hat matrix (Belsley, Kuh, and Welsch 1980) from standard regression analysis. The diagonal elements of the hat matrix are known as leverages and are a measure of the effect that a unit has on its own predicted value. Notice that the inverses of the selection probabilities are involved in (2.1), although these would have no role in purely model-based analysis.

The following lemma, which is a variation of some results in Lemma 5.3.1 of Valliant, Dorfman, and Royall (2000), gives some properties of the leverages and the hat matrix.
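As a numerical illustration of (2.1), the following minimal Python sketch builds the hat matrix for a small artificial sample and checks that it reproduces the fitted values from a weighted least squares fit with weights $1/(v_i \pi_i)$. It assumes that $\mathbf{A}_{\pi s} = \mathbf{X}_s' \mathbf{V}_{ss}^{-1} \boldsymbol{\Pi}_s^{-1} \mathbf{X}_s$, which is defined earlier in the paper and not in this excerpt. The function name `hat_matrix`, the simulated data, and the choices of $v_i$ and $\pi_i$ are hypothetical and serve only as an example.

```python
import numpy as np

def hat_matrix(X, v, pi):
    """Sketch of H_s = X_s A^{-1} X_s' V^{-1} Pi^{-1}.

    Assumes A = X_s' V^{-1} Pi^{-1} X_s (an assumption, see the lead-in).
    X  : (n, p) matrix of auxiliaries for the sample units
    v  : (n,) working model-variance factors v_i
    pi : (n,) selection probabilities pi_i
    """
    w = 1.0 / (v * pi)                  # diagonal of V^{-1} Pi^{-1}
    A = X.T @ (w[:, None] * X)          # assumed form of A_{pi s}
    # Scaling column j by w_j gives h_ij = x_i' A^{-1} x_j / (v_j pi_j)
    return (X @ np.linalg.solve(A, X.T)) * w[None, :]

# Artificial sample; all values here are hypothetical
rng = np.random.default_rng(0)
n = 40
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])    # intercept plus one auxiliary
v = x                                   # working variance proportional to x
pi = rng.uniform(0.05, 0.5, size=n)     # unequal selection probabilities
Y = 2.0 + 3.0 * x + rng.normal(scale=np.sqrt(x))

H = hat_matrix(X, v, pi)
leverages = np.diag(H)

# Independent check: weighted least squares with weights 1 / (v_i pi_i)
sw = np.sqrt(1.0 / (v * pi))
B_hat, *_ = np.linalg.lstsq(sw[:, None] * X, sw * Y, rcond=None)
fitted_wls = X @ B_hat

print("max |H Y - X B_hat| :", np.max(np.abs(H @ Y - fitted_wls)))
print("max |H H - H|       :", np.max(np.abs(H @ H - H)))
print("leverage range      :", leverages.min(), leverages.max())
```

In this sketch the matrix is generally not symmetric, because only its columns are scaled by $1/(v_j \pi_j)$. The checks therefore use the product of the matrix with itself and the diagonal elements directly, and do not rely on properties of symmetric projection matrices.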
Lemma 1. Assume that (i) and (ii) hold. For $\mathbf{H}_s = \mathbf{X}_s \mathbf{A}_{\pi s}^{-1} \mathbf{X}_s' \mathbf{V}_{ss}^{-1} \boldsymbol{\Pi}_s^{-1}$ the following properties hold for all $i \in s$:

(a) $h_{ij} = O(n^{-1})$
(b) $\mathbf{H}_s$ is idempotent.
(c) $0 \le h_{ii} \le 1$.

Proof: Since $h_{ij} = \mathbf{x}_i' \mathbf{A}_{\pi s}^{-1} \mathbf{x}_j / (v_j \pi_j)$, conditions (i) and (ii) imply that $h_{ij} = O(n^{-1})$. Part (b) follows from direct multiplication, using the definition of $\mathbf{H}_s$. To prove (c) note that $h_{ii} \ge 0$ since it is a quadratic form. Part (b) implies that $h_{ii} = h_{ii}^2 + \sum_{j \ne i} h_{ij} h_{ji}$, which can hold only if $h_{ii} \le 1$.

Next, we write the residual as $r_i = Y_i (1 - h_{ii}) - \sum_{j \in s(i)} h_{ij} Y_j$ where $s(i)$ is the sample excluding unit $i$. Since $E_M(r_i) = 0$, we have $E_M(r_i^2) = \mathrm{var}_M(r_i)$ and

$$E_M(r_i^2) = \psi_i (1 - h_{ii})^2 + \sum_{j \in s(i)} h_{ij}^2 \psi_j \tag{2.2}$$

under model (1.4). Using Lemma 1(a), we have $h_{ii} = o(1)$, $h_{ij} = o(1)$, and consequently, $E_M(r_i^2) \cong \psi_i$. Thus, in large samples, $r_i^2$ is an approximately unbiased estimator of the correct model-variance even though the variance specification in model (1.1) was incorrect. As a result,