First Draft of the paper - University of Toronto

to a quantity different from $\gamma_2 = 0$ as the sample size increases, with the p-value of the standard test going to zero and the Type I error rate going to one.

1.1 Almost sure disaster

Everyone knows that measurement error makes the ordinary least squares estimators asymptotically biased, and the case of two independent variables has been thoroughly studied, most notably by Cochran (1968). The following results are included for completeness, and also because they apply to Model (2), which is a bit more general than usual. First, we obtain a reward for confronting the messy formula for $\widehat{\beta}_2$ in non-matrix form.

$$
\widehat{\beta}_2 = \frac{
\begin{aligned}
&\left(\sum X_{i,1}\right)^2 \sum X_{i,2}Y_i
 + n\left(\sum X_{i,1}X_{i,2} \sum X_{i,1}Y_i - \sum X_{i,1}^2 \sum X_{i,2}Y_i\right) \\
&\quad + \sum X_{i,1}^2 \sum X_{i,2} \sum Y_i
 - \sum X_{i,1}\left(\sum X_{i,1}Y_i \sum X_{i,2} + \sum X_{i,1}X_{i,2} \sum Y_i\right)
\end{aligned}
}{
\begin{aligned}
&n\left(\sum X_{i,1}X_{i,2}\right)^2 - 2\sum X_{i,1} \sum X_{i,1}X_{i,2} \sum X_{i,2}
 + \sum X_{i,1}^2 \left(\sum X_{i,2}\right)^2 \\
&\quad + \left(\sum X_{i,1}\right)^2 \sum X_{i,2}^2 - n \sum X_{i,1}^2 \sum X_{i,2}^2
\end{aligned}
}
$$

Dividing numerator and denominator by $n^3$ and letting $n$ tend to infinity, the Strong Law of Large Numbers gives us almost sure convergence of $\widehat{\beta}_2$ to

$$
\frac{
\begin{aligned}
&(E[X_1])^2 E[X_2Y] + \left(E[X_1X_2]E[X_1Y] - E[X_1^2]E[X_2Y]\right) \\
&\quad + E[X_1^2]E[X_2]E[Y] - E[X_1]\left(E[X_1Y]E[X_2] + E[X_1X_2]E[Y]\right)
\end{aligned}
}{
\begin{aligned}
&(E[X_1X_2])^2 - 2E[X_1]E[X_1X_2]E[X_2] + E[X_1^2](E[X_2])^2 \\
&\quad + (E[X_1])^2 E[X_2^2] - E[X_1^2]E[X_2^2]
\end{aligned}
}.
$$

Our focus is upon Type I error for the present, so we examine the case where $H_0: \gamma_2 = 0$ is true. Substituting for the moments in terms of the parameters of Model (2) with $\gamma_2 = 0$ and simplifying, we find that as $n$ tends to infinity,

$$
\widehat{\beta}_2 \stackrel{a.s.}{\rightarrow} \frac{\gamma_1(\phi_{12}\theta_{11} - \phi_{11}\theta_{12})}{(\phi_{11} + \theta_{11})(\phi_{22} + \theta_{22}) - (\phi_{12} + \theta_{12})^2} \tag{5}
$$

Expression (5) is the asymptotic bias of $\widehat{\beta}_2$ as an estimate of the true regression parameter $\gamma_2$, in the case where $\gamma_2 = 0$.
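Expression (5) can be checked numerically. The sketch below is only an illustration: it assumes Model (2) takes the form $X_j = \nu_j + \xi_j + \delta_j$ and $Y = \alpha + \gamma_1\xi_1 + \gamma_2\xi_2 + \epsilon$ with mutually independent normal $\xi$, $\delta$ and $\epsilon$, and the parameter values, the function names `asymptotic_bias` and `simulate_beta2_hat`, and the choice of zero means with $\alpha = 1$ are all hypothetical. It evaluates the right-hand side of (5) and compares it with the ordinary least squares estimate from one large simulated sample with $\gamma_2 = 0$.

```python
import numpy as np


def asymptotic_bias(gamma1, Phi, Theta):
    """Right-hand side of expression (5): limit of the OLS beta_2 estimate when gamma_2 = 0."""
    Phi = np.asarray(Phi, dtype=float)
    Theta = np.asarray(Theta, dtype=float)
    numerator = gamma1 * (Phi[0, 1] * Theta[0, 0] - Phi[0, 0] * Theta[0, 1])
    # The denominator of (5) is the determinant of Cov(X1, X2) = Phi + Theta.
    denominator = np.linalg.det(Phi + Theta)
    return numerator / denominator


def simulate_beta2_hat(n, gamma1, Phi, Theta, rng):
    """Draw one sample of size n (assumed normal model, gamma_2 = 0) and return the OLS beta_2 estimate."""
    xi = rng.multivariate_normal([0.0, 0.0], Phi, size=n)       # latent variables
    delta = rng.multivariate_normal([0.0, 0.0], Theta, size=n)  # measurement errors
    eps = rng.normal(0.0, 1.0, size=n)                          # regression error
    x = xi + delta                      # observed variables (nu_1 = nu_2 = 0, assumed)
    y = 1.0 + gamma1 * xi[:, 0] + eps   # alpha = 1 (assumed); Y does not depend on xi_2
    design = np.column_stack([np.ones(n), x])
    beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta_hat[2]


# Hypothetical parameter values, chosen only for illustration.
GAMMA1 = 1.0
PHI = np.array([[1.0, 0.5], [0.5, 1.0]])    # covariance matrix of the latent variables
THETA = np.array([[0.5, 0.0], [0.0, 0.5]])  # covariance matrix of the measurement errors

limit = asymptotic_bias(GAMMA1, PHI, THETA)
estimate = simulate_beta2_hat(200_000, GAMMA1, PHI, THETA, np.random.default_rng(0))
print(f"limit from (5): {limit:.4f}   OLS estimate at n = 200000: {estimate:.4f}")
```

With a sample this large the two printed numbers should be close; changing `PHI` and `THETA` shows how the limit responds to each parameter.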
Notice that expression (5) does not depend upon the intercept $\alpha$, the measurement bias terms $\nu_1$ and $\nu_2$, or $\kappa_1$ and $\kappa_2$, the expected values of the latent independent variables. Clearly, the bias is zero only if $\gamma_1 = 0$ (the dependent variable is unrelated to $\xi_1$) or if $\phi_{12}\theta_{11} = \phi_{11}\theta_{12}$. Notice the parallel roles played by $\phi_{12}$, the covariance between the latent "true" independent variables, and $\theta_{12}$, the covariance between error terms. If they have opposite signs they pull in the same direction, but if they have the same sign they can partially or even completely offset one another. The effect of $\phi_{12}$ is augmented by the variance of the error in measuring $\xi_1$, while the effect of $\theta_{12}$ is augmented by the variance of $\xi_1$ itself.

The parameter $\theta_{11}$, the variance of the error term $\delta_1$, represents the amount of noise in the independent variable for which one is trying to control, while $\theta_{22}$ is the amount of noise in the independent variable one is trying to test. Clearly, $\theta_{11}$ is a greater potential problem, because $\theta_{22}$ appears only in the denominator; measurement error in the variable for which one is testing actually decreases the asymptotic bias, in this case where $\gamma_2 = 0$. Incidentally, the denominator of (5) is the determinant of the covariance matrix of $X_1$ and $X_2$; it will be positive provided that at least one of $\Phi$ and $\Theta$ is positive definite.

All these details aside, the main point is that when $Y$ is conditionally independent of $\xi_2$, the estimator $\widehat{\beta}_2$ converges to a quantity that is not zero in general. Now, $\widehat{\beta}_2$ is the numerator of the t-statistic commonly used to test $H_0: \beta_2 = 0$ as a substitute for the real null hypothesis $H_0: \gamma_2 = 0$. The denominator, the estimated standard deviation of $\widehat{\beta}_2$, may be written as

$$
S_{\widehat{\beta}_2} = \frac{W_n}{\sqrt{n}}.
$$

Using the same approach that led us to (5), it turns out that $W_n$ converges almost surely to a finite constant, again provided that at least one of the covariance matrices $\Phi$ and $\Theta$ is positive definite. Consequently, the absolute value of the t-statistic blows up to infinity, and the associated p-value converges almost surely to zero.
That is, we almost surely commit a Type I error.

1.2 A simulation study of Type I error inflation

In Section 1.1, we see that if the independent variable for which one is trying to "control" is measured with error but we ignore it and use mainstream regression methods, we will commit a Type I error virtually always, if the sample size is large enough. But how large is large enough, and under what circumstances? How much might Type I error be inflated in practice? To address these questions, we conducted a large-scale simulation study in which we simulated data sets from Model (2) using various sample sizes, probability distributions and parameter values.
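The flavour of such a study can be conveyed by a small Monte Carlo sketch. This is not the simulation design used in the paper: it assumes normal distributions throughout, a single hypothetical parameter configuration, zero intercept and means, 1000 replications per sample size, and the large-sample critical value 1.96 in place of the exact t quantile. The function names `rejects_h0` and `type_one_error_rate` are likewise hypothetical. Each replication generates data with $\gamma_2 = 0$, fits the naive regression of $Y$ on $X_1$ and $X_2$, and records whether the usual t-test of $H_0: \beta_2 = 0$ rejects.

```python
import numpy as np


def rejects_h0(n, gamma1, Phi, Theta, rng, critical_value=1.96):
    """Simulate one data set with gamma_2 = 0 and report whether the naive t-test rejects."""
    xi = rng.multivariate_normal([0.0, 0.0], Phi, size=n)       # latent variables
    delta = rng.multivariate_normal([0.0, 0.0], Theta, size=n)  # measurement errors
    x = xi + delta                                  # observed, error-contaminated variables
    y = gamma1 * xi[:, 0] + rng.normal(size=n)      # Y depends on xi_1 only
    design = np.column_stack([np.ones(n), x])
    beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta_hat
    sigma2_hat = residuals @ residuals / (n - design.shape[1])
    cov_beta = sigma2_hat * np.linalg.inv(design.T @ design)
    t_stat = beta_hat[2] / np.sqrt(cov_beta[2, 2])
    # 1.96 is the normal approximation to the t critical value (an assumption).
    return abs(t_stat) > critical_value


def type_one_error_rate(n, gamma1, Phi, Theta, n_sims=1000, seed=0):
    """Proportion of simulated data sets in which H0: beta_2 = 0 is rejected."""
    rng = np.random.default_rng(seed)
    return float(np.mean([rejects_h0(n, gamma1, Phi, Theta, rng) for _ in range(n_sims)]))


# Hypothetical parameter values, chosen only for illustration.
PHI = np.array([[1.0, 0.5], [0.5, 1.0]])    # covariance matrix of the latent variables
THETA = np.array([[0.5, 0.0], [0.0, 0.5]])  # covariance matrix of the measurement errors

rates = {n: type_one_error_rate(n, 1.0, PHI, THETA) for n in (50, 200, 1000)}
for n, rate in rates.items():
    print(f"n = {n:5d}   estimated Type I error rate = {rate:.3f}")
```

Under this configuration the estimated rejection rate should climb well above the nominal 0.05 as $n$ grows, in line with the almost sure result of Section 1.1.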
