
First Draft of the paper - University of Toronto


to a quantity different from $\gamma_2 = 0$ as the sample size increases, with the p-value of the standard test going to zero and the Type I error rate going to one.

1.1 Almost sure disaster

Everyone knows that measurement error makes the ordinary least squares estimators asymptotically biased, and the case of two independent variables has been thoroughly studied, most notably by Cochran (1968). The following results are included for completeness, and also because they apply to Model (2), which is a bit more general than usual. First, we obtain a reward for confronting the messy formula for $\hat\beta_2$ in non-matrix form.

\[
\hat\beta_2 = \frac{
\begin{aligned}
&\textstyle (\sum X_{i,1})^2 \sum X_{i,2}Y_i + n\left(\sum X_{i,1}X_{i,2} \sum X_{i,1}Y_i - \sum X_{i,1}^2 \sum X_{i,2}Y_i\right) \\
&\textstyle {} + \sum X_{i,1}^2 \sum X_{i,2} \sum Y_i - \sum X_{i,1}\left(\sum X_{i,1}Y_i \sum X_{i,2} + \sum X_{i,1}X_{i,2} \sum Y_i\right)
\end{aligned}
}{
\begin{aligned}
&\textstyle n(\sum X_{i,1}X_{i,2})^2 - 2\sum X_{i,1} \sum X_{i,1}X_{i,2} \sum X_{i,2} + \sum X_{i,1}^2 (\sum X_{i,2})^2 \\
&\textstyle {} + (\sum X_{i,1})^2 \sum X_{i,2}^2 - n\sum X_{i,1}^2 \sum X_{i,2}^2
\end{aligned}
}
\]

Dividing numerator and denominator by $n^3$ and letting $n$ tend to infinity, the Strong Law of Large Numbers gives us almost sure convergence of $\hat\beta_2$ to

\[
\frac{
\begin{aligned}
&(E[X_1])^2 E[X_2Y] + E[X_1X_2]E[X_1Y] - E[X_1^2]E[X_2Y] \\
&{} + E[X_1^2]E[X_2]E[Y] - E[X_1]\left(E[X_1Y]E[X_2] + E[X_1X_2]E[Y]\right)
\end{aligned}
}{
\begin{aligned}
&(E[X_1X_2])^2 - 2E[X_1]E[X_1X_2]E[X_2] + E[X_1^2](E[X_2])^2 \\
&{} + (E[X_1])^2 E[X_2^2] - E[X_1^2]E[X_2^2]
\end{aligned}
}.
\]

Our focus is upon Type I error for the present, so we examine the case where $H_0: \gamma_2 = 0$ is true. Substituting for the moments in terms of the parameters of Model (2) with $\gamma_2 = 0$ and simplifying, we find that as $n$ tends to infinity,

\[
\hat\beta_2 \xrightarrow{a.s.} \frac{\gamma_1(\phi_{12}\theta_{11} - \phi_{11}\theta_{12})}{(\phi_{11} + \theta_{11})(\phi_{22} + \theta_{22}) - (\phi_{12} + \theta_{12})^2}. \tag{5}
\]

Expression (5) is the asymptotic bias of $\hat\beta_2$ as an estimate of the true regression parameter $\gamma_2$, in the case where $\gamma_2 = 0$. Notice that it does not depend upon the intercept $\alpha$, the measurement bias terms $\nu_1$ and $\nu_2$, nor upon $\kappa_1$ and $\kappa_2$, the expected values of the latent independent variables. Clearly, the bias is zero only if $\gamma_1 = 0$ (the dependent variable is unrelated to $\xi_1$) or if $\phi_{12}\theta_{11} = \phi_{11}\theta_{12}$.
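The substitution that leads to (5) can be sketched as follows. This is only a sketch under assumptions: Model (2) is not restated in this section, so we assume it has the usual additive form $Y = \alpha + \gamma_1\xi_1 + \gamma_2\xi_2 + \epsilon$ and $X_j = \nu_j + \xi_j + \delta_j$ for $j = 1, 2$, with $\epsilon$ (an assumed name for the regression error) independent of the $\xi_j$ and $\delta_j$, and the latent variables independent of the measurement errors. The long ratio of raw moments is the familiar ratio of variances and covariances,

\[
\hat\beta_2 \xrightarrow{a.s.}
\frac{\operatorname{Var}(X_1)\operatorname{Cov}(X_2,Y) - \operatorname{Cov}(X_1,X_2)\operatorname{Cov}(X_1,Y)}
     {\operatorname{Var}(X_1)\operatorname{Var}(X_2) - \operatorname{Cov}(X_1,X_2)^2}.
\]

Under the assumed model with $\gamma_2 = 0$,

\[
\begin{aligned}
\operatorname{Var}(X_1) &= \phi_{11} + \theta_{11}, &
\operatorname{Var}(X_2) &= \phi_{22} + \theta_{22}, &
\operatorname{Cov}(X_1,X_2) &= \phi_{12} + \theta_{12}, \\
\operatorname{Cov}(X_1,Y) &= \gamma_1\phi_{11}, &
\operatorname{Cov}(X_2,Y) &= \gamma_1\phi_{12},
\end{aligned}
\]

so the numerator becomes

\[
(\phi_{11} + \theta_{11})\,\gamma_1\phi_{12} - (\phi_{12} + \theta_{12})\,\gamma_1\phi_{11}
= \gamma_1(\phi_{12}\theta_{11} - \phi_{11}\theta_{12}),
\]

and the denominator is $(\phi_{11} + \theta_{11})(\phi_{22} + \theta_{22}) - (\phi_{12} + \theta_{12})^2$, in agreement with (5).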
Notice the parallel roles played by $\phi_{12}$, the covariance between the latent “true” independent variables, and $\theta_{12}$, the covariance between error terms. If they have opposite signs they pull in the same direction, but if they have the same sign they can partially or even completely offset one another. The effect of $\phi_{12}$ is augmented by the variance of the error in measuring $\xi_1$, while the effect of $\theta_{12}$ is augmented by the variance of $\xi_1$ itself.

The parameter $\theta_{11}$, the variance of the error term $\delta_1$, represents the amount of noise in the independent variable for which one is trying to control, while $\theta_{22}$ is the amount of noise in the independent variable one is trying to test. Clearly, $\theta_{11}$ is a greater potential problem, because $\theta_{22}$ appears only in the denominator; measurement error in the variable for which one is testing actually decreases the asymptotic bias, in this case where $\gamma_2 = 0$. Incidentally, the denominator of (5) is the determinant of the covariance matrix of $X_1$ and $X_2$; it will be positive provided that at least one of $\Phi$ and $\Theta$ is positive definite.

All these details aside, the main point is that when $Y$ is conditionally independent of $\xi_2$, the estimator $\hat\beta_2$ converges to a quantity that is not zero in general. Now, $\hat\beta_2$ is the numerator of the t-statistic commonly used to test $H_0: \beta_2 = 0$ as a substitute for the real null hypothesis $H_0: \gamma_2 = 0$. The denominator, the estimated standard deviation of $\hat\beta_2$, may be written as

\[
S_{\hat\beta_2} = \frac{W_n}{\sqrt{n}}.
\]

Using the same approach that led us to (5), it turns out that $W_n$ converges almost surely to a finite constant, again provided that at least one of the covariance matrices $\Phi$ and $\Theta$ is positive definite. So the numerator of the t-statistic tends to the limit in (5) while its denominator tends to zero. Consequently, whenever that limit is nonzero, the absolute value of the t-statistic blows up to infinity, and the associated p-value converges almost surely to zero. That is, we almost surely commit a Type I error.

1.2 A simulation study of Type I error inflation

In Section 1.1, we see that if the independent variable for which one is trying to “control” is measured with error but we ignore it and use mainstream regression methods, we will commit a Type I error virtually always, if the sample size is large enough.
But how large is large enough, and under what circumstances? How much might Type I error be inflated in practice? To address these questions, we conducted a large-scale simulation study in which we simulated data sets from Model (2) using various sample sizes, probability distributions and parameter values.
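To make the idea concrete, here is a minimal sketch of one cell of such a simulation. It is not the study's actual design: the normal distributions, the parameter values, the large-sample critical value 1.96, and the function names `asymptotic_bias`, `simulate_once` and `type1_rate` are all assumptions chosen for illustration, and Model (2) is assumed to be additive with $\gamma_2 = 0$, zero measurement bias and zero latent means (none of which affect the limit in (5)).

```python
import numpy as np


def asymptotic_bias(gamma1, Phi, Theta):
    """Almost-sure limit of beta2-hat from expression (5), valid when gamma2 = 0."""
    num = gamma1 * (Phi[0, 1] * Theta[0, 0] - Phi[0, 0] * Theta[0, 1])
    den = ((Phi[0, 0] + Theta[0, 0]) * (Phi[1, 1] + Theta[1, 1])
           - (Phi[0, 1] + Theta[0, 1]) ** 2)
    return num / den


def simulate_once(n, gamma1, Phi, Theta, rng, alpha=1.0, psi=1.0):
    """Draw one data set with gamma2 = 0 and return (beta2-hat, t-statistic).

    Assumed setup: normal latent variables, errors and regression noise;
    alpha is the intercept and psi the variance of the regression error.
    """
    xi = rng.multivariate_normal(np.zeros(2), Phi, size=n)       # latent variables
    delta = rng.multivariate_normal(np.zeros(2), Theta, size=n)  # measurement errors
    y = alpha + gamma1 * xi[:, 0] + rng.normal(0.0, np.sqrt(psi), size=n)
    # Naive regression of Y on the observed, error-contaminated X1 and X2.
    X = np.column_stack([np.ones(n), xi + delta])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - X.shape[1])          # residual variance estimate
    cov_beta = s2 * np.linalg.inv(X.T @ X)         # usual OLS covariance matrix
    return beta[2], beta[2] / np.sqrt(cov_beta[2, 2])


def type1_rate(n, reps, gamma1, Phi, Theta, seed=0, crit=1.96):
    """Proportion of data sets in which H0: beta2 = 0 is rejected.

    crit = 1.96 is the normal approximation to the two-sided 5% t critical value.
    """
    rng = np.random.default_rng(seed)
    rejections = sum(
        abs(simulate_once(n, gamma1, Phi, Theta, rng)[1]) > crit
        for _ in range(reps)
    )
    return rejections / reps


if __name__ == "__main__":
    # Illustrative values: latent correlation 0.75, reliability 0.5, no error covariance.
    Phi = np.array([[1.0, 0.75], [0.75, 1.0]])
    Theta = np.eye(2)
    print("limit of beta2-hat from (5):", asymptotic_bias(1.0, Phi, Theta))  # about 0.218
    for n in (50, 200, 1000):
        print(n, type1_rate(n, 500, 1.0, Phi, Theta))
```

Setting `Theta` to a zero matrix removes the measurement error, which gives a baseline against which the inflated rejection rates can be compared. A fuller study would loop this over a grid of sample sizes, distributions and parameter values, as the text describes.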
