
First Draft of the paper - University of Toronto

a valid measurement error regression, using commercially available software. Many statisticians may prefer to use SAS proc calis (SAS Institute Inc., 1999) because they probably have access to it already and the lineqs syntax is straightforward. Others may prefer AMOS because of the graphical interface, while still others (particularly psychometricians) may prefer LISREL because of its long history and the wealth of instructional material that is available. In any case, the hard part is not really with understanding the methods or using the software; it's with having the right kind of data.

In Section 4, we present a more limited set of simulations in which structural equation models are applied to various simulated data sets that employ the test-retest design. The purpose of these simulations is not to be comprehensive, but just to provide some practical guidance. We find that normal-theory likelihood ratio tests work well even when the data are not normal, and that for smaller samples, they generally protect better against Type I error than either Wald tests or tests based on Browne's (1984) robust weighted least-squares method (though these are far, far better than ignoring the measurement error). For substantial amounts of measurement error and strong correlations between the independent variables, we found that tests based on weighted least squares were biased even for n = 1,000, while the normal-theory likelihood ratio test was unbiased; this held for non-normal as well as normal data.

Finally, we ask a rhetorical question. If ignoring measurement error in regression has such awful consequences, and there is a perfectly satisfactory alternative, why are we still teaching our students to do it? In consulting situations, why are we still helping our clients do it? In our view, the only reason is inertia.
It is time for a change.

1 Inflation of Type I error rate

To see how badly things can go wrong, consider a multiple regression model in which there are two independent variables, both measured with simple additive error. The LISREL-type notation (for example Jöreskog, 1978; Bollen, 1989) is employed for compatibility with the discussion of structural equation models later in this paper. Independently for i = 1, ..., n,

$$Y_i = \alpha + \gamma_1 \xi_{i,1} + \gamma_2 \xi_{i,2} + \zeta_i \tag{2}$$

$$X_{i,1} = \nu_1 + \xi_{i,1} + \delta_{i,1}, \qquad X_{i,2} = \nu_2 + \xi_{i,2} + \delta_{i,2},$$

where $\alpha$, $\gamma_1$ and $\gamma_2$ are unknown constants (regression coefficients), and

$$E\begin{bmatrix} \xi_{i,1} \\ \xi_{i,2} \end{bmatrix} = \begin{bmatrix} \kappa_1 \\ \kappa_2 \end{bmatrix}, \qquad \mathrm{Var}\begin{bmatrix} \xi_{i,1} \\ \xi_{i,2} \end{bmatrix} = \Phi = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{12} & \phi_{22} \end{bmatrix},$$

$$E\begin{bmatrix} \delta_{i,1} \\ \delta_{i,2} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \qquad \mathrm{Var}\begin{bmatrix} \delta_{i,1} \\ \delta_{i,2} \end{bmatrix} = \Theta = \begin{bmatrix} \theta_{11} & \theta_{12} \\ \theta_{12} & \theta_{22} \end{bmatrix}, \tag{3}$$

$$E[\zeta_i] = 0, \qquad \mathrm{Var}[\zeta_i] = \psi.$$

The true independent variables are $\xi_{i,1}$ and $\xi_{i,2}$, but they are latent variables that cannot be observed directly. They are independent of the error term $\zeta_i$ and of the measurement errors $\delta_{i,1}$ and $\delta_{i,2}$; the error term is also independent of the measurement errors. The constants $\nu_1$ and $\nu_2$ represent measurement bias. For example, if $\xi_1$ is true average minutes of exercise per day and $X_1$ is reported average minutes of exercise, then $\nu_1$ is the mean amount by which people exaggerate their exercise times.

Also, it is reasonable to assume that errors of measurement may be correlated. Again, suppose that $\xi_1$ is true amount of exercise and $X_1$ is reported amount of exercise, while $\xi_2$ is true age and $X_2$ is reported age. It is natural to imagine that adults who exaggerate how much they exercise might tend to under-report their ages. Thus, the covariance parameter $\theta_{12}$ is quite meaningful.

When a model such as (2) holds, all one can observe are the triples $(X_{i,1}, X_{i,2}, Y_i)$ for i = 1, ..., n. Suppose the interest is in testing whether $\xi_2$ is related to $Y$, conditionally on the value of $\xi_1$. The natural mistake is to take $X_1$ as a surrogate for $\xi_1$ and $X_2$ as a surrogate for $\xi_2$, fit the model

$$Y_i = \beta_0 + \beta_1 X_{i,1} + \beta_2 X_{i,2} + \epsilon_i \tag{4}$$

by ordinary least squares, and test the null hypothesis $H_0: \beta_2 = 0$ as a substitute for $H_0: \gamma_2 = 0$, using the usual t or F-test.

Suppose that in fact $\gamma_2 = 0$, so that conditionally upon the value of $\xi_1$, the dependent variable $Y$ is independent of $\xi_2$.
It turns out that, except under special circumstances, the least squares quantity $\hat{\beta}_2$ converges almost surely
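The resulting inflation is easy to verify numerically. As a rough check on the claim above (our algebra, not from the paper): with uncorrelated measurement errors ($\theta_{12} = 0$), solving the population least squares problem gives $\hat{\beta}_2 \rightarrow \gamma_1\,\theta_{11}\,\phi_{12} / \det(\Phi + \Theta)$, which is nonzero whenever $\gamma_1 \neq 0$ and $\phi_{12} \neq 0$, so the naive t-test rejects $H_0: \beta_2 = 0$ with probability approaching one. The simulation sketch below is our own illustration, not the paper's simulation code, and all parameter values are arbitrary: it generates data from model (2)–(3) with $\gamma_2 = 0$, fits the naive regression (4) by ordinary least squares, and reports how often the usual t-test rejects at the nominal 0.05 level.

```python
import numpy as np

rng = np.random.default_rng(12345)

def naive_rejection_rate(n=200, reps=2000, gamma1=1.0, gamma2=0.0,
                         phi12=0.75, theta=0.75):
    """Generate data from the measurement-error model with true gamma2 = 0,
    fit the naive regression (4) of Y on the observed surrogates by OLS,
    and return the proportion of replications in which the usual t-test
    rejects H0: beta2 = 0 at the (approximate) 0.05 level."""
    Phi = np.array([[1.0, phi12], [phi12, 1.0]])  # Var of latent (xi_1, xi_2)
    rejections = 0
    for _ in range(reps):
        xi = rng.multivariate_normal([0.0, 0.0], Phi, size=n)   # latent predictors
        zeta = rng.normal(0.0, 1.0, size=n)                     # equation error
        delta = rng.normal(0.0, np.sqrt(theta), size=(n, 2))    # measurement error
        y = gamma1 * xi[:, 0] + gamma2 * xi[:, 1] + zeta        # model (2), alpha = 0
        X = xi + delta                                          # surrogates, nu = 0
        A = np.column_stack([np.ones(n), X])                    # design for model (4)
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        sigma2 = resid @ resid / (n - 3)                        # error variance estimate
        se2 = sigma2 * np.linalg.inv(A.T @ A)[2, 2]             # Var-hat of beta2-hat
        t_stat = beta[2] / np.sqrt(se2)
        if abs(t_stat) > 1.96:  # normal critical value; close to t with df = n - 3
            rejections += 1
    return rejections / reps

print(naive_rejection_rate())  # far above the nominal 0.05
```

With these (arbitrary) parameter values the surrogates have reliability about 0.57 and the latent predictors correlate at 0.75, and the observed rejection rate is an order of magnitude above the nominal level; setting `phi12=0` restores a rate near 0.05, consistent with the "special circumstances" caveat.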
