Sweating the Small Stuff: Does data cleaning and testing ... - Frontiers

Kasper and Ünlü | Assumptions of factor analytic approaches
Frontiers in Psychology | Quantitative Psychology and Measurement | March 2013 | Volume 4 | Article 109 | 127

• To what extent does the estimation accuracy for the factorial structure of the classical factor analysis models depend on the skewness of the population latent ability distribution?
• Are there specific aspects of the factorial structure or latent ability distribution with respect to which the classical factor analysis models are more or less robust in estimation when true ability values are skewed?
• Given a skewed population ability distribution, does the estimation accuracy for the factorial structure of the classical factor analysis models depend on the extraction criterion applied for determining the number of factors from the data?
• Can person ability scores estimated under classical factor analytic approaches be representative of the true ability distribution, or properties thereof, when this distribution is skewed?

Mattson's (1997) method can be used for specifying the parameter settings for the simulation study (cf. Section 4.2). We briefly describe this method (for details, see Mattson, 1997). Assume the standardized manifest variables are expressed as $z = A\nu$, where $\nu$ is the vector of latent variables and $A$ is the matrix of model parameters. Moreover, assume that $\nu = T\omega$, where $T$ is a lower triangular square matrix such that each component of $\nu$ is a linear combination of at most two components of $\omega$, $E(\nu\nu') = \Sigma_{\nu} = TT'$, and $\omega$ is a vector of mutually independent standardized random variables $\omega_i$ with finite central moments $\mu_{1i}$, $\mu_{2i}$, $\mu_{3i}$, and $\mu_{4i}$ of order up to four. Then

\[ E(z) = AT\,E(\omega) = 0 \]

and

\[ E(zz') \;(= A\Sigma_{\nu}A') = AT\,E(\omega\omega')\,T'A' = ATT'A'. \]

Or equivalently, $E(z_i z_j) = \gamma_i'\gamma_j$, where $\gamma_i = (a_i'T)'$ and $a_i'$ is the $i$-th row of $A$. Under these conditions the third and fourth order central moments of $z_i$ are given by

\[ E(z_i^3) = \sum_{m} \gamma_{im}^3\,\mu_{3m} \]

and

\[ E(z_i^4) = \sum_{m} \gamma_{im}^4\,\mu_{4m} + 6 \sum_{m \geq 2}\,\sum_{o=1}^{m-1} \gamma_{im}^2\,\gamma_{io}^2 . \]

Hence the univariate skewness $\sqrt{\beta_{1i}}$ and kurtosis $\beta_{2i}$ of any $z_i$ can be calculated by

\[ \sqrt{\beta_{1i}} = \frac{E(z_i^3)}{\left[E(z_i^2)\right]^{3/2}} \quad\text{and}\quad \beta_{2i} = \frac{E(z_i^4)}{\left[E(z_i^2)\right]^{2}} . \]

In the simulation study, the exploratory factor analysis model with orthogonal factors ($\mathrm{cov}(f, f) = I$) and error variables assumed to be uncorrelated and unit normal (with standardized manifest variables) is used as the data generating model. Let $A := (L, I_p)$ be the concatenated matrix of dimension $p \times (k + p)$, where $I_p$ is the unit matrix of order $p \times p$, and let $\nu := (f', e')'$ be the concatenated vector of length $k + p$. Then we have $z = A\nu$ for the simulation factor model. Let $T := I_{(k+p)\times(k+p)}$ and $\omega := \nu$; then $T$ and $\omega$ satisfy the required assumptions mentioned above. Hence the skewness and kurtosis of any $z_i$ are given by, respectively,

\[ \sqrt{\beta_{1i}} = \frac{\sum_{m=1}^{k+p} a_{im}^3\,\mu_{3m}}{\left[a_i'a_i\right]^{3/2}} \]

and

\[ \beta_{2i} = \frac{\sum_{m=1}^{k+p} a_{im}^4\,\mu_{4m} + 6 \sum_{m=2}^{k+p}\,\sum_{o=1}^{m-1} a_{im}^2\,a_{io}^2}{\left[a_i'a_i\right]^{2}} . \]

Mattson's method is used to specify settings for the simulation study of the kind that may be observed in large scale assessment data. The next section describes this in detail.

4.2. DESIGN OF THE SIMULATION STUDY

The number of manifest variables was fixed to $p = 24$ throughout the simulation study. For the number of factors, we used numbers typically found in large scale assessment studies such as the Progress in International Reading Literacy Study (PIRLS, e.g., Mullis et al., 2006) or PISA (e.g., OECD, 2005). According to the assessment framework of PIRLS 2006, the number of dimensions for reading literacy was four; in PISA 2003, the scaling model had seven dimensions. We decided to use a simple loading structure for $L$, in the sense that every manifest variable was assumed to load on only one factor (within-item unidimensionality) and that each factor was measured by the same number of manifest variables. Following PIRLS and PISA, the number of factors in our simulation study was assumed to be four or eight. We assumed that some of the factors were well explained by their indicators while others were not, with upper rows (variables) of the loading matrix generally having higher factor loadings than lower rows (variables). Thus, the loading matrices employed in our study for the four and eight dimensional simulation models were, respectively,

\[
L = \begin{pmatrix}
0.9 & 0 & 0 & 0 \\
0.8 & 0 & 0 & 0 \\
0.7 & 0 & 0 & 0 \\
0.6 & 0 & 0 & 0 \\
0.5 & 0 & 0 & 0 \\
0.4 & 0 & 0 & 0 \\
0 & 0.8 & 0 & 0 \\
0 & 0.7 & 0 & 0 \\
0 & 0.6 & 0 & 0 \\
0 & 0.5 & 0 & 0 \\
0 & 0.4 & 0 & 0 \\
0 & 0.3 & 0 & 0 \\
0 & 0 & 0.6 & 0 \\
0 & 0 & 0.6 & 0 \\
0 & 0 & 0.5 & 0 \\
0 & 0 & 0.4 & 0 \\
0 & 0 & 0.4 & 0 \\
0 & 0 & 0.3 & 0 \\
0 & 0 & 0 & 0.6 \\
0 & 0 & 0 & 0.5 \\
0 & 0 & 0 & 0.5 \\
0 & 0 & 0 & 0.4 \\
0 & 0 & 0 & 0.3 \\
0 & 0 & 0 & 0.3
\end{pmatrix}
\]

and

\[
L = \begin{pmatrix}
0.9 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.8 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.7 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.8 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.8 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.7 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.8 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.7 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.6 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0.7 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0.7 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0.7 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0.7 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0.6 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0.6 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0.6 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0.6 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0.5 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0.4 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0.4 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.4 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.4 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.3
\end{pmatrix} .
\]
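To make the moment formulas concrete, the following is a minimal sketch (not the authors' code) that evaluates the skewness and kurtosis of each manifest variable from the rows of A = (L, I_p) for the four dimensional loading matrix. The function names are hypothetical, and the skewed setting (third and fourth moments 2 and 9 for the factors, which are the values of a standardized exponential distribution) is an illustrative assumption, not a setting reported in the study.

```python
import numpy as np


def build_loading_matrix(blocks):
    """Stack per-factor loading lists into a p x k simple-structure matrix.

    Hypothetical helper: each inner list holds the loadings of the
    variables that measure one factor.
    """
    p = sum(len(b) for b in blocks)
    L = np.zeros((p, len(blocks)))
    row = 0
    for factor, loadings in enumerate(blocks):
        L[row:row + len(loadings), factor] = loadings
        row += len(loadings)
    return L


def mattson_skew_kurt(A, mu3, mu4, T=None):
    """Skewness and kurtosis of each z_i for z = A T omega.

    mu3, mu4: third and fourth central moments of the independent,
    standardized components of omega. T defaults to the identity,
    as in the simulation model.
    """
    A = np.asarray(A, dtype=float)
    mu3 = np.asarray(mu3, dtype=float)
    mu4 = np.asarray(mu4, dtype=float)
    if T is None:
        T = np.eye(A.shape[1])
    G = A @ T                      # row i is gamma_i'
    m2 = (G ** 2).sum(axis=1)      # E(z_i^2) = gamma_i' gamma_i
    m3 = (G ** 3) @ mu3            # E(z_i^3)
    # 6 * sum_{m>o} g_m^2 g_o^2 equals 3 * [(sum g^2)^2 - sum g^4]
    cross = 3.0 * (m2 ** 2 - (G ** 4).sum(axis=1))
    m4 = (G ** 4) @ mu4 + cross    # E(z_i^4)
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Four dimensional loading matrix from the text, six variables per factor.
blocks4 = [
    [0.9, 0.8, 0.7, 0.6, 0.5, 0.4],
    [0.8, 0.7, 0.6, 0.5, 0.4, 0.3],
    [0.6, 0.6, 0.5, 0.4, 0.4, 0.3],
    [0.6, 0.5, 0.5, 0.4, 0.3, 0.3],
]
L4 = build_loading_matrix(blocks4)
p, k = L4.shape
A4 = np.hstack([L4, np.eye(p)])    # A = (L, I_p), dimension p x (k + p)

# Sanity check: all components normal (mu3 = 0, mu4 = 3).
skew_n, kurt_n = mattson_skew_kurt(A4, np.zeros(k + p), np.full(k + p, 3.0))

# Assumed illustrative setting: skewed factors, normal errors.
mu3_s = np.concatenate([np.full(k, 2.0), np.zeros(p)])
mu4_s = np.concatenate([np.full(k, 9.0), np.full(p, 3.0)])
skew_s, kurt_s = mattson_skew_kurt(A4, mu3_s, mu4_s)

print(np.round(skew_s[:6], 3))
print(np.round(kurt_s[:6], 3))
```

With all components normal, every manifest variable should come out with skewness 0 and kurtosis 3, which is a quick check that the cross term is implemented correctly. Under the skewed setting, variables with higher loadings inherit more of the factor's skewness, because the normal error term dilutes it less.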
