13.07.2015 Views

Sweating the Small Stuff: Does data cleaning and testing ... - Frontiers

Sweating the Small Stuff: Does data cleaning and testing ... - Frontiers

Sweating the Small Stuff: Does data cleaning and testing ... - Frontiers

SHOW MORE
SHOW LESS
  • No tags were found...

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Flora et al.Factor analysis assumptionsTable 10 | Product-moment <strong>and</strong> polychoric correlations among Case 4 item-level variables.Variable 1 2 3 4 5 6 7 8 9WrdMean 1 0.81 0.75 0.42 0.47 0.41 0.21 0.21 0.29SntComp 0.52 1 0.62 0.49 0.44 0.55 0.22 0.34 0.13OddWrds 0.66 0.42 1 0.54 0.47 0.53 0.36 0.42 0.50MxdArit 0.31 0.40 0.38 1 0.82 0.86 0.32 0.25 0.27Remndrs 0.43 0.31 0.41 0.54 1 0.76 0.41 0.36 0.42MissNum 0.32 0.46 0.38 0.79 0.51 1 0.35 0.30 0.33Gloves 0.20 0.14 0.33 0.25 0.31 0.28 1 0.35 0.51Boots 0.15 0.29 0.29 0.18 0.24 0.25 0.27 1 0.68Hatchts 0.29 0.12 0.46 0.19 0.38 0.27 0.42 0.45 1N = 100. Product-moment correlations are below <strong>the</strong> diagonal; polychoric correlations are above <strong>the</strong> diagonal. WrdMean, word meaning; SntComp, sentencecompletion; OddWrds, odd words; MxdArit, mixed arithmetic; Remndrs, remainders; MissNum, missing numbers; Hatchts, hatchets.Table 11 | Two-factor model loading matrix obtained with EFA ofproduct-moment R among Case 3 item-level variables.Table 12 | Two-factor model loading matrix obtained with EFA ofproduct-moment R among Case 4 item-level variables.VariableFactorVariableFactorN = 100 N = 200N = 100 N = 200η 1 η 2 η 1 η 2η 1 η 2 η 1 η 2WrdMean 0.53 0.02 0.60 0.00SntComp 0.12 0.37 0.24 0.22OddWrds 0.69 0.04 0.75 −0.06MxdArit −0.13 0.95 −0.01 0.76Remndrs 0.42 0.11 0.37 0.13MissNum 0.12 0.62 −0.01 0.87Gloves 0.44 0.06 0.35 0.07Boots 0.25 0.03 0.32 −0.04Hatchts 0.78 −0.19 0.43 −0.03WrdMean, word meaning; SntComp, sentence completion; OddWrds, odd words;MxdArit, mixed arithmetic; Remndrs, remainders; MissNum, missing numbers;Hatchts, hatchets. Primary loadings for each observed variable are in bold.WrdMean 0.49 0.21 0.46 0.27SntComp 0.23 0.40 0.55 0.12OddWrds 0.68 0.14 0.46 0.36MxdArit −0.11 0.96 0.87 −0.17Remndrs 0.34 0.42 0.58 0.14MissNum 0.02 0.84 0.88 −0.13Gloves 0.47 0.04 0.06 0.45Boots 0.49 −0.01 0.00 0.56Hatchts 0.77 −0.15 −0.06 0.79WrdMean, word meaning; SntComp, sentence completion; OddWrds, odd words;MxdArit, mixed arithmetic; Remndrs, remainders; MissNum, missing numbers;Hatchts, hatchets. Primary loadings for each observed variable are in bold.polychoric correlations among items. Hence, <strong>the</strong>se methods arereferred to as “limited-information” because <strong>the</strong>y collapse <strong>the</strong>complete multi-way frequency <strong>data</strong> into univariate <strong>and</strong> bivariatemarginal information. Full-information factor analysis is an activearea of methodological research, but <strong>the</strong> limited-informationmethod has become quite popular in applied settings <strong>and</strong> simulationstudies indicate that it performs well across a range ofsituations (e.g., Flora <strong>and</strong> Curran, 2004; Forero et al., 2009; alsosee Forero <strong>and</strong> Maydeu-Olivares, 2009, for a study comparing full<strong>and</strong>limited-information modeling). Thus, our remaining presentationfocuses on <strong>the</strong> limited-information, polychoric correlationapproach.The key idea to this approach is <strong>the</strong> assumption that anunobserved, normally distributed continuous variable, y ∗ , underlieseach categorical, ordinally scaled observed variable, y withresponse categories c = 0, 1,..., C. The latent y ∗ links to <strong>the</strong>observed y according toy = cif τ c−1 < y ∗ < τ c ,where each τ is a threshold parameter (i.e., a Z-score) determinedfrom <strong>the</strong> univariate proportions of y with τ 0 =−∞<strong>and</strong> τ C =∞.Adopting Eq. 1, <strong>the</strong> common factor model is <strong>the</strong>n a model for <strong>the</strong>y ∗ variables <strong>the</strong>mselves 11 ,y ∗ = Λη + ε.Like factor analysis of continuous variables, <strong>the</strong> model parametersare actually estimated from <strong>the</strong> correlation structure,where <strong>the</strong>correlations among <strong>the</strong> y ∗ variables are polychoric correlations 12 .Thus, <strong>the</strong> polychoric correlation between two observed, ordinal y11 The factor model can be equivalently conceptualized as a probit regression for<strong>the</strong> categorical dependent variables. Probit regression is nearly identical to logisticregression, but <strong>the</strong> normal cumulative distribution function is used in place of <strong>the</strong>logistic distribution (see Fox, 2008).12 Analogous to <strong>the</strong> incorporation of mean structure among continuous variables,advanced models such as multiple-group measurement models are fitted to <strong>the</strong> setof both estimated thresholds <strong>and</strong> polychoric correlations.<strong>Frontiers</strong> in Psychology | Quantitative Psychology <strong>and</strong> Measurement March 2012 | Volume 3 | Article 55 | 116

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!