13.07.2015 Views

Sweating the Small Stuff: Does data cleaning and testing ... - Frontiers

Sweating the Small Stuff: Does data cleaning and testing ... - Frontiers

Sweating the Small Stuff: Does data cleaning and testing ... - Frontiers

SHOW MORE
SHOW LESS
  • No tags were found...

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Flora et al.Factor analysis assumptionsvariables is an estimate of <strong>the</strong> correlation between two unobserved,continuous y ∗ variables (Olsson, 1979) 13 . Adopting Eq. 3 leads toP ∗ = ΛΨΛ ′ + Θ,where P ∗ is <strong>the</strong> population correlation matrix among y ∗ , whichis estimated with <strong>the</strong> polychoric correlations. Next, because MLestimation provides model estimates that most likely would haveproduced observed multivariate normal <strong>data</strong>, methodologists donot recommend simply substituting polychoric R for productmomentR <strong>and</strong> proceed with ML estimation (e.g., Yang-Wallentinet al., 2010). Instead, we recommend ULS estimation, where<strong>the</strong> purpose is to minimize squared residual polychoric correlations(Muthén, 1978) 14 . The normality assumption for eachunobserved y ∗ is a ma<strong>the</strong>matical convenience that allows estimationof P ∗ ; Flora <strong>and</strong> Curran (2004) showed that polychoriccorrelations remain reasonably accurate under moderate violationof this assumption, which <strong>the</strong>n has only a very small effecton factor analysis results. Finally, this polychoric approach isimplemented in several popular SEM software packages (as wellas R, SAS, <strong>and</strong> Stata), each of which is capable of both EFA<strong>and</strong> CFA.EXAMPLE DEMONSTRATION CONTINUEDTo demonstrate limited-information item factor analysis, we conductedEFA of polychoric correlation matrices among <strong>the</strong> samecategorized variables for Cases 1–4 presented above, again using13 The tetrachoric correlation is a specific type of polychoric correlation that obtainswhen both observed variables are dichotomous.14 Alternatively,“robust weighted least-squares,” also known as “diagonally weightedleast-squares” (DWLS or WLSMV), is also commonly recommended. Simulationstudies suggest that ULS <strong>and</strong> DWLS produce very similar results, with slightly moreaccurate estimates obtained with ULS (Forero et al., 2009; Yang-Wallentin et al.,2010).ULS estimation <strong>and</strong> quartimin rotation. We begin with analysesof <strong>the</strong> approximately symmetric Case 1 <strong>and</strong> Case 2 items. First, <strong>the</strong>sample polychoric correlations are closer to <strong>the</strong> known populationcorrelations than were <strong>the</strong> product-moment correlations among<strong>the</strong>se categorical variables (see Tables 5 <strong>and</strong> 7). For example,<strong>the</strong> population correlation between Word Meaning <strong>and</strong> SentenceCompletion is 0.75 <strong>and</strong> <strong>the</strong> sample polychoric correlation between<strong>the</strong>se two Case 1 items is 0.81,but <strong>the</strong> product-moment correlationwas only 0.60 (both with N = 100). As with product-moment R,<strong>the</strong> EFA of polychoric R among Case 1 variables also strongly suggestedretaining a three-factor model at both N = 100 <strong>and</strong> N = 200;<strong>the</strong> rotated factor-loading matrices are in Table 14. Here, <strong>the</strong> loadingof each observed variable on its primary factor is consistentlylarger than that obtained with <strong>the</strong> EFA of product-moment R(compare to Table 6) <strong>and</strong> much closer to <strong>the</strong> population factorloadings in Table 2. Next, with Case 2 variables, <strong>the</strong> EFA ofpolychoric R again led to a three-factor model at both N = 100<strong>and</strong> N = 200. The primary factor loadings in Table 15 were onlyslightly larger than those obtained with product-moment R (inTable 8), which were <strong>the</strong>mselves reasonably close to <strong>the</strong> populationfactor loadings. Thus, in a situation where strong attenuationof product-moment correlations led to strong underestimationof factor loadings obtained from EFA of product-moment R (i.e.,Case 1), an alternate EFA of polychoric R produced relatively accuratefactor loadings. Yet, when attenuation of product-momentcorrelations is less severe (i.e., Case 2), EFA of polychoric R wasstill just as accurate.Recall that in Case 3 <strong>and</strong> 4, some items were positively skewed<strong>and</strong> some were negatively skewed. First, <strong>the</strong> polychoric correlationsamong Case 3 <strong>and</strong> 4 items are generally closer to <strong>the</strong>population correlations than were <strong>the</strong> product-moment correlations,although many of <strong>the</strong> polychoric correlations among Case3 dichotomous items are quite inaccurate (see Tables 9 <strong>and</strong> 10).For example, <strong>the</strong> population correlation between Word MeaningTable 13 | Three-factor model loading matrix obtained with EFA ofproduct-moment R among Case 4 item-level variables.Table 14 | Three-factor model loading matrix obtained with EFA ofpolychoric R among Case 1 item-level variables.VariableFactorVariableFactorN = 100 N = 200N = 100 N = 200η 1 η 2 η 3 η 1 η 2 η 3η 1 η 2 η 3 η 1 η 2 η 3WrdMean 1.05 −0.10 −0.04 0.95 −0.10 −0.05SntComp 0.43 0.30 −0.05 0.50 0.23 −0.03OddWrds 0.54 0.05 0.32 0.68 0.04 0.14MxdArit 0.17 0.92 −0.04 0.16 0.94 −0.03Remndrs 0.17 0.40 0.25 0.16 0.46 0.17MissNum −0.02 0.87 0.06 0.03 0.85 0.00Gloves 0.00 0.10 0.47 0.00 0.06 0.46Boots 0.01 0.05 0.48 0.02 0.00 0.54Hatchts −0.01 −0.14 0.95 −0.03 −0.08 0.92WrdMean, word meaning; SntComp, sentence completion; OddWrds, odd words;MxdArit, mixed arithmetic; Remndrs, remainders; MissNum, missing numbers;Hatchts, hatchets. Primary loadings for each observed variable are in bold.WrdMean 0.95 −0.07 −0.05 0.94 −0.11 −0.02SntComp 0.85 0.14 −0.09 0.82 0.19 −0.05OddWrds 0.80 0.02 0.24 0.81 0.03 0.14MxdArit −0.02 0.91 0.04 −0.02 0.93 −0.02Remndrs −0.04 0.93 0.06 −0.05 0.86 0.16MissNum 0.12 0.82 −0.12 0.12 0.87 −0.10Gloves −0.01 0.22 0.51 0.04 0.03 0.59Boots 0.10 0.12 0.68 0.02 0.03 0.72Hatchts −0.01 −0.04 0.96 0.02 −0.01 0.92WrdMean, word meaning; SntComp, sentence completion; OddWrds, odd words;MxdArit, mixed arithmetic; Remndrs, remainders; MissNum, missing numbers;Hatchts, hatchets. Primary loadings for each observed variable are in bold.www.frontiersin.org March 2012 | Volume 3 | Article 55 | 117

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!