
Sweating the Small Stuff: Does data cleaning and testing ... - Frontiers



Kasper and Ünlü
Assumptions of factor analytic approaches

[Figure 2 plot. Panels: PLPC, STPC, KGPC, PLEA, STEA, KGEA, PLPA, STPA, KGPA. Axes: number of extracted factors, relative frequency. Legend: Normal, Slightly skewed, Strongly skewed.]

FIGURE 2 | Relative frequencies of the numbers of extracted factors, for n = 200 and k = 4. Factor models are principal component analysis (PCA, or PC), exploratory factor analysis (EFA, or EA), and principal axis analysis (PAA, or PA). Kaiser-Guttman criterion (KG), scree test (ST), and parallel analysis (PL) serve as factor extraction criteria.

words, strongly negative skewed distributions may not be estimated without bias based on the classical factor models. Increasing sample size, for example from n = 200 to 600, or changing the number of underlying factors, say from k = 4 to 8, did not alter this observation considerably. For that reason, the corresponding plots at this point of the paper are omitted and can be found in Kasper (2012).

We performed Shapiro-Wilk tests for univariate normality of the estimated factor scores. As can be seen from Figure 7A, under normally distributed true latent ability scores nearly all values of W are statistically non-significant. In these cases, the null hypothesis cannot be rejected.

A similar conclusion can be drawn when the true latent ability values are not normally distributed but instead follow a slightly skewed distribution (Figure 7B). Nearly all Shapiro-Wilk test statistic values are statistically non-significant. In other words, the null hypothesis stating normally distributed latent ability values is seldom rejected although the true latent distribution is skewed and not normal. No relationship between the p-values and the used factor model or factor position may be apparent (disregarding the observation that the p-values for the fourth factor are generally lower than for the other factors).

The case of a strongly skewed factor score distribution is depicted in Figure 7C. Virtually all values of W are statistically significant and the null hypothesis of normality of factor scores is rejected. Similar conclusions or observations may be drawn for increased sample size or factor space dimension and we do omit presenting plots thereof.

Finally, Figure 8 shows the distribution of the estimated factor scores on the fourth factor (for k = 4) in comparison to the true strongly skewed ability distribution under the exploratory factor analysis model for a sample size of n = 1,000. The unit normal distribution is plotted as a reference. The estimated factor scores have a skewness value of −0.47 compared to true skewness −2. The estimated distribution deviates from the true distribution and does not approximate it acceptably well.

Frontiers in Psychology | Quantitative Psychology and Measurement, March 2013 | Volume 4 | Article 109 | 131
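The normality check described above can be illustrated with a minimal sketch. It is not the authors' simulation code: the helper name `normality_summary`, the seed, the significance level of 0.05, and the choice of a negated, centered exponential distribution (population skewness −2) as a stand-in for a strongly skewed latent distribution are all assumptions made here for illustration. The sketch applies the Shapiro-Wilk test and computes sample skewness for raw simulated scores, not for scores estimated by a factor model.

```python
import numpy as np
from scipy import stats


def normality_summary(scores, alpha=0.05):
    """Shapiro-Wilk W, p-value, sample skewness, and the test decision.

    `alpha` is an assumed significance level, not one taken from the paper.
    """
    scores = np.asarray(scores, dtype=float)
    w, p = stats.shapiro(scores)
    return {
        "W": float(w),
        "p": float(p),
        "skewness": float(stats.skew(scores)),
        "reject_normality": bool(p < alpha),
    }


rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 200

# Normally distributed scores (unit normal).
normal_scores = rng.standard_normal(n)

# Assumed stand-in for a strongly left-skewed distribution: a gamma
# variable with shape 1 has mean 1, variance 1, and skewness 2, so
# centering and negating it gives mean 0, variance 1, skewness -2.
skewed_scores = -(rng.gamma(shape=1.0, scale=1.0, size=n) - 1.0)

for label, sample in [("normal", normal_scores), ("strongly skewed", skewed_scores)]:
    print(label, normality_summary(sample))
```

In a study like the one described, the same summary would be computed for each column of an estimated factor score matrix, one column per extracted factor, and the decisions tallied across replications.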
