Sweating the Small Stuff: Does data cleaning and testing ... - Frontiers

Osborne | Assumptions and data cleaning

go beyond the basics of data cleaning and testing assumptions, to show that assumptions and quality data are still relevant and important in the 21st century. They went above and beyond this challenge in many interesting and unexpected ways. I hope that this is the beginning, or a continuation, of an important discussion that strikes at the very heart of our quantitative disciplines; namely, whether we can trust any of the results we read in journals, and whether we can apply (or generalize) those results beyond the limited scope of the original sample.

REFERENCES
Boneau, C. A. (1960). The effects of violations of assumptions underlying the t test. Psychol. Bull. 57, 49–64. doi: 10.1037/h0041412
Box, G. (1953). Non-normality and tests on variances. Biometrika 40, 318.
Feir-Walsh, B., and Toothaker, L. (1974). An empirical comparison of the ANOVA F-test, normal scores test and Kruskal-Wallis test under violation of assumptions. Educ. Psychol. Meas. 34, 789. doi: 10.1177/001316447403400406
Havlicek, L. L., and Peterson, N. L. (1977). Effect of the violation of assumptions upon significance levels of the Pearson r. Psychol. Bull. 84, 373–377. doi: 10.1037/0033-2909.84.2.373
Keselman, H. J., Huberty, C. J., Lix, L. M., Olejnik, S., Cribbie, R. A., Donahue, B., et al. (1998). Statistical practices of educational researchers: an analysis of their ANOVA, MANOVA, and ANCOVA analyses. Rev. Educ. Res. 68, 350–386. doi: 10.3102/00346543068003350
Lix, L., Keselman, J., and Keselman, H. (1996). Consequences of assumption violations revisited: a quantitative review of alternatives to the one-way analysis of variance “F” test. Rev. Educ. Res. 66, 579–619.
Maxwell, S., and Delaney, H. (1990). Designing Experiments and Analyzing Data: a Model Comparison Perspective. Pacific Grove, CA: Brooks Cole Publishing Company.
Osborne, J. W. (2008). Sweating the small stuff in educational psychology: how effect size and power reporting failed to change from 1969 to 1999, and what that means for the future of changing practices. Educ. Psychol. 28, 1–10. doi: 10.1080/01443410701491718
Osborne, J. W. (2012). Best Practices in Data Cleaning: A Complete Guide to Everything You Need to Do Before and After Collecting Your Data. Thousand Oaks, CA: Sage Publications.
Osborne, J. W., Kocher, B., and Tillman, D. (2012). “Sweating the small stuff: do authors in APA journals clean data or test assumptions (and should anyone care if they do),” in Paper presented at the Annual meeting of the Eastern Education Research Association, (Hilton Head, SC).
Pearson, E. (1931). The analysis of variance in cases of non-normal variation. Biometrika 23, 114.
Pearson, K. (1901). Mathematical contribution to the theory of evolution. VII: On the correlation of characters not quantitatively measurable. Philos. Trans. R. Soc. Lond. B Biol. Sci. 195, 1–47.
Student. (1908). The probable error of a mean. Biometrika 6, 1–25.
Vardeman, S., and Morris, M. (2003). Statistics and Ethics. Am. Stat. 57, 21–26. doi: 10.1198/0003130031072
Wilcox, R. (1987). New designs in analysis of variance. Ann. Rev. Psychol. 38, 29–60. doi: 10.1146/annurev.ps.38.020187.000333

Received: 16 April 2013; accepted: 06 June 2013; published online: 25 June 2013.

Citation: Osborne JW (2013) Is data cleaning and the testing of assumptions relevant in the 21st century? Front. Psychol. 4:370. doi: 10.3389/fpsyg.2013.00370

This article was submitted to Frontiers in Quantitative Psychology and Measurement, a specialty of Frontiers in Psychology.

Copyright © 2013 Osborne. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third party graphics etc.

www.frontiersin.org June 2013 | Volume 4 | Article 370 | 7
ORIGINAL RESEARCH ARTICLE
published: 14 May 2012
doi: 10.3389/fpsyg.2012.00137

Are assumptions of well-known statistical techniques checked, and why (not)?

Rink Hoekstra 1,2 *, Henk A. L. Kiers 2 and Addie Johnson 2
1 GION – Institute for Educational Research, University of Groningen, Groningen, The Netherlands
2 Department of Psychology, University of Groningen, Groningen, The Netherlands

Edited by: Jason W. Osborne, Old Dominion University, USA
Reviewed by: Jason W. Osborne, Old Dominion University, USA; Jelte M. Wicherts, University of Amsterdam, The Netherlands
*Correspondence: Rink Hoekstra, GION, University of Groningen, Grote Rozenstraat 3, 9712 TG Groningen, The Netherlands. e-mail: r.hoekstra@rug.nl

A valid interpretation of most statistical techniques requires that one or more assumptions be met. In published articles, however, little information tends to be reported on whether the data satisfy the assumptions underlying the statistical techniques used. This could be due to self-selection: Only manuscripts with data fulfilling the assumptions are submitted. Another explanation could be that violations of assumptions are rarely checked for in the first place. We studied whether and how 30 researchers checked fictitious data for violations of assumptions in their own working environment. Participants were asked to analyze the data as they would their own data, for which often used and well-known techniques such as the t-procedure, ANOVA and regression (or non-parametric alternatives) were required. It was found that the assumptions of the techniques were rarely checked, and that if they were, it was regularly by means of a statistical test. Interviews afterward revealed a general lack of knowledge about assumptions, the robustness of the techniques with regards to the assumptions, and how (or whether) assumptions should be checked. These data suggest that checking for violations of assumptions is not a well-considered choice, and that the use of statistics can be described as opportunistic.

Keywords: assumptions, robustness, analyzing data, normality, homogeneity

INTRODUCTION
Most statistical techniques require that one or more assumptions be met, or, in the case that it has been proven that a technique is robust against a violation of an assumption, that the assumption is not violated too extremely. Applying the statistical techniques when assumptions are not met is a serious problem when analyzing data (Olsen, 2003; Choi, 2005). Violations of assumptions can seriously influence Type I and Type II errors, and can result in overestimation or underestimation of the inferential measures and effect sizes (Osborne and Waters, 2002). Keselman et al. (1998) argue that “The applied researcher who routinely adopts a traditional procedure without giving thought to its associated assumptions may unwittingly be filling the literature with nonreplicable results” (p. 351). Vardeman and Morris (2003) state “...absolutely never use any statistical method without realizing that you are implicitly making assumptions, and that the validity of your results can never be greater than that of the most questionable of these” (p. 26). According to the sixth edition of the APA Publication Manual, the methods researchers use “...must support their analytic burdens, including robustness to violations of the assumptions that underlie them...” [American Psychological Association (APA, 2009); p. 33]. The Manual does not explicitly state that researchers should check for possible violations of assumptions and report whether the assumptions were met, but it seems reasonable to assume that in the case that researchers do not check for violations of assumptions, they should be aware of the robustness of the technique.

Many articles have been written on the robustness of certain techniques with respect to violations of assumptions (e.g., Kohr and Games, 1974; Bradley, 1980; Sawilowsky and Blair, 1992; Wilcox and Keselman, 2003; Bathke, 2004), and many ways of checking to see if assumptions have been met (as well as solutions to overcoming problems associated with any violations) have been proposed (e.g., Keselman et al., 2008). Using a statistical test is one of the frequently mentioned methods of checking for violations of assumptions (for an overview of statistical methodology textbooks that directly or indirectly advocate this method, see e.g., Hayes and Cai, 2007). However, it has also been argued that it is not appropriate to check assumptions by means of tests (such as Levene’s test) carried out before deciding on which statistical analysis technique to use because such tests compound the probability of making a Type I error (e.g., Schucany and Ng, 2006). Even if one desires to check whether or not an assumption is met, two problems stand in the way. First, assumptions are usually about the population, and in a sample the population is by definition not known. For example, it is usually not possible to determine the exact variance of the population in a sample-based study, and therefore it is also impossible to determine that two population variances are equal, as is required for the assumption of equal variances (also referred to as the assumption of homogeneity of variances) to be satisfied. Second, because assumptions are usually defined in a very strict way (e.g., all groups have equal variances in the population, or the variable is normally distributed in the population), the assumptions cannot reasonably be expected to be satisfied. Given these complications, researchers can usually only examine whether assumptions are not violated “too much” in their sample; for deciding on what is too much, information about

www.frontiersin.org May 2012 | Volume 3 | Article 137 | 8
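The point made in the introduction, that equality of two population variances cannot be established from samples, can be illustrated with a small simulation. The sketch below is not taken from the article: the function name `variance_ratios`, the group size of 20, the 2000 replications, the seed, and the cut-off of 1.5 are all illustrative assumptions. It repeatedly draws two samples from the same normal population, so the assumption of equal variances holds exactly, and records how far apart the two sample variances nevertheless fall.

```python
import random
import statistics


def variance_ratios(n_per_group=20, n_replications=2000, sigma=1.0, seed=12345):
    """Draw two samples from the SAME normal population many times and
    return, for each replication, the ratio of the larger sample variance
    to the smaller one. All default values are illustrative assumptions."""
    rng = random.Random(seed)  # local generator so the run is reproducible
    ratios = []
    for _ in range(n_replications):
        group_a = [rng.gauss(0.0, sigma) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, sigma) for _ in range(n_per_group)]
        var_a = statistics.variance(group_a)  # unbiased sample variance
        var_b = statistics.variance(group_b)
        ratios.append(max(var_a, var_b) / min(var_a, var_b))
    return ratios


ratios = variance_ratios()
share_above_1_5 = sum(r > 1.5 for r in ratios) / len(ratios)
print(f"median variance ratio: {statistics.median(ratios):.2f}")
print(f"share of replications with ratio > 1.5: {share_above_1_5:.2f}")
```

Even though the population variances are identical by construction, a sizeable share of replications shows one sample variance clearly larger than the other. This is one way to see why a sample can only indicate whether an assumption appears to be violated "too much", not whether it holds in the population.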
