Sweating the Small Stuff: Does data cleaning and testing ... - Frontiers

Lages and Jaworska
How predictable are voluntary decisions?

APPENDIX

Table A1 | Replication study 1: left/right key press in spontaneous motor decision task.

| Subject | Resp. bias left/right | Bino. test p-value | No of Runs | Runs test p-value | Lilliefors p-value | SVM pred acc |
|---------|-----------------------|--------------------|------------|-------------------|--------------------|--------------|
| [1]     | 21/11                 | 0.1102             | 11         | 0.1194            | 0.0957             | [0.5875]     |
| 2       | 16/16                 | 1.0                | 17         | 1.0               | 0.1659             | 0.3973       |
| 3       | 13/19                 | 0.3771             | 16         | 1.0               | 0.1116             | 0.2672       |
| [4]     | 11/21                 | 0.1102             | 15         | 1.0               | 0.0017**           | [0.4970]     |
| 5       | 14/18                 | 0.5966             | 26         | 0.0009***         | 0.4147             | 0.8218       |
| 6       | 15/17                 | 0.8601             | 15         | 0.6009            | 0.0893             | 0.5607       |
| [7]     | 22/10*                | 0.0501(*)          | 17         | 0.4906            | >0.5               | [0.4259]     |
| 8       | 20/12                 | 0.2153             | 18         | 0.5647            | >0.5               | 0.6295       |
| [9]     | 10/22*                | 0.0501(*)          | 15         | 1.0               | 0.001***           | [0.2845]     |
| 10      | 18/14                 | 0.5966             | 21         | 0.1689            | 0.0936             | 0.6789       |
| [11]    | 31/1***               | 0.0001***          | 3          | 1.0               | >0.5               | [NA]         |
| 12      | 19/13                 | 0.3771             | 20         | 0.2516            | 0.4147             | 0.5790       |
| 13      | 20/12                 | 0.2153             | 22         | 0.0280(*)         | 0.3476             | 0.7267       |
| 14      | 14/18                 | 0.5966             | 21         | 0.1689            | 0.1659             | 0.6098       |
| 15      | 16/16                 | 1.0                | 21         | 0.2056            | 0.1116             | 0.6330       |
| [16]    | 4/28***               | 0.0001***          | 5          | 0.0739            | 0.0017***          | [0.7513]     |
| 17      | 17/15                 | 0.8601             | 22         | 0.0989            | >0.5               | 0.7026       |
| [18]    | 11/21                 | 0.1102             | 16         | 0.9809            | 0.0893             | [0.2985]     |
| 19      | 12/20                 | 0.2153             | 7          | 0.00092***        | 0.001***           | 0.7891       |
| [20]    | 11/21                 | 0.1102             | 16         | 0.9809            | 0.0936             | [0.2844]     |
| Tot/Avg | 4.25 #                | 4 (2)              | 16.2       | 3 (2)             | 4 (4)              | 0.5539       |

[ ] Participant excluded according to Soon et al.’s (2008) response criterion (binomial test p < 0.11); # average deviation: $\sum_i |x_i - n/2| / N$ for n = 32 and N = 20.
*p < 0.05, **p < 0.01, ***p < 0.001; (·) number of violations adjusted for multiple tests after Sidak–Dunn.

Table A2 | Replication study 2: addition/subtraction in hidden intention task.

| Subject | Resp. bias add/sub | Bino. test p-value | No of Runs | Runs test p-value | Lilliefors p-value | SVM pred acc |
|---------|--------------------|--------------------|------------|-------------------|--------------------|--------------|
| 1       | 11/21              | 0.1102             | 9          | 0.0184(*)         | 0.0112(**)         | 0.7725       |
| 2       | 14/18              | 0.5966             | 10         | 0.0215(*)         | 0.0271(*)          | 0.7124       |
| 3       | 11/21              | 0.1102             | 8          | 0.0057(**)        | 0.1153             | 0.6991       |
| 4       | 14/18              | 0.5966             | 9          | 0.0071(**)        | 0.001***           | 0.7847       |
| 5       | 21/11              | 0.1102             | 12         | 0.2405            | 0.001***           | 0.5932       |
| 6       | 2/30***            | 0.0001***          | 5          | 1.0               | >0.5               | N/A          |
| 7       | 17/15              | 0.8601             | 18         | 0.8438            | 0.3267             | 0.6074       |
| 8       | 20/12              | 0.2153             | 12         | 0.1799            | 0.0016**           | 0.5904       |
| 9       | 6/26***            | 0.0005***          | 4          | 0.0006***         | 0.4443             | 0.8351       |
| 10      | 18/14              | 0.5966             | 16         | 0.9285            | 0.2313             | 0.3972       |
| 11      | 9/23*              | 0.0201(*)          | 13         | 0.8488            | 0.4030             | 0.4434       |
| 12      | 9/23*              | 0.0201(*)          | 10         | 0.1273            | >0.5               | 0.6131       |
| Tot/Avg | 5.33 #             | 4 (2)              | 10.5       | 5 (1)             | 5 (3)              | 0.6408       |

*p < 0.05, **p < 0.01, ***p < 0.001; # average deviation: $\sum_i |x_i - n/2| / N$ for n = 32 and N = 12. (·) Number of violations adjusted for multiple tests after Sidak–Dunn.

Frontiers in Psychology | Quantitative Psychology and Measurement March 2012 | Volume 3 | Article 56 | 149
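The response-bias columns and the "Tot/Avg" rows of Tables A1 and A2 can be checked with a short sketch. The Python below is not from the article. It assumes the binomial test is an exact two-sided test against p = 0.5, which reproduces the tabulated 0.1102 for a 21/11 split, and it applies the average-deviation formula from the table footnotes. The function and variable names are illustrative only.

```python
from math import comb

def binom_two_sided_p(k, n):
    """Exact two-sided binomial p-value for k successes in n trials, H0: p = 0.5."""
    tail = min(k, n - k)  # the null distribution is symmetric, so use the smaller tail
    one_sided = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2 * one_sided)

def average_deviation(counts, n):
    """Footnote formula: sum_i |x_i - n/2| / N, with N = number of participants."""
    return sum(abs(x - n / 2) for x in counts) / len(counts)

# First number of each response-bias entry (left presses / additions),
# transcribed from Tables A1 and A2.
table_a1_left = [21, 16, 13, 11, 14, 15, 22, 20, 10, 18,
                 31, 19, 20, 14, 16, 4, 17, 11, 12, 11]
table_a2_add = [11, 14, 11, 14, 21, 2, 17, 20, 6, 18, 9, 9]

print(round(binom_two_sided_p(21, 32), 4))            # 0.1102
print(average_deviation(table_a1_left, 32))           # 4.25
print(round(average_deviation(table_a2_add, 32), 2))  # 5.33
```

Under these assumptions the computed values agree with the tabulated binomial p-values and with the average deviations of 4.25 and 5.33.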
METHODS ARTICLE
published: 25 September 2012
doi: 10.3389/fpsyg.2012.00354

Using classroom data to teach students about data cleaning and testing assumptions

Kevin Cummiskey 1*, Shonda Kuiper 2 and Rodney Sturdivant 1
1 Department of Mathematical Sciences, United States Military Academy, West Point, NY, USA
2 Department of Mathematics and Statistics, Grinnell College, Grinnell, IA, USA

Edited by: Jason W. Osborne, Old Dominion University, USA
Reviewed by: Steven E. Stemler, Wesleyan University, USA; Fernando Marmolejo-Ramos, University of Adelaide, Australia; Martin Dempster, Queen’s University Belfast, UK
*Correspondence: Kevin Cummiskey, Department of Mathematical Sciences, United States Military Academy, West Point, NY, USA. e-mail: kevin.cummiskey@usma.edu

This paper discusses the influence that decisions about data cleaning and violations of statistical assumptions can have on drawing valid conclusions to research studies. The datasets provided in this paper were collected as part of a National Science Foundation grant to design online games and associated labs for use in undergraduate and graduate statistics courses that can effectively illustrate issues not always addressed in traditional instruction. Students play the role of a researcher by selecting from a wide variety of independent variables to explain why some students complete games faster than others. Typical project data sets are “messy,” with many outliers (usually from some students taking much longer than others) and distributions that do not appear normal. Classroom testing of the games over several semesters has produced evidence of their efficacy in statistics education. The projects tend to be engaging for students and they make the impact of data cleaning and violations of model assumptions more relevant.
We discuss the use of one of the games and its associated guided lab to introduce students to issues prevalent in real data, the challenges involved in data cleaning, and the dangers that arise when model assumptions are violated.

Keywords: Guided Interdisciplinary Statistics Games and Labs, messy data, model assumptions

INTRODUCTION

The decisions that researchers make when analyzing their data can have significant impacts on the conclusions of a scientific study. In most cases, methods exist for checking model assumptions, but there are few absolute rules for determining when assumptions are violated and what to do in those cases. For example, when using t-tests or ANOVA, decisions about normality, equal variances, or how to handle outliers are often left to the discretion of the researcher. While many statistics courses discuss model assumptions and data cleaning (such as removing outliers or erroneous data), students rarely face data analysis challenges where they must make and defend decisions. As a result, the impacts of these decisions are rarely discussed in detail.

The topics of data cleaning and testing of assumptions are particularly relevant given several high-profile retractions of articles published in peer-reviewed psychology journals because of data-related issues. In June 2012, Dr. Dirk Smeesters resigned from his position at Erasmus University and had a paper retracted from the Journal of Experimental Psychology after it was determined that his data were statistically highly unlikely. He admitted to removing some data points that did not support his hypothesis, claiming that this practice is common in psychology and marketing research (Gever, 2012). Simmons et al. (2011) show that even when researchers have good intentions, they control so many conditions of the experiment that they are almost certain to show statistically significant evidence for their hypothesis in at least one set of conditions.
These conditions include the size of the sample, which outliers are removed, and how the data is transformed. They argue that the ambiguity of how to make these decisions and the researcher’s desire to obtain significant results are the primary reasons for the large number of false positives in scientific literature (Simmons et al., 2011).

Replication studies are designed to ensure the integrity of scientific results and may help detect issues associated with the data and model assumptions. However, replication is not a panacea. Miller (2012) shows that the probability of replicating a significant effect is essentially unknowable to the researcher, so the scientific community may not always correctly interpret replication study results. Assessing the quality of researchers’ decisions involving data is made harder still by the fact that relatively few replication studies are done on published research. Journals prefer to publish original research and rarely publish replications of previous studies even if the replication shows no effect. Therefore, there is little incentive for researchers to replicate others’ results, which increases the likelihood that studies resulting in false positives are accepted into scientific journals. Dr. Brian Nosek and a group working on the ambitious Reproducibility Project are attempting to replicate every study published in three major psychology journals in 2008 (Barlett, 2012).

Through student-generated datasets, we demonstrate how students in the same class, with the same raw dataset, and using the same statistical technique can draw different conclusions without realizing the assumptions they made in their analysis. There is much truth to Esar’s (1949) humorous saying “Statistics [is] the only science that enables different experts using the same figures to draw different conclusions.” Instead of having our students believe that statistics is simply a set of step-by-step calculations,

www.frontiersin.org September 2012 | Volume 3 | Article 354 | 150
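The point that the same raw data and the same test can lead to different conclusions can be sketched in a few lines of Python. The completion times below are made up for illustration and are not the article's classroom data. The 1.5 × IQR cleaning rule and Welch's t statistic are assumed choices, one of several that a student might reasonably defend.

```python
from math import sqrt
from statistics import mean, quantiles, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def drop_iqr_outliers(data, k=1.5):
    """Keep values within k * IQR of the quartiles (one common cleaning rule)."""
    q1, _, q3 = quantiles(data, n=4)
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [x for x in data if lo <= x <= hi]

# Hypothetical completion times in seconds for two game conditions.
game_a = [30, 32, 35, 31, 33, 34, 36, 120]  # one player took far longer
game_b = [38, 40, 37, 41, 39, 42, 40, 43]

t_raw = welch_t(game_a, game_b)
t_clean = welch_t(drop_iqr_outliers(game_a), drop_iqr_outliers(game_b))

print(f"raw data:    t = {t_raw:.2f}")
print(f"IQR-cleaned: t = {t_clean:.2f}")
```

With the slow player included, the two conditions are hard to tell apart (t is near 0.35). With that single value removed, game A appears clearly faster (t is near -6.5). Neither analysis is wrong by construction, which is why the cleaning decision has to be stated and defended.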