Nimon et al. | The assumption of reliable data

data, the effect of error on observed score correlation is more complicated than Eqs 2 or 3 suggest. In fact, it is not uncommon for observed score correlations to be greater than the square root of the product of their reliabilities (e.g., see Dozois et al., 1998). In such cases, Spearman's correction formula (Eq. 2) will result in correlations greater than 1.00. While the literature indicates that such correlations should be truncated to unity (e.g., Onwuegbuzie et al., 2004), a truncated correlation of 1.00 may be a less accurate estimate of the true score correlation than its observed score counterpart. As readers will see, observed score correlations may be less than or greater than their true score counterparts, and therefore less or more accurate than correlations adjusted by Spearman's (1904) formula.

To understand why observed score correlations may not always be less than their true score counterparts, we present Charles's "correction for the full effect of measurement error" (Charles, 2005, p. 226). Although his formula cannot be used when "true scores and error scores are unknown," the formula clarifies the roles that reliability and error play in the formation of observed score correlation and identifies "the assumptions made in the derivation of the correction for attenuation" formula (Zimmerman, 2007, p.
923). Moreover, in the case when observed scores are available and true and error scores are hypothesized, the quantities in his formula can be given specific values and the full effect of measurement error on sample data can be observed.

Charles's (2005) formula extends Spearman's (1904) formula by taking into account correlations between error scores, and between true scores and error scores, that can occur in sample data:

$$
r_{T_X T_Y} = \frac{r_{O_X O_Y}}{\sqrt{r_{XX}\,r_{YY}}}
- \frac{r_{E_X E_Y}\sqrt{e_{XX}\,e_{YY}}}{\sqrt{r_{XX}\,r_{YY}}}
- \frac{r_{T_X E_Y}\sqrt{e_{YY}}}{\sqrt{r_{YY}}}
- \frac{r_{T_Y E_X}\sqrt{e_{XX}}}{\sqrt{r_{XX}}}
\tag{4}
$$

Although not explicit, Charles's formula considers the correlations that exist between true scores and error scores of individual measures by defining error (e.g., $e_{XX}$, $e_{YY}$) as the ratio between error and observed score variance. Although error is traditionally represented as 1 − reliability (e.g., $1 - r_{XX}$), such representation is only appropriate for population data, as the correlation between true scores and error scores for a given measure (e.g., $r_{T_X E_X}$) is assumed to be 0 in the population. Just as with $r_{E_X E_Y}$, $r_{T_X E_Y}$, and $r_{T_Y E_X}$, correlations between true and error scores of individual measures ($r_{T_X E_X}$, $r_{T_Y E_Y}$) are not necessarily 0 in sample data. Positive correlations between true and error scores result in errors (e.g., $e_{XX}$) that are less than 1 − reliability (e.g., $1 - r_{XX}$), while negative correlations result in errors that are greater than 1 − reliability, as indicated in the following formula (Charles, 2005):

$$
e_{XX} = 1 - r_{XX} - \frac{\operatorname{cov}\left(T_X, E_X\right)}{S^2_{O_X}}
\tag{5}
$$

Through a series of simulation tests, Zimmerman (2007) demonstrated that an equivalent form of Eq.
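To make the role of each quantity in Eq. 4 concrete, here is a minimal Python sketch (the function and argument names are ours, not from the article). With the three nuisance correlations set to 0 and the error ratios set to 1 − reliability, Eq. 4 collapses to Spearman's Eq. 2.

```python
import math

def charles_true_correlation(r_oxoy, r_xx, r_yy, e_xx, e_yy,
                             r_exey=0.0, r_txey=0.0, r_tyex=0.0):
    """Charles's (2005) correction (Eq. 4): estimate the true score
    correlation from the observed score correlation, the reliabilities,
    the error/observed variance ratios, and the sample correlations
    among true and error scores."""
    return (r_oxoy / math.sqrt(r_xx * r_yy)
            - r_exey * math.sqrt(e_xx * e_yy) / math.sqrt(r_xx * r_yy)
            - r_txey * math.sqrt(e_yy) / math.sqrt(r_yy)
            - r_tyex * math.sqrt(e_xx) / math.sqrt(r_xx))

# Nuisance correlations at 0: Eq. 4 reduces to Spearman's Eq. 2.
print(round(charles_true_correlation(0.40, 0.80, 0.80, 0.20, 0.20), 4))  # 0.5

# Nonzero nuisance correlations (values from the worked example below):
print(round(charles_true_correlation(0.58, 0.80, 0.80, 0.20, 0.20,
                                     r_exey=0.50, r_txey=0.10,
                                     r_tyex=0.10), 4))  # 0.5
```

Note that the second call recovers the same true score correlation (0.50) from a larger observed correlation (0.58), because the positive nuisance correlations are explicitly subtracted out rather than absorbed into the denominator as in Spearman's formula.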
4 accurately produces true score correlations for sample data and, unlike Spearman's (1904) formula, always yields correlation coefficients between −1.00 and 1.00. From Eq. 4, one sees that Spearman's formula results in overcorrected correlations when $r_{E_X E_Y}$, $r_{T_X E_Y}$, and $r_{T_Y E_X}$ are greater than 0, and undercorrected correlations when they are less than 0.

By taking Eq. 4 and solving for $r_{O_X O_Y}$, one also sees that the effect of unreliable data is more complicated than what is represented in Eq. 3:

$$
r_{O_X O_Y} = r_{T_X T_Y}\sqrt{r_{XX}\,r_{YY}}
+ r_{E_X E_Y}\sqrt{e_{XX}\,e_{YY}}
+ r_{T_X E_Y}\sqrt{e_{YY}\,r_{XX}}
+ r_{T_Y E_X}\sqrt{e_{XX}\,r_{YY}}
\tag{6}
$$

Equation 6 demonstrates why observed score correlations can be greater than the square root of the product of reliabilities, and shows that the full effect of unreliable data on observed score correlation extends beyond the true score correlation to include the correlation between error scores and the correlations between true scores and error scores.

To illustrate the effect of unreliable data on observed score correlation, consider the case where $r_{T_X T_Y} = 0.50$, $r_{XX} = r_{YY} = 0.80$, $r_{E_X E_Y} = 0.50$, and $r_{T_X E_Y} = r_{T_Y E_X} = 0.10$. For the sake of parsimony, we assume $r_{T_X E_X} = r_{T_Y E_Y} = 0$ and therefore that $e_{XX} = 1 - r_{XX}$ and $e_{YY} = 1 - r_{YY}$. Based on Eq. 3, one would expect the observed score correlation to be 0.40 ($0.50\sqrt{0.80 \times 0.80}$).
However, as can be seen via the boxed points in Figure 3, the effect of correlated error, the correlation between $T_X$ and $E_Y$, and the correlation between $T_Y$ and $E_X$ respectively increase the expected observed score correlation by 0.10, 0.04, and 0.04, resulting in an observed score correlation of 0.58. This value is greater than the true score correlation of 0.50, yet closer to that true score correlation than the Spearman (1904) correction resulting from Eq. 2, which equals 0.725 (i.e., $0.58/\sqrt{0.80 \times 0.80}$). This example shows that the attenuating effect of unreliable data (first term in Eq. 6) is mitigated by the effect of correlated error (second term in Eq. 6) and the effects of correlations between true and error scores (third and fourth terms in Eq. 6), assuming that the correlations are in the positive direction. Correlations in the negative direction serve to further attenuate the true score correlation beyond the first term in Eq. 6.
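The arithmetic in this example can be verified with a short Python sketch of Eq. 6 (variable names are ours; setting $e = 1 -$ reliability is valid here only because the within-measure true/error correlations are assumed to be 0):

```python
import math

# Quantities from the worked example in the text.
r_txty, r_xx, r_yy = 0.50, 0.80, 0.80   # true score correlation, reliabilities
r_exey, r_txey, r_tyex = 0.50, 0.10, 0.10  # nuisance correlations
e_xx, e_yy = 1 - r_xx, 1 - r_yy         # valid because r_TxEx = r_TyEy = 0

# Eq. 6: assemble the observed score correlation term by term.
r_oxoy = (r_txty * math.sqrt(r_xx * r_yy)     # attenuated true score term: 0.40
          + r_exey * math.sqrt(e_xx * e_yy)   # correlated error term:      0.10
          + r_txey * math.sqrt(e_yy * r_xx)   # T_X-with-E_Y term:          0.04
          + r_tyex * math.sqrt(e_xx * r_yy))  # T_Y-with-E_X term:          0.04
print(round(r_oxoy, 2))                            # 0.58

# Spearman's Eq. 2 then overcorrects the observed correlation:
print(round(r_oxoy / math.sqrt(r_xx * r_yy), 3))   # 0.725
```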
This example further shows that observed score correlations are not always attenuated by measurement error and that, in some cases, an observed score correlation may provide an estimate that is closer to the true score correlation than a correlation that has been corrected by Spearman's formula.

FIGURE 3 | Effect of unreliable data on observed score correlation as a function of reliability, correlated error ($r_{E_X E_Y}$), and correlations between true and error scores ($r_{T_X E_Y}$, $r_{T_Y E_X}$). Note: $r_{T_X E_X} = r_{T_Y E_Y} = 0$; $e_{XX} = e_{YY} = 1 -$ reliability. The boxed point on the left-hand panel indicates the effect of correlated error on an observed score correlation when $r_{T_X T_Y} = 0.50$, $r_{XX} = r_{YY} = 0.80$, and $r_{E_X E_Y} = 0.50$. The boxed point on the right-hand panel indicates the effect of a correlation between true and error scores (e.g., $r_{T_Y E_X}$) on an observed score correlation when $r_{T_X T_Y} = 0.50$, $r_{XX} = r_{YY} = 0.80$, and $r_{T_X E_Y} = r_{T_Y E_X} = 0.10$.

Table 1 | SAT data: observed, true, and error scores.

| State | Writing: Observed | Writing: True | Writing: Error | Reading: Observed | Reading: True | Reading: Error |
|---|---|---|---|---|---|---|
| Connecticut | 513.0 | 512.0 | 1.0 | 509.0 | 509.8 | −0.8 |
| Delaware | 476.0 | 485.0 | −9.0 | 489.0 | 495.8 | −6.8 |
| Georgia | 473.0 | 481.2 | −8.2 | 485.0 | 491.4 | −6.4 |
| Maryland | 491.0 | 496.4 | −5.4 | 499.0 | 500.6 | −1.6 |
| Massachusetts | 509.0 | 510.6 | −1.6 | 513.0 | 513.2 | −0.2 |
| New Hampshire | 511.0 | 510.4 | 0.6 | 523.0 | 521.0 | 2.0 |
| New Jersey | 497.0 | 495.8 | 1.2 | 495.0 | 495.4 | −0.4 |
| New York | 476.0 | 480.4 | −4.4 | 485.0 | 488.2 | −3.2 |
| North Carolina | 474.0 | 481.2 | −7.2 | 493.0 | 495.6 | −2.6 |
| Pennsylvania | 479.0 | 482.2 | −3.2 | 493.0 | 493.0 | 0.0 |
| Rhode Island | 489.0 | 491.4 | −2.4 | 495.0 | 495.6 | −0.6 |
| South Carolina | 464.0 | 473.8 | −9.8 | 482.0 | 486.6 | −4.6 |
| Virginia | 495.0 | 498.4 | −3.4 | 512.0 | 511.4 | 0.6 |
| M | 488.2 | 492.2 | −4.0 | 497.9 | 499.9 | −1.9 |
| SD | 16.1 | 12.9 | 3.8 | 12.6 | 10.6 | 2.7 |

As illustrated in Figure 3, the effects of correlated error and of correlations between true and error scores tend to increase as reliability decreases and as the magnitudes of $r_{T_X E_Y}$, $r_{T_Y E_X}$, and $r_{E_X E_Y}$ increase. The question, of course, is how big these so-called "nuisance correlations" are in real sample data. One can expect that, on average, repeated samples of scores would yield correlations of 0 for $r_{T_X E_Y}$ and $r_{T_Y E_X}$, as these correlations are assumed to be 0 in the population (Zimmerman, 2007). However, correlations between error scores are not necessarily 0 in the population. Correlation between error scores can arise, for example, whenever tests are administered on the same occasion, consider the same construct, or are based on the same set of items (Zimmerman and Williams, 1977). In such cases, one can expect that, on average, repeated samples of error scores would approximate the level of correlated error in the population (Zimmerman, 2007). One can also expect that the variability of these correlations would increase as the sample size (n) decreases. Indeed, Zimmerman (2007) found that the distributions for $r_{T_X E_Y}$, $r_{T_Y E_X}$, and $r_{E_X E_Y}$ yielded SDs of approximately $1/\sqrt{n-1}$.
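Zimmerman's (2007) sampling result can be illustrated with a small simulation. The settings below (independent standard normal true and error scores, n = 50, 2000 replications) are our own illustrative assumptions, not the article's actual design; under them, a nuisance correlation such as $r_{T_X E_Y}$ should scatter around 0 with SD near $1/\sqrt{49} \approx 0.143$.

```python
import math
import random
import statistics

def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(1)
n, reps = 50, 2000
r_txey = []
for _ in range(reps):
    t_x = [random.gauss(0, 1) for _ in range(n)]  # true scores on X
    e_y = [random.gauss(0, 1) for _ in range(n)]  # error scores on Y (independent)
    r_txey.append(pearson(t_x, e_y))

# SD of the simulated nuisance correlations: close to 1/sqrt(n - 1) ≈ 0.143
print(round(statistics.stdev(r_txey), 3))
```

Even though these correlations average 0, any single sample can easily produce a nuisance correlation of ±0.14 or more at this sample size, which is why measurement error and sampling error are hard to disentangle in practice.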
The fact that there is sampling variance in these values makes "dealing with measurement error and sampling error pragmatically inseparable" (Charles, 2005, p. 226).

To empirically illustrate the full effect of unreliable data on observed score correlation, we build on the work of Wetcher-Hendricks (2006) and apply Eq. 6 to education and psychology examples. For education data, we analyzed SAT writing and reading scores (Benefield, 2011; College Board, 2011; Public Agenda, 2011). For psychology data, we analyzed scores from the Beck Depression Inventory-II (BDI-II; Beck, 1996) and the Beck Anxiety Inventory (BAI; Beck, 1990). We close this section by contrasting CTT assumptions relating to reliability in population versus sample data and summarizing how differences in those assumptions impact the full effect of reliability on observed score correlations.

EDUCATION EXAMPLE

We applied Eq. 6 to average SAT scores from 13 states associated with the original USA colonies (see Table 1). We selected these states as they were a cohesive group and were among the 17 states with the highest participation rates (Benefield, 2011; College

Frontiers in Psychology | Quantitative Psychology and Measurement, April 2012 | Volume 3 | Article 102