CHAPTER 12: Data Analysis and Interpretation: Part II. Tests of Statistical Significance and the Analysis Story

BOX 12.3
WHAT WE SHOULD NOT SAY WHEN A RESULT IS STATISTICALLY SIGNIFICANT (p < .05)

• We cannot specify the exact probability for the real difference between the means. For example, it is wrong to say that the probability is .95 that the observed difference between the means reflects a real (true) mean difference in the populations.

The outcome of NHST reveals the probability of a difference this great by chance (given these data) assuming the null hypothesis is true. It does not tell us about probabilities in the real world (e.g., Mulaik et al., 1997). If results occur with a probability less than our chosen alpha level (e.g., .05), then all we can conclude is that the outcome is not likely to be a chance event in this situation.

• Statistically significant results do not demonstrate that the research hypothesis is correct. (For example, the data from the vocabulary study do not prove that older adults have greater vocabulary knowledge than do younger adults.)

NHST (as well as confidence intervals) cannot prove that a research hypothesis is correct. A statistically significant result is (reasonably) sometimes said to “provide support for” or to “give evidence for” a hypothesis, but it alone cannot prove that the research hypothesis is correct. There are a couple of important reasons why. First, NHST is a game of probabilities; it provides answers in the form of likelihoods that are never 1.00 (e.g., p greater or less than .05). There is always the possibility of error. If there is “proof,” it is only “circumstantial” proof. As we have seen, the research hypothesis can only be tested indirectly by referring to the probability of these data assuming the null hypothesis is true. If the probability that our results occurred by chance is very low (assuming a true null hypothesis), we may reason that the null hypothesis is really not true; this does not, however, mean our research hypothesis is true. As Schmidt and Hunter (1997, p. 59) remind us, researchers doing NHST “are focusing not on the actual scientific hypothesis of interest.” Second, evidence for the effect of an independent variable is only as good as the methodology that produced the effect. The data used in NHST may or may not be from a study that is free of confounds or experimenter errors. It is possible that another factor was responsible for the observed effect. (For example, suppose that the older adults in the vocabulary study, but not the college students, had been recruited from a group of expert crossword puzzle players.) As we have mentioned, a large effect size can easily be produced by a bad experiment. Evidence for a research hypothesis must be sought by examining the methodology of a study as well as considering the effect produced on the dependent variable. Neither NHST, confidence intervals, nor effect sizes tell us about the soundness of a study’s methodology.

…significant. We are not likely to carry out such an empirical test, however, if the effect size is small (although see Rosenthal, 1990, for important exceptions).

RECOMMENDATIONS FOR COMPARING TWO MEANS

We offer the following recommendations when evaluating the data from a study looking at the difference between two means. First, keep in mind the final goal of data analysis: to make a case based on our observations for a claim about behavior. In order to make the best case possible, you will want to explore various alternatives for data analysis (one such combination of analyses is sketched below).
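To make the idea of exploring alternatives concrete, the brief sketch below shows how a two-group comparison such as the vocabulary study might be examined from several angles at once: an independent-groups t test (NHST), Cohen's d as an effect size, and a 95% confidence interval for the mean difference. The sketch is not part of the textbook example; the scores are invented, and the use of Python with the scipy library is simply one convenient way to carry out the calculations.

    # A minimal sketch (invented data) of three complementary analyses for a
    # two-group comparison: NHST, an effect size, and a confidence interval.
    import numpy as np
    from scipy import stats

    # Hypothetical vocabulary scores for two independent groups
    older_adults = np.array([44, 51, 47, 55, 49, 52, 46, 53, 50, 48])
    younger_adults = np.array([41, 45, 43, 47, 40, 44, 46, 42, 45, 43])

    # NHST: independent-groups t test (assumes equal population variances)
    t_stat, p_value = stats.ttest_ind(older_adults, younger_adults)

    # Effect size: Cohen's d based on the pooled standard deviation
    n1, n2 = len(older_adults), len(younger_adults)
    pooled_sd = np.sqrt(((n1 - 1) * older_adults.var(ddof=1) +
                         (n2 - 1) * younger_adults.var(ddof=1)) / (n1 + n2 - 2))
    cohens_d = (older_adults.mean() - younger_adults.mean()) / pooled_sd

    # 95% confidence interval for the difference between the population means
    mean_diff = older_adults.mean() - younger_adults.mean()
    se_diff = pooled_sd * np.sqrt(1 / n1 + 1 / n2)
    df = n1 + n2 - 2
    t_crit = stats.t.ppf(0.975, df)
    ci_low, ci_high = mean_diff - t_crit * se_diff, mean_diff + t_crit * se_diff

    print(f"t({df}) = {t_stat:.2f}, p = {p_value:.4f}")
    print(f"Cohen's d = {cohens_d:.2f}")
    print(f"95% CI for the mean difference: [{ci_low:.2f}, {ci_high:.2f}]")

Whatever these numbers turn out to be, they say nothing about the soundness of the study's methodology; as Box 12.3 emphasizes, that judgment must come from examining how the data were collected.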
Don’t fall into the trap of thinking that there is one and only one way to provide evidence for a claim about behavior. When there is a choice (and there almost always is), as recommended by the APA’s …
