
Journal of Counseling Psychology, 2004, Vol. 51, No. 1, 115–134
Copyright 2004 by the American Psychological Association, Inc. 0022-0167/04/$12.00 DOI: 10.1037/0022-0167.51.1.115

Testing Moderator and Mediator Effects in Counseling Psychology Research

Patricia A. Frazier, University of Minnesota
Andrew P. Tix, Augsburg College
Kenneth E. Barron, James Madison University

The goals of this article are to (a) describe differences between moderator and mediator effects; (b) provide nontechnical descriptions of how to examine each type of effect, including study design, analysis, and interpretation of results; (c) demonstrate how to analyze each type of effect; and (d) provide suggestions for further reading. The authors focus on the use of multiple regression because it is an accessible data-analytic technique contained in major statistical packages. When appropriate, they also note limitations of using regression to detect moderator and mediator effects and describe alternative procedures, particularly structural equation modeling.
Finally, to illustrate areas of confusion in counseling psychology research, they review research testing moderation and mediation that was published in the Journal of Counseling Psychology during 2001.

If you ask students or colleagues to describe the differences between moderator and mediator effects in counseling psychology research, their eyes are likely to glaze over. Confusion over the meaning of, and differences between, these terms is evident in counseling psychology research as well as research in other areas of psychology (Baron & Kenny, 1986; Holmbeck, 1997; James & Brett, 1984). This is unfortunate, because both types of effects hold much potential for furthering our understanding of a variety of psychological phenomena of interest to counseling psychologists. Given this, our goals here are to (a) describe differences between moderator and mediator effects; (b) provide nontechnical, step-by-step descriptions of how to examine each type of effect, including issues related to study design, analysis, and interpretation of results; (c) demonstrate how to analyze each type of effect through the use of detailed examples; and (d) provide suggestions and references for further reading.
We focus on the use of multiple regression because it is an accessible data-analytic technique contained in major statistical packages that can be used to examine both moderator and mediator effects (Aiken & West, 1991; Baron & Kenny, 1986; Cohen, Cohen, West, & Aiken, 2003; Jaccard, Turrisi, & Wan, 1990). When appropriate, however, we also note limitations of using multiple regression to detect moderator and mediator effects and describe alternative procedures, particularly structural equation modeling (SEM). In addition, to illustrate areas of confusion in counseling psychology research, we review research testing moderation and mediation that was published in the Journal of Counseling Psychology (JCP) during 2001. Finally, we want to stress that, although our goal was to summarize information on current best practices in analyzing moderator and mediator effects, we strongly encourage readers to consult the primary sources we reference (and new sources as they emerge) to gain a better understanding of the issues involved in conducting such tests.

Patricia A. Frazier, Department of Psychology, University of Minnesota; Andrew P. Tix, Department of Psychology, Augsburg College; Kenneth E. Barron, Department of Psychology, James Madison University.
We thank Michele Kielty Briggs, Bryan Dik, Richard Lee, Heather Mortensen, Jason Steward, Ty Tashiro, and Missy West for their comments on an earlier version of this article and David Herring for his assistance with the simulated data.
Correspondence concerning this article should be addressed to Patricia A. Frazier, Department of Psychology, University of Minnesota, N218 Elliott Hall, 75 East River Road, Minneapolis, MN 55455. E-mail: pfraz@umn.edu

DIFFERENCES BETWEEN MODERATOR AND MEDIATOR EFFECTS

Consider, for a moment, your primary area of research interest. More than likely, whatever domain you identify includes research questions of the form “Does variable X predict or cause variable Y?”¹ Clearly, questions of this form are foundational to counseling psychology. Examples include correlational questions such as “What client factors are related to counseling outcomes?” as well as causal questions such as “Does a certain counseling intervention (e.g., cognitive therapy) increase well-being?” (see Figure 1A for a diagram). However, to advance counseling theory, research, and practice, it is important to move beyond these basic questions. One way to do this is by examining moderators and mediators of these effects.

¹ For the sake of simplicity, we generally use the term predictor variable to refer to both a predictor variable in correlational research and an independent variable in experimental research. Likewise, we generally use the term outcome variable to refer to both an outcome variable in correlational research and a dependent variable in experimental research.

Figure 1. Diagrams of direct, moderator, and mediator effects.

Questions involving moderators address “when” or “for whom” a variable most strongly predicts or causes an outcome variable. More specifically, a moderator is a variable that alters the direction or strength of the relation between a predictor and an outcome (Baron & Kenny, 1986; Holmbeck, 1997; James & Brett, 1984). Thus, a moderator effect is nothing more than an interaction whereby the effect of one variable depends on the level of another. For example, counseling researchers have long been admonished to investigate not only the general effectiveness of interventions but which interventions work best for which people (see Norcross, 2001, for a recent review of interaction effects in treatment outcome studies). For example, in Figure 1B, gender (variable Z) is introduced as a moderator of the relation between counseling condition and well-being. If gender is a significant moderator in this case, the counseling intervention increases well-being more for one gender than for the other (e.g., it increases well-being more for women than for men). Such interaction effects (i.e., moderators) are important to study because they are common in psychological research, perhaps even the rule rather than the exception (Jaccard et al., 1990).
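To make the idea of a moderator as an interaction concrete, the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not an analysis from the article: the data are invented and noise-free, the helper names `solve` and `ols` are hypothetical, and the coding (0 = control or men, 1 = treatment or women) is chosen only to mirror the Figure 1B example. The outcome is regressed on the predictor, the moderator, and their product; a nonzero product-term coefficient means the effect of counseling condition differs by gender.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical, noise-free cell means (invented for illustration only):
# x = counseling condition (0 = control, 1 = treatment)
# z = gender (0 = men, 1 = women)
# y = well-being
cells = [(0, 0, 10.0), (1, 0, 12.0), (0, 1, 11.0), (1, 1, 16.0)]
data = cells * 5  # five identical participants per cell

# Design matrix columns: intercept, predictor, moderator, predictor x moderator.
design = [[1.0, float(x), float(z), float(x * z)] for x, z, _ in data]
outcome = [y for _, _, y in data]

b0, b1, b2, b3 = ols(design, outcome)

# With 0/1 coding, b1 is the treatment effect for men (z = 0) and
# b1 + b3 is the treatment effect for women (z = 1).
effect_men = b1
effect_women = b1 + b3
print(f"interaction coefficient: {b3:.2f}")
print(f"treatment effect for men: {effect_men:.2f}, for women: {effect_women:.2f}")
```

In this invented data set the treatment raises well-being by 2 points for men and 5 points for women, so the product term carries the 3-point difference. With real data the coefficients would be estimated with error and the product term would need a significance test.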
If moderators are ignored in treatment studies, participants may be given a treatment that is inappropriate or perhaps even harmful for them (Kraemer, Stice, Kazdin, Offord, & Kupfer, 2001).

Interaction effects are not only important for intervention studies, however. There are many other instances in which researchers are interested in whether relations between predictor and outcome variables are stronger for some people than for others. The identification of important moderators of relations between predictors and outcomes indicates the maturity and sophistication of a field of inquiry (Aguinis, Boik, & Pierce, 2001; Judd, McClelland, & Culhane, 1995) and is at the heart of theory in social science (Cohen et al., 2003). A recent example in JCP illustrates the ways in which examining moderator effects can increase our understanding of the relations between important predictors and outcomes. Specifically, Corning (2002) found that perceived discrimination was positively related to psychological distress only among individuals with low self-esteem (and not among individuals with high self-esteem).
Thus, self-esteem “buffered” the effects of discrimination on distress.

Whereas moderators address “when” or “for whom” a predictor is more strongly related to an outcome, mediators establish “how” or “why” one variable predicts or causes an outcome variable. More specifically, a mediator is defined as a variable that explains the relation between a predictor and an outcome (Baron & Kenny, 1986; Holmbeck, 1997; James & Brett, 1984). In other words, a mediator is the mechanism through which a predictor influences an outcome variable (Baron & Kenny, 1986). In Figure 1C, social support (variable M) is introduced as a mediator of the relation between counseling condition and well-being. If social support is a significant mediator in this case, the reason the treatment group has higher well-being scores is that participants in this group report greater increases in social support than do those in the control condition. Alternatively, social support might be a significant mediator if those in the control condition reported greater decreases in social support than those in the treatment condition (i.e., the reason the treatment was effective was that it prevented decreases in social support). Within the context of evaluating counseling interventions, measuring underlying change mechanisms (i.e., mediators) as well as outcomes provides information on which mechanisms are critical for influencing outcomes (MacKinnon & Dwyer, 1993).
This information can enable us to focus on the effective components of treatments and remove the ineffective components (MacKinnon, 2000) as well as to build and test theory regarding the causal mechanisms responsible for change (Judd & Kenny, 1981).

Furthermore, when there are chains of mediators, addressing only one link in the chain may limit treatment effectiveness, whereas sequential interventions that address each link may be more successful (Kraemer et al., 2001). As was the case with moderator research, testing mediators also is important outside of evaluating interventions. It is a sign of a maturing discipline when, after direct relations have been demonstrated, we have turned to explanation and theory testing regarding those relations (Hoyle & Kenny, 1999). For example, in a recent JCP article, Lee, Draper, and Lee (2001) found that the negative relation between social connectedness and distress was mediated by dysfunctional interpersonal behaviors. In other words, individuals low in connectedness reported more distress in part because they also engaged in more dysfunctional behaviors.

A given variable may function as either a moderator or a mediator, depending on the theory being tested. For example, social support could be conceptualized as a moderator of the relation between counseling condition and well-being.
This would be the case if theory suggested that the intervention might be differentially effective for individuals high and low in social support. Social support also could be conceptualized as a mediator of the relation between counseling condition and well-being, as it is depicted in Figure 1C. In this case, theory would suggest that the reason counseling is effective is that it increases social support. Thus, the same variable could be cast as a moderator or a mediator, depending on the research question and the theory being tested. Although this can be confusing, it is helpful to keep in mind that moderators often are introduced when there are unexpectedly weak or inconsistent relations between a predictor and an outcome across studies (Baron & Kenny, 1986). Thus, one might look for moderators if the evidence for the effectiveness of a given intervention is weak, which may be because it is effective only for some people. The choice of moderators should be based on a specific theory regarding why the intervention may be more effective for some people than for others. In contrast, one typically looks for mediators if there already is a strong relation between a predictor and an outcome and one wishes to explore the mechanisms behind that relation. In the counseling example, if there is solid evidence that an intervention is effective, one might want to test a specific theory about what makes the intervention effective. In short, decisions about potential moderators and mediators should be based on previous research and theory and are best made a priori in the design stage rather than post hoc.

One also can examine mediators and moderators within the same model. Moderated mediation refers to instances in which the mediated relation varies across levels of a moderator. Mediated moderation refers to instances in which a mediator variable explains the relation between an interaction term in a moderator model and an outcome.
These more complex models have been described in more detail elsewhere (e.g., Baron & Kenny, 1986; Hoyle & Robinson, in press; James & Brett, 1984; Wegener & Fabrigar, 2000).

MODERATOR EFFECTS

Researchers can use multiple regression to examine moderator effects whether the predictor or moderator variables are categorical (e.g., sex or race) or continuous (e.g., age).² When both the predictor and moderator variables are categorical, analysis of variance (ANOVA) procedures also can be used, although multiple regression is preferred because of the flexibility in options it provides for coding categorical variables (Cohen et al., 2003). When one or both variables are measured on a continuous scale, regression procedures that retain the continuous nature of the variables clearly are preferred over using cut points (e.g., median splits) to create artificial groups to compare correlations between groups or examine interaction effects using ANOVA (Aiken & West, 1991; Cohen, 1983; Cohen et al., 2003; Jaccard et al., 1990; Judd et al., 1995; MacCallum, Zhang, Preacher, & Rucker, 2002; Maxwell & Delaney, 1993; West, Aiken, & Krull, 1996).
This is because the use of cut points to create artificial groups from variables actually measured on a continuous scale results in a loss of information and a reduction in power to detect interaction effects. However, artificially dichotomizing two continuous variables (e.g., a predictor and a moderator) also can have the opposite effect and can lead to spurious main and interaction effects (MacCallum et al., 2002). Simulation studies have shown that hierarchical multiple regression procedures that retain the true nature of continuous variables result in fewer Type I and Type II errors for detecting moderator effects relative to procedures that involve the use of cut points (Bissonnette, Ickes, Bernstein, & Knowles, 1990; Mason, Tu, & Cauce, 1996; Stone-Romero & Anderson, 1994). Statisticians also generally have encouraged the use of hierarchical regression techniques over the practice of comparing correlations between groups when the group variable is naturally categorical (e.g., sex or race), because different correlations between groups may reflect differential variances between groups rather than true moderator effects (Baron & Kenny, 1986; Chaplin, 1991; Judd et al., 1995). Unfortunately, in contrast to these recommendations, MacCallum et al. (2002) concluded that JCP was one of three leading journals in psychology in which the dichotomization of continuous variables was a relatively common practice.
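The loss of information from cut points can be illustrated with a small simulation. The sketch below is not drawn from the studies cited above; it is a simplified illustration with a single predictor and outcome (no interaction term), a hypothetical population correlation of .50, an arbitrary sample size, and a fixed random seed. For a normally distributed predictor, a median split is expected to shrink the observed correlation to roughly 80% of its continuous value, which is one way to see why dichotomizing reduces power.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(2004)  # fixed seed so the sketch is reproducible
n = 20000                  # arbitrary, large enough to give stable estimates
true_r = 0.5               # hypothetical population correlation

# Continuous predictor and an outcome correlated with it at about true_r.
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [true_r * xi + math.sqrt(1.0 - true_r ** 2) * rng.gauss(0.0, 1.0) for xi in x]

# Median split: replace the continuous predictor with a high/low indicator.
median_x = sorted(x)[n // 2]
x_split = [1.0 if xi > median_x else 0.0 for xi in x]

r_continuous = pearson(x, y)
r_split = pearson(x_split, y)
ratio = r_split / r_continuous

print(f"correlation with continuous predictor: {r_continuous:.3f}")
print(f"correlation after median split:        {r_split:.3f}")
print(f"ratio (split / continuous):            {ratio:.3f}")
```

The same attenuation applies to each dichotomized variable that enters a product term, so the cost for tests of interactions can be larger than this single-variable case suggests.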
Our review of research published in JCP in 2001 also suggested that the majority of researchers who tested interactions involving continuous variables dichotomized those variables and used ANOVA rather than regression.

Guide to Testing Moderator Effects in Multiple Regression

In this section, we first present a guide for using hierarchical multiple regression to examine moderator effects, including issues related to designing the study, analyzing the data, and interpreting the results. This is followed by a discussion of additional issues to consider when examining moderator effects using regression techniques. We then provide an example that illustrates the steps involved in performing a moderator analysis. Although we focus on testing moderator effects, many of the issues we raise apply to regression analyses more generally.

To identify aspects of testing moderation about which there may be confusion in counseling psychology research, we also performed a manual review of all articles published in the 2001 issues of JCP. In total, 54 articles appeared in 2001, including regular articles, comments, replies, and brief reports. Of these 54, 12 (22%) contained a test of an interaction (although the study was not always framed as a test of a moderation hypothesis).
Only 4 of the 12 used multiple regression with an interaction term to test moderation, which is the procedure we describe subsequently. Although this sample of articles is small, our reading of these articles suggested points of confusion, as noted in the discussion to follow. The results of our review are reported on a general level to avoid singling out particular studies or authors.

Designing a Study to Test Moderation

Importance of Theory

All of the study design decisions outlined next should be made on the basis of a well-defined theory, which unfortunately is not often the case (Chaplin, 1991). For example, both the choice of a moderator and the hypothesized nature of the interaction should be based on theory (Jaccard et al., 1990). Cohen et al. (2003, pp. 285–286) described three patterns of interactions among two continuous variables: enhancing interactions (in which both the predictor and moderator affect the outcome variable in the same direction and together have a stronger than additive effect), buffering interactions (in which the moderator variable weakens the effect of the predictor variable on the outcome), and antagonistic interactions (in which the predictor and moderator have the same effect on the outcome but the interaction is in the opposite direction). Similarly, in the case of one categorical and one continuous variable, the theory on which the hypotheses are based may specify that a predictor is positively related to an outcome for one group and unrelated for another group. Alternatively, theory may specify that a predictor is positively related to outcomes for one group and negatively related to outcomes for another. Finally, the interaction may be nonlinear and thus not captured by a simple product term.³ In the JCP articles we reviewed, the specific nature of the interaction rarely was specified a priori.

² Baron and Kenny (1986) distinguished situations in which the predictor is continuous and the moderator is categorical from situations in which the predictor is categorical and the moderator is continuous. However, the analyses are the same if it is assumed, as typically is the case, that the effect of the predictor on the outcome variable changes linearly with respect to the moderator.

Power of Tests of Interactions

Although hierarchical multiple regression appears to be the preferred statistical method for examining moderator effects when either the predictor or the moderator variable (or both) is measured on a continuous scale (Aguinis, 1995), concerns often have been raised in the statistical literature about the low power of this method to detect true interaction effects. Aguinis et al. (2001) showed that the power to detect interaction effects in a typical study is .20 to .34, much lower than the recommended level of .80. Low power is a particular problem in nonexperimental studies, which have much less power for detecting interaction effects than do experiments (McClelland & Judd, 1993). All but one of the studies that we reviewed in JCP was nonexperimental, and none of the authors reported the power of the test of the interaction.

Several factors have been identified that reduce the power of tests of interactions. These factors, outlined next, should be taken into consideration when designing a study to test moderator effects to increase the chances of finding significant interactions when they exist. Otherwise, it is unclear whether the interaction is not significant because the theory was wrong or the test of the interaction lacked sufficient power. As discussed by Aguinis (1995), the importance of fully considering issues related to research design and measurement before data are collected cannot be overstated.

Effect size for interaction and overall effect size. To ensure adequate sample sizes to maximize the chances of detecting significant interaction effects, the size of the interaction effect should be estimated before data collection.
To be specific, the effect size for the interaction in a regression analysis is the amount of incremental variance explained by the interaction term after the first-order effects have been controlled (i.e., the R² change associated with the step in which the interaction term is added). Thus, the pertinent research should be reviewed so that the expected effect size can be estimated on the basis of what is typically found in the literature. Generally, effect sizes for interactions are small (Chaplin, 1991), as was the case for the studies we reviewed in JCP. According to Cohen’s (1992) conventions, a small effect size in multiple regression corresponds to an R² value of .02. The sample size needed to detect a moderator effect depends on the size of the effect; if the interaction effect is small, a relatively large sample is needed for the effect to be significant. Methods for calculating needed sample sizes are discussed in a later section, because the sample size needed for adequate power depends on several factors other than the effect size of the interaction.

In addition to the size of the interaction effect, the total effect size (i.e., the amount of variance explained by the predictor, moderator, and interaction) should be estimated before data collection. Again, this is done by reviewing the pertinent literature to determine how much variance typically is accounted for by the variables included in one’s model.
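The two quantities just described, the R² change for the interaction step and the total variance explained, can be combined into Cohen's f² index, which is commonly used as the input to power calculations for hierarchical regression. The sketch below is an illustration only: the function name `interaction_effect_size` and the example R² values (.20 for the first-order model, .22 for the full model) are hypothetical and are not taken from the article.

```python
def interaction_effect_size(r2_first_order, r2_full):
    """Return (R-squared change, Cohen's f-squared) for an interaction step.

    r2_first_order: R-squared of the model with predictor and moderator only.
    r2_full:        R-squared after the product term has been added.
    """
    if not 0.0 <= r2_first_order <= r2_full < 1.0:
        raise ValueError("expected 0 <= r2_first_order <= r2_full < 1")
    delta_r2 = r2_full - r2_first_order
    # f-squared scales the increment by the variance left unexplained
    # by the full model, so it grows as the total effect size grows.
    f_squared = delta_r2 / (1.0 - r2_full)
    return delta_r2, f_squared


# Hypothetical values chosen only to illustrate the calculation.
delta_r2, f_squared = interaction_effect_size(0.20, 0.22)
print(f"R-squared change: {delta_r2:.3f}, f-squared: {f_squared:.4f}")
```

Because the denominator is the unexplained variance of the full model, the same R² change yields a larger f² when the overall model explains more variance, which is consistent with estimating the total effect size as well as the interaction effect size.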
Neither the interaction effect size nor the total effect size was estimated a priori in any of the studies we reviewed in JCP. Moderator effects are best detected (i.e., tests have more power) when the relation between the predictor and outcome is substantial (Chaplin, 1991; Jaccard et al., 1990). However, moderators often are examined when there are unexpectedly weak relations between a predictor and outcome (Baron & Kenny, 1986; Chaplin, 1991), which further contributes to the low power of many tests of interactions. One suggested way to increase power is to increase the multiple correlation between the full model and the outcome variable by including additional significant predictors of the outcome variable in the model as covariates (Jaccard & Wan, 1995).

Choosing variables. Keeping in mind that decisions regarding tests of interactions should be based on theory, there are several factors to consider with regard to choosing predictor, moderator, and outcome variables, each of which can increase or decrease the power of interaction tests. Somewhat different issues arise with regard to categorical variables, continuous variables, and outcome variables. Issues associated with each type of variable are discussed in turn.

There are two issues to consider with regard to categorical variables.
The first is that unequal sample sizes across groups decrease power (Aguinis, 1995; Aguinis & Stone-Romero, 1997; Alexander & DeShon, 1994; Stone-Romero, Alliger, & Aguinis, 1994). For example, with two groups, power decreases as the sample size proportions vary from .50/.50, regardless of the total sample size. With a sample size of 180, the power to detect a difference of .40 in a correlation between two groups (e.g., a correlation between a predictor and outcome of .2 for women and .6 for men) is more than .80 if the two groups are equal in size. However, if the sample size proportion is .10/.90 (e.g., 10% men and 90% women), power is about .40 (Stone-Romero et al., 1994). If the categorical variable is an experimental condition (e.g., type of counseling intervention), this can be addressed by assigning equal numbers of individuals to each group. However, when the categorical variable is not manipulated (e.g., gender or race), unequal groups are likely, and the effects on power need to be evaluated. Indeed, in some of the JCP studies reviewed, proportions were as skewed as .07/.93, although this inequality was never mentioned as a potential problem with regard to power.

A second issue to consider is that even if sample sizes are equal across groups, error variances across groups may be unequal (DeShon & Alexander, 1996; Overton, 2001).
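Returning briefly to the first issue, the effect of unequal group sizes in the example above (total N = 180, correlations of .2 and .6) can be approximated with a short calculation. The sketch below is an assumption-laden illustration: it uses the Fisher r-to-z normal approximation for comparing two independent correlations with a two-sided alpha of .05, which is not the procedure used by Stone-Romero et al. (1994), and the function names are hypothetical. It is meant only to show the direction and rough size of the power loss.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def power_correlation_difference(r1, n1, r2, n2, z_crit=1.959964):
    """Approximate two-sided power for testing rho1 == rho2 via Fisher's z.

    z_crit defaults to the critical value for a two-sided alpha of .05.
    """
    diff = abs(math.atanh(r1) - math.atanh(r2))
    # Standard error of the difference between two Fisher-transformed r's.
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    shift = diff / se
    return normal_cdf(shift - z_crit) + normal_cdf(-shift - z_crit)


total_n = 180
# Equal groups (.50/.50): 90 women with r = .2 and 90 men with r = .6.
power_equal = power_correlation_difference(0.2, 90, 0.6, 90)
# Unequal groups (.10/.90): 162 women with r = .2 and 18 men with r = .6.
power_unequal = power_correlation_difference(0.2, 162, 0.6, 18)

print(f"approximate power, equal groups (90/90):    {power_equal:.2f}")
print(f"approximate power, unequal groups (162/18): {power_unequal:.2f}")
```

Under this approximation the equal-groups design clears the conventional .80 level while the .10/.90 split falls to roughly half that, the same pattern as the figures quoted above, even though the exact values depend on the method used.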
In fact, one review revealed that the assumption of homogeneous error variance is violated about half of the time (Aguinis, Petersen, & Pierce, 1999). If sample sizes and error variances are unequal, power can be either overestimated or underestimated, depending on whether the larger or smaller sample has the larger error variance (for more details, see Aguinis & Pierce, 1998; Grissom, 2000; Overton, 2001). In these cases, the results of multiple regression analyses cannot be trusted, and alternative tests should be used. Aguinis et al. (1999) developed a program, available on the World Wide Web, that both tests the assumption of homogeneous error variance and calculates alternative tests.⁴ These alternative tests make a practical difference when the error variance of one group is 1.5 times larger than that of the other group (DeShon & Alexander, 1996; Overton, 2001). Only one study in our JCP review reported whether the assumption of homogeneity of error variance had been met.

³ In this article, we focus on linear interactions because they are the most common form of interaction tested (for information on nonlinear interactions, see Aiken & West, 1991, chap. 5; Cohen et al., 2003, chaps. 7 and 9; Jaccard et al., 1990, chap. 4; Lubinski & Humphreys, 1990; MacCallum & Mar, 1995).

There also are two issues to consider when choosing continuous variables. One is the reliability of the measures. Measurement error in individual variables (either predictors or moderators) dramatically reduces the reliability of the interaction term constructed from them (Aguinis, 1995; Aguinis et al., 2001; Aiken & West, 1991; Busemeyer & Jones, 1983; Jaccard et al., 1990). Lower reliability of the interaction term increases its standard error and reduces the power of the test. For example, Aiken and West showed that the power of the test of the interaction is reduced by up to half with reliabilities of .80 rather than 1.00. The second issue concerns restriction in range, which also reduces power (Aguinis, 1995; Aguinis & Stone-Romero, 1997; McClelland & Judd, 1993).
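The reliability issue described above can be made concrete with a small calculation. The sketch below uses a standard result for the reliability of a product of two centered variables, which assumes bivariate normality and uncorrelated measurement errors; the function name is hypothetical, and the formula is offered as an illustration rather than as the computation used in any of the sources cited.

```python
def product_term_reliability(rel_x, rel_z, r_xz=0.0):
    """Reliability of the product XZ of two centered variables.

    Assumes centered, bivariate normal X and Z with uncorrelated
    measurement errors (hypothetical helper for illustration).
    """
    return (rel_x * rel_z + r_xz ** 2) / (1.0 + r_xz ** 2)


# Two uncorrelated measures, each with a reliability of .80
print(round(product_term_reliability(0.80, 0.80), 2))        # 0.64
# The same two measures when they correlate .30
print(round(product_term_reliability(0.80, 0.80, 0.30), 2))  # 0.67
```

Under these assumptions, two measures that each look adequate (.80) yield an interaction term with a reliability of only about .64 when they are uncorrelated.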
Range restriction means that not all individuals in a population have an equal probability of being selected for the sample (Aguinis, 1995). A simulation study examining the effects of several variables on power showed that range restriction had a considerable effect (Aguinis & Stone-Romero, 1997). McClelland and Judd (1993) provided specific recommendations regarding oversampling techniques that can be used to address this issue (see also Cohen et al., 2003, pp. 298–299). In the 2001 JCP studies we reviewed, most measures had adequate reliability (.80 or higher), although range restriction was rarely mentioned as a possible issue and was difficult to assess because adequate information (e.g., means, standard deviations, skewness, and ranges) was not always provided.

A final consideration is choice of an outcome variable. Lower reliability of an outcome variable reduces correlations with predictors, thus lowering the overall R² value and the power of the test (Aguinis, 1995). Furthermore, if the outcome measure does not have enough response options (i.e., is too “coarse”) to reflect the interaction, there will be a loss in power (Russell & Bobko, 1992). The outcome measure has to have at least as many response options as the product of the response options of the predictor and moderator variables. For example, if both the predictor and moderator are measured with 5-point Likert scales, the true moderator effect will contain 5 × 5 conceptually distinct latent responses.
The outcome measure will thus need to have 25 response options (25-point scale) to capture the true moderator effect. Russell and Bobko also noted that summing responses to multiple Likert-type items with limited response options (e.g., 5-point scale) does not provide the same increase in power as using an outcome measure with more response options (e.g., 25-point scale) because participants are still responding to each item on the limited scale. In other words, the number of response options for the items determines coarseness rather than the number of items on the scale. Because many outcome measures do not have sufficient response options, the effects of scale coarseness on power may be difficult to avoid. Aguinis, Bommer, and Pierce (1996) developed a computer program that administers questionnaires by prompting respondents to click along a line on a computer screen, thus allowing for more response options and increasing the accuracy of tests of moderator effects. However, if researchers prefer to use measures with established reliability and validity, they need to recognize that scale coarseness may decrease power to detect the interaction. Scale coarseness was never mentioned as a factor that may affect power in the JCP studies we reviewed.

There are several resources to which researchers can turn to estimate power. Jaccard et al. (1990) and Aiken and West (1991) provided tables for estimating power for interactions.
There is also an online calculator that estimates the sample size needed to achieve a given level of power with categorical moderators that takes into account many of the factors just listed (e.g., sample size of each group, effect size of interaction, and reliability of measures).⁵ This program can be used a priori to assess the effect of various design decisions (e.g., to maximize power, is it better to sacrifice reliability or sample size? see Aguinis et al., 2001, for further details).

In summary, to maximize the power of tests of moderator effects, researchers are encouraged to rely on theory when planning moderator analyses, use an experimental design when appropriate, determine and obtain the sample size needed to achieve adequate power based on estimated effect sizes and other factors, attempt to collect equal numbers of participants for different levels of a categorical variable, test the homogeneity of error variance assumption and use appropriate tests if it is violated, choose highly reliable continuous variables, obtain measures of continuous predictor and moderator variables that are normally distributed, and use outcome measures that are both reliable and sufficiently sensitive (i.e., have enough scale points to capture the interaction). Some (Aguinis, 1995; Jaccard & Wan, 1995; Judd et al., 1995; McClelland & Judd, 1993) also have suggested raising the alpha level above the traditional .05 level to maximize power, with various caveats.

Although these practices would improve the probability that researchers would find significant moderator effects when they exist, they may not always be
possible to implement. In addition, there may be times when other statistical procedures may be more appropriate because of limitations inherent in ordinary least squares regression. Most notably, several authors (e.g., Aguinis, 1995; Aiken & West, 1991; Baron & Kenny, 1986; Busemeyer & Jones, 1983; Holmbeck, 1997; Jaccard et al., 1990) have encouraged the use of SEM as a way to control for unreliability in measurement. SEM can be used to examine interactions involving both categorical and continuous variables (for details on how to perform such analyses, see Bollen & Paxton, 1998; Holmbeck, 1997; Jaccard & Wan, 1995, 1996; Kenny & Judd, 1984; Moulder & Algina, 2002; Ping, 1996; Schumacker & Marcoulides, 1998). When one variable is categorical, a multiple-group approach can be used in which the relation between the predictor and outcome is estimated separately for the multiple groups. Specifically, an unconstrained model is compared with a constrained model (in which the paths are constrained to be equal across groups). If the unconstrained model is a better fit to the data, there is evidence of moderation (i.e., different relations between the predictor and outcome across groups). However, SEM techniques for testing interactions between continuous variables are complex, and there is little consensus regarding which of several approaches is best (Marsh, 2002).

⁴ The program can be found at http://members.aol.com/imsap/altmmr.html.

⁵ This program can be found at http://www.math.montana.edu/rjboik/power.html.

Analyzing the Data

After the study has been designed and the data collected, the data need to be analyzed. Steps involved in analyzing the data include creating or transforming predictor and moderator variables (e.g., coding categorical variables, centering or standardizing continuous variables, or both), creating product terms, and structuring the equation.

Representing Categorical Variables With Code Variables

If either the predictor or moderator variable is categorical, the first step is to represent this variable with code variables. The number of code variables needed depends on the number of levels of the categorical variable, equaling the number of levels of the variable minus one.
For example, a counseling outcome study in which participants are randomly assigned to one of three treatment conditions (e.g., cognitive–behavioral therapy, interpersonal therapy, and control group) would need two code variables to fully represent the categorical variable of treatment type in the regression equation. One of several coding systems can be chosen to represent the categorical variable based on the specific questions being examined (West et al., 1996). Specifically, dummy coding is used when comparisons with a control or base group are desired, effects coding is used when comparisons with the grand mean are desired, and contrast coding is used when comparisons between specific groups are desired.

Using the three-condition treatment study as an example, dummy coding would be used to compare the mean of each therapy group with the mean of the control group, effects coding would be used to compare each of the group’s means with the grand mean, and contrast coding would be used to compare orthogonal combinations of the categorical variable (e.g., comparisons of the mean of the two treatment groups with the mean of the control group and comparisons of the means of each treatment group with each other). We discuss this in more detail in our example, but it is critical to note here that the choice of coding system has very important implications for testing and interpreting effects in equations involving interactions.
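The three coding systems just described can be sketched for the three-condition treatment study as follows. This is a minimal illustration: the group labels, the specific numeric codes, and the helper names are assumptions, and other equally valid code values exist for each system.

```python
# Two code variables for a three-level treatment factor.
# Group labels and numeric codes are illustrative assumptions.
DUMMY = {"cbt": (1, 0), "ipt": (0, 1), "control": (0, 0)}       # each therapy vs. control
EFFECTS = {"cbt": (1, 0), "ipt": (0, 1), "control": (-1, -1)}   # each therapy vs. grand mean
CONTRAST = {"cbt": (1, 1), "ipt": (1, -1), "control": (-2, 0)}  # therapies vs. control; CBT vs. IPT


def code_rows(groups, scheme):
    """Return the pair of code values for each participant's group."""
    return [scheme[g] for g in groups]


def is_orthogonal(scheme):
    """True if each code variable sums to zero across groups and the two
    code variables have a zero cross-product (equal group sizes assumed)."""
    first, second = zip(*scheme.values())
    return (sum(first) == 0 and sum(second) == 0
            and sum(a * b for a, b in zip(first, second)) == 0)


participants = ["cbt", "control", "ipt", "cbt"]
print(code_rows(participants, DUMMY))
print(is_orthogonal(CONTRAST), is_orthogonal(EFFECTS))
```

Only the contrast codes in this sketch pass the orthogonality check, which reflects the point that contrast coding compares orthogonal combinations of the groups.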
We refer readers to West et al. (1996), in particular, for a complete discussion of the differences among coding systems and practical guidelines regarding when and how to use them (see also Aiken & West, 1991; Cohen et al., 2003; Jaccard et al., 1990). In the JCP articles we reviewed, only dummy coding was used. However, as noted by Cohen et al. (2003), “the dummy coding option that is so often considered the ‘default’ will frequently not be the optimal coding scheme” (p. 375).

Centering or Standardizing Continuous Variables

The next step in formulating the regression equation involves centering or standardizing predictor and moderator variables that are measured on a continuous scale.⁶ Several statisticians recommend that these variables be centered (i.e., put into deviation units by subtracting their sample means to produce revised sample means of zero). This is because predictor and moderator variables generally are highly correlated with the interaction terms created from them. Centering reduces problems associated with multicollinearity (i.e., high correlations) among the variables in the regression equation (for further explanation, see Cohen et al., 2003; Cronbach, 1987; Jaccard et al., 1990; West et al., 1996). There may be further benefits to standardizing (i.e., z scoring) rather than centering continuous predictor and moderator variables (Aiken & West, 1991; Friedrich, 1982).
For example, standardizing these variables makes it easier to plot significant moderator effects because convenient representative values (i.e., the mean and ±1 standard deviation from the mean) can be substituted easily into a regression equation to obtain predicted values for representative groups when the standard deviations of these variables equal one (see Cohen et al., 2003). In addition, z scores are very easy to create within standard statistical packages. Standardizing also makes it easier to interpret the effects of the predictor and moderator, as we discuss later. In contrast to these recommendations, only one of the JCP articles reviewed reported using centered or standardized continuous variables.

Creating Product Terms

After code variables have been created to represent any categorical variables and variables measured on a continuous scale have been centered or standardized, product terms need to be created that represent the interaction between the predictor and moderator. To form product terms, one simply multiplies together the predictor and moderator variables using the newly coded categorical variables or centered/standardized continuous variables (Aiken & West, 1991; Cohen et al., 2003; Jaccard et al., 1990; West et al., 1996).
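The standardizing and product-term steps can be sketched in a few lines. The raw scores, the variable names, and the choice of -1/1 codes for a two-level moderator are hypothetical and serve only to show the mechanics.

```python
from statistics import mean, stdev


def standardize(values):
    """Convert raw scores to z scores (sample mean 0, sample SD 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical raw data for six participants
support = [12, 18, 25, 31, 40, 22]
gender = ["man", "woman", "woman", "man", "woman", "man"]

z_support = standardize(support)                          # continuous predictor
gender_code = [-1 if g == "man" else 1 for g in gender]   # one code variable for two levels

# Product term: the z-scored predictor multiplied by the code variable.
# The product itself is used as is, without further centering.
product = [z * c for z, c in zip(z_support, gender_code)]
print([round(p, 2) for p in product])
```

The three lists `z_support`, `gender_code`, and `product` are the columns that would then be entered into the regression equation.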
A product term needs to be created for each coded variable (e.g., if there is one coded variable for a categorical variable with two levels, there is one interaction term; if there are two coded variables for a categorical variable with three levels, there are two interaction terms). This product term does not need to be centered or standardized.

⁶ Whereas there are benefits to centering or standardizing predictor variables and moderator variables that are measured on a continuous scale, there typically is no reason to do so with code variables representing categorical variables or continuous outcome variables (Aiken & West, 1991; Cohen et al., 2003; Jaccard et al., 1990; West et al., 1996).

Structuring the Equation

After product terms have been created, everything should be in place to structure a hierarchical multiple regression equation using standard statistical software to test for moderator effects. To do this, one enters variables into the regression equation through a series of specified blocks or steps (Aiken & West, 1991; Cohen et al., 2003; Jaccard et al., 1990; West et al., 1996). The first step


al. described how to calculate semipartial correlations, which are not always provided by statistical programs in their standard output, and provided tables to calculate the power associated with tests of their significance.

If the interaction term is not significant, the researcher must decide whether to remove the term from the model so that the first-order effects are not conditional effects. Aiken and West (1991, pp. 103–105) reviewed the issues associated with this decision and ultimately recommended keeping the nonsignificant interaction term in the model if there are strong theoretical reasons for expecting an interaction and removing the interaction if there is not a strong theoretical rationale for the moderator effect (see also Cohen et al., 2003).

Interpreting Significant Moderator Effects

Once it has been determined that a significant moderator effect exists, it is important to inspect its particular form. There are two ways to do this. The first is to compute predicted values of the outcome variable for representative groups, such as those who score at the mean and 1 standard deviation above and below the mean on the predictor and moderator variables (Aiken & West, 1991; Cohen et al., 2003; Holmbeck, 1997; West et al., 1996). The predicted values obtained from this process then may be used to create a figure summarizing the form of the moderator effect.
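As a minimal sketch of this first approach, assume a regression equation of the form predicted Y = b0 + b1·X + b2·Z + b3·X·Z, with a standardized predictor X and a two-level moderator Z coded -1 and 1. The coefficient values and group labels below are hypothetical and are not results from any study.

```python
def predicted(b0, b1, b2, b3, x, z):
    """Predicted outcome from a regression equation with one product term."""
    return b0 + b1 * x + b2 * z + b3 * x * z


# Hypothetical unstandardized coefficients
coefs = dict(b0=5.0, b1=0.4, b2=-0.1, b3=0.3)

# Predicted values at the mean and 1 SD below and above the mean of a
# z-scored predictor, separately for each level of the moderator
table = {}
for z, label in ((-1, "group coded -1"), (1, "group coded 1")):
    for x in (-1, 0, 1):
        table[(label, x)] = predicted(**coefs, x=x, z=z)
        print(label, x, round(table[(label, x)], 2))
```

Plotting the three predicted values for each group as a line gives the kind of figure described above; with these made-up coefficients the line for the group coded 1 is steeper.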
The second method is to test the statistical significance of the slopes of the simple regression lines representing relations between the predictor and the outcome at specific values of the moderator variable (for further details, see Aiken & West, 1991; Cohen et al., 2003; Jaccard et al., 1990; West et al., 1996). Unlike just plotting means, testing the simple slopes provides information regarding the significance of the relations between the predictor and outcome at different levels of the moderator. Confidence intervals for the simple slopes also can be calculated (Cohen et al., 2003). Among the JCP articles we reviewed, only one provided plots of interactions (which were mislabeled), and none presented tests of simple slopes.

Additional Issues to Consider When Examining Moderator Effects

Having discussed the basics of investigating moderator effects using hierarchical multiple regression techniques, we now turn to some additional issues that may be important to consider: (a) including covariates in regression equations examining moderator effects, (b) examining multiple moderator effects, and (c) examining three-way (and higher) interactions.

Including Covariates in Regression Equations Examining Moderator Effects

In addition to variables needed to test a moderator effect, some researchers may want to consider including covariates to
control for the effects of other variables, increase the overall R² to increase power, or estimate change in an outcome variable over time. If this is to be done, covariates need to be entered in the first step of the regression equation, followed by the predictor variable, moderator variable, and product terms in subsequent steps, as discussed earlier. In addition, as emphasized by Cohen and Cohen (1983), interactions between covariates and other variables in the regression model should be tested to determine whether covariates act consistently across levels of the other variables (i.e., have parallel slopes).⁸ This may be done by adding a final step containing interactions between the covariates and all other variables (including product terms). If the omnibus F test representing this entire step is not significant, this step can be dropped from the model. If the overall step is significant, the t tests related to specific interactions can be inspected, potentially uncovering a moderator effect that can be investigated in future research (Aiken & West, 1991; Cohen & Cohen, 1983).
In the JCP articles we reviewed, when covariates were added to regression models containing interactions, interactions with covariates were never assessed.

Examining Multiple Moderator Effects

Although we have been focusing on models with one moderator variable, some researchers may want to consider investigating multiple moderator effects. However, performing a large number of statistical tests in this manner will lead to an inflated Type I error rate (Cohen et al., 2003). To help control for this type of error, all of the moderator effects being considered may be entered in a single step after all of the predictor and moderator variables on which they are based have been entered in previous steps. The significance of the omnibus F test representing the variance explained by this entire step then can determine whether it should be eliminated from the model (if the omnibus test is not significant) or whether t tests representing specific moderator effects should be inspected for statistical significance (if the omnibus test is significant; Aiken & West, 1991). The squared semipartial correlations associated with each interaction also can be calculated to determine the amount of variance in the outcome attributable to each interaction term (Cohen et al., 2003). Significant moderator effects then can be explored in the manner discussed earlier.
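The omnibus test for a step can be computed from the R² values of the models with and without that step, using the standard F test for an increment in R². The sketch below is illustrative only: the function name is hypothetical, and the sample size, R² values, and numbers of predictors are made up.

```python
def f_change(r2_reduced, r2_full, n, k_full, k_added):
    """F statistic for the increase in R-squared when k_added terms are
    entered as one step, giving k_full predictors in the full model."""
    df_num = k_added
    df_den = n - k_full - 1
    f_stat = ((r2_full - r2_reduced) / df_num) / ((1.0 - r2_full) / df_den)
    return f_stat, df_num, df_den


# Hypothetical example: 4 first-order terms (R-squared = .20), then a step
# adding 3 product terms (R-squared = .23), with N = 330
f_stat, df1, df2 = f_change(r2_reduced=0.20, r2_full=0.23, n=330, k_full=7, k_added=3)
print(f"F({df1}, {df2}) = {f_stat:.2f}")  # F(3, 322) = 4.18
```

The resulting F would be compared with the critical value for the stated degrees of freedom (from a table or statistical software) before any individual t tests in the step are inspected.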
When multiple moderators were tested in the JCP articles we reviewed, inflated Type I error was never mentioned or addressed.

Examining Higher Order Interactions

We focus in this article on procedures for testing two-way interactions because they are the most common form of interaction hypothesized and tested in counseling psychology research. Indeed, higher order interactions were never tested in the JCP articles we reviewed. However, interactions may involve three (or more) variables. To use the example we work through later, the relation between social support and depression may depend on age as well as gender. Researchers interested in testing higher order interactions are referred to Aiken and West (1991, chap. 4), who provided a thorough discussion of the procedures for testing and interpreting three-way interactions (see also Cohen et al., 2003). Some caveats regarding higher order interactions should be mentioned, however. For example, three-way (and higher order) interactions are rarely of primary interest because our theories are not sufficiently complex (Cohen & Cohen, 1983). As mentioned before, all tests of moderation should be based on strong theory, with the nature of the interaction specified a priori. No interactions should be included unless they have substantive theoretical support because as the number of hypotheses tested increases, so do the risks of Type I and Type II error (Chaplin, 1991; Cohen & Cohen, 1983; McClelland & Judd, 1993). Also, measurement error is an even bigger problem for three-way than for two-way interactions (Busemeyer & Jones, 1983) because the reliability of the product term is the product of the reliability of the three measures.

⁸ Because the covariates will be entered into interaction terms, it is useful to center or standardize them. Even if they are not entered into interaction terms, Cohen et al. (2003) recommended centering them to be consistent with other predictors in the model.

Conclusions

There are many issues to consider in testing moderation and many issues about which counseling psychology researchers seem to be unaware. Indeed, few researchers in the studies we reviewed tested moderation, and those who did used methods other than multiple regression. For example, it was common for researchers to dichotomize continuous variables and use other analytic approaches (such as ANOVA), resulting in a loss of power and information. Among those who did use regression to test moderation, little or no effort was made to estimate power or to address issues that may lower power (e.g., unequal sample sizes).
Furthermore, there appeared to be little awareness of issues involved in interpreting results from moderational analyses, such as the need to center continuous variables.

Example: Testing Moderator Effects Using Multiple Regression

To illustrate how moderator effects may be investigated through the use of multiple regression, we provide a step-by-step example using simulated data that meet the criteria outlined previously. In the mediation example provided later, actual data are used to illustrate some of the issues that arise when using real, versus simulated, data.

Designing the Study

Recall that it often is useful to look for moderators when there are unexpectedly weak or inconsistent relations between a predictor and an outcome across studies. One example of this is the relations between social support and mental health indicators (e.g., depression), which often are not as strong as one might expect (e.g., see Lakey & Drew, 1997). Thus, perhaps it is the case that social support is more strongly related to depression for some people than for others. On the basis of existing theory and research, one possible moderator of the relation between social support and depression is gender. Specifically, because relationships generally are more important to women than to men (Cross & Madson, 1997), the relation between social support and depression may be stronger for women than for men.
In our example, we measured social support in terms of unhelpful social support (e.g., minimizing the event), which tends to be more strongly related to depression than is helpful support (e.g., Frazier, Tix, & Barnett, 2003). Thus, the hypothesis tested in this example, based on previous research and theory, is that unhelpful support behaviors will be positively related to depression for both men and women but that this relation will be stronger for women than for men.

We also took into account the factors mentioned earlier that affect power and incorporated several design features to maximize power. For example, we generated the data such that the range in the social support measure (the predictor) was not restricted and the reliability coefficients for the social support and depression (the outcome) measures were good (i.e., alpha coefficients of .80). The social support measure contained 20 items rated on a 5-point Likert-type scale. We verified that the homogeneity of error variance assumption was not violated using an online calculator (see Aguinis et al., 1999, and Footnote 4). With regard to effect size, we generated the data such that there would be a difference of about .30 in the correlation between unhelpful support and depression for women (r = .39) and men (r = .10).
To estimate the sample size needed to have sufficient power (.80) to detect this effect, we entered these parameters into an online power calculator (see Aguinis et al., 2001, and Footnote 5) and determined that we would need a sample of about 160 in each group. The data set generated consisted of equal numbers of men and women (165 in each group), with an actual power of .83 to detect the specified differences. Our outcome measure consisted of 20 items rated on a 10-point Likert scale. Because there were 10 response options for each item on the outcome measure, it was sensitive enough (i.e., not too coarse) to capture the interaction between social support (5 response options) and gender (2 response options). (Recall that the number of response options for the outcome variable should be greater than or equal to the product of the number of response options for the predictor and moderator variables.) Finally, the one aspect that reduced our power was the use of a nonexperimental design. Although experimental designs have more power, our example involved a correlational study, because this is the design most often used to examine moderator effects in counseling psychology research.

Analyzing the Data

First, we standardized the unhelpful support variable so that it had a mean of 0 and a standard deviation of 1. Next, we needed to decide which form of coding to use for our categorical moderator variable (gender).
We chose effects coding because we wanted to interpret the first-order effects of gender and social support as average effects, as in ANOVA (see Cohen et al., 2003, and West et al., 1996, for more details). More specifically, if one codes gender using effects coding (i.e., codes of −1 for men and 1 for women), and if the social support measure has been standardized so that it has a mean of 0, the first-order effect of gender is the average relation between gender and depression, the first-order effect of social support is the average relation between social support and depression, and the intercept is the average depression score in the sample. Because we had equal numbers of men and women in our sample, weighted and unweighted effects coding would give the same results. West et al. provided guidelines regarding when to use weighted versus unweighted effects coding if sample sizes are unequal. Because there were only two categories for gender, we needed only one code variable.⁹ The final step was to create the interaction term (the product of the gender code and the z-scored unhelpful support measure). To perform the analysis, we regressed depression on gender and the z-scored support measure in the first step and added the interaction between gender and the z-scored support measure in the second step. The output is presented in Table 1.

⁹ Readers are referred to the following sources for guidance on analyzing categorical data with more than two levels: Aiken and West (1991), Cohen et al. (2003), Jaccard et al. (1990), and West et al. (1996).

Table 1
Testing Moderator Effects Using Hierarchical Multiple Regression

Step and variable                            B      SE B   95% CI           β       R²

Effects coding (men coded −1, women coded 1)
  Step 1
    Gender                                  −0.11   0.07   −0.25, 0.02     −.09
    Unhelpful social support (z score)       0.32   0.07    0.18, 0.45      .25**   .07**
  Step 2
    Gender × Unhelpful Social Support        0.20   0.07    0.06, 0.33      .16*    .02*

Dummy coding (men coded 0, women coded 1)
  Step 1
    Gender                                  −0.23   0.13   −0.49, 0.03     −.09
    Unhelpful social support (z score)       0.12   0.09   −0.06, 0.30      .10     .07**
  Step 2
    Gender × Unhelpful Social Support        0.39   0.13    0.13, 0.65      .22*    .02*

Dummy coding (women coded 0, men coded 1)
  Step 1
    Gender                                   0.23   0.13   −0.03, 0.49      .09
    Unhelpful social support (z score)       0.51   0.10    0.32, 0.70      .40**   .07**
  Step 2
    Gender × Unhelpful Social Support       −0.39   0.13   −0.65, −0.13    −.22*    .02*

Note. CI = confidence interval.
* p < .01. ** p < .001.

Interpreting the Results

First, we obtained descriptive statistics to verify that the gender variable was coded correctly and that the social support variable had a mean of 0 and a standard deviation of 1.¹⁰ We also obtained correlations among all variables to make sure that, as a result of standardizing continuous variables, the interaction term and its components were not too highly correlated.
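As a minimal sketch of the preparation and checking steps just described, the following Python fragment standardizes a predictor, effects-codes gender, forms the product term, and runs the descriptive checks. The eight raw scores and all variable names are hypothetical and are not the simulated data used in this example; the regression itself would be run in a statistical package.

```python
import statistics

# Hypothetical raw scores (gender, unhelpful support); not data from the article.
raw = [("man", 42), ("woman", 55), ("man", 61), ("woman", 38),
       ("man", 47), ("woman", 66), ("man", 53), ("woman", 50)]

support = [score for _, score in raw]
mean_support = statistics.fmean(support)
sd_support = statistics.stdev(support)  # sample SD (n - 1 in the denominator)

# Standardize the continuous predictor (mean 0, SD 1).
z_support = [(s - mean_support) / sd_support for s in support]

# Effects-code the categorical moderator: men = -1, women = 1.
gender_code = [-1 if g == "man" else 1 for g, _ in raw]

# Interaction term: product of the gender code and the z-scored predictor.
interaction = [g * z for g, z in zip(gender_code, z_support)]


def pearson(u, v):
    """Pearson correlation between two equal-length lists."""
    mu, mv = statistics.fmean(u), statistics.fmean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den


# Descriptive checks to run before the regression.
print("codes used:", sorted(set(gender_code)))
print("z mean, SD:", statistics.fmean(z_support), statistics.stdev(z_support))
print("r(interaction, support):", pearson(interaction, z_support))
print("r(interaction, gender):", pearson(interaction, gender_code))
```

If cases are later dropped for missing data, the mean and standard deviation should be rechecked on the cases that actually enter the regression.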
As mentioned, multicollinearity can cause both interpretational and computational problems.

Looking at the output for effects coding in Table 1, the unstandardized regression coefficient for gender was −0.11, which was not significant at the conventional .05 level (p = .09). The unstandardized regression coefficient for unhelpful social support was 0.32 (p < .0001), meaning that there was a significant positive relation between unhelpful support and depression in the sample. Because gender was coded by means of effects coding, and the support variable was standardized, we could interpret this first-order effect of social support as an average effect. This would not be the case if another form of coding had been used. The unstandardized regression coefficient for the interaction term was .20 (p = .004). The R² change associated with the interaction term was .02. In other words, the interaction between unhelpful social support and gender explained an additional 2% of the variance in depression scores over and above the 7% explained by the first-order effects of social support and gender alone.

To understand the form of the interaction, it was necessary to explore it further. As mentioned, one way is to plot predicted values for the outcome variable (depression) for representative groups.
A common practice (recommended by Cohen et al., 2003) is to choose groups at the mean and at low (−1 SD from the mean) and high (1 SD from the mean) values of the continuous variable. Here we plotted scores for men and women at the mean and at low (−1 SD) and high (1 SD) levels of unhelpful social support (see Figure 2). (If we had two continuous variables, we could plot scores for participants representing the four combinations of low and high scores on the two variables.) Predicted values were obtained for each group by multiplying the respective unstandardized regression coefficients for each variable by the appropriate value (e.g., −1, 1 for standardized variables) for each variable in the equation.¹¹ For example, to get the predicted score for men who score 1 standard deviation above the mean on unhelpful social support, we multiplied the unstandardized coefficient for gender (B = −0.11) by −1 (the code for men), multiplied the unstandardized coefficient for unhelpful support (B = 0.32) by 1 (the code for high levels of unhelpful social support), multiplied the unstandardized coefficient for the interaction term (B = 0.20) by the product of the gender and unhelpful support codes (in this case, −1 × 1 = −1), and added the constant (5.10) for a predicted value on the depression measure of 5.34.
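The calculation just described can be sketched in a few lines of Python. The coefficients are the rounded values reported for the effects-coding analysis; the gender coefficient is entered as negative, which is what the worked calculation implies, and the function and constant names are illustrative only. Because the inputs are rounded to two decimals, the results can differ from the reported predicted values by about .01.

```python
# Rounded unstandardized coefficients from the effects-coding analysis.
INTERCEPT = 5.10
B_GENDER = -0.11       # gender coded -1 = men, 1 = women
B_SUPPORT = 0.32       # z-scored unhelpful support
B_INTERACTION = 0.20   # gender code x z-scored support


def predicted_depression(gender_code, support_z):
    """Predicted depression score from the full regression equation."""
    return (INTERCEPT
            + B_GENDER * gender_code
            + B_SUPPORT * support_z
            + B_INTERACTION * gender_code * support_z)


# Six representative groups: each gender at -1 SD, the mean, and +1 SD.
for label, code in (("men", -1), ("women", 1)):
    for level, z in (("low", -1), ("mean", 0), ("high", 1)):
        print(f"{label:5s} {level:4s} {predicted_depression(code, z):.2f}")
```

Plotting the three values for each gender against the level of unhelpful support gives the two lines of the interaction plot.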
The group with the lowest level of depression was women with low levels of unhelpful support (Y = 4.48), whose depression score was lower than that of men with low levels of unhelpful support (Y = 5.10). Men (Y = 5.34) and women (Y = 5.50) in the groups at high levels of unhelpful support had very similar depression scores, as did men (Y = 5.22) and women (Y = 4.99) with mean levels of unhelpful support.

¹⁰ A scale may no longer be properly standardized if there are missing data and the sample from which the z score was created differs from the sample for the regression analyses.

¹¹ An Excel file created to calculate these predicted values is available from Patricia A. Frazier.

Figure 2. Plot of significant Gender × Unhelpful Social Support (ss) interaction. Solid diamonds = men; solid squares = women.

Another approach is to test the significance of the slopes for each group. The significant interaction term tells us that the slopes differ from each other but not whether each slope differs from zero. For example, looking at Figure 2, we can formally test whether the slope representing the relation between social support and depression for men significantly differs from zero and whether the slope for women significantly differs from zero. To test the simple slopes for each group, we needed to conduct two additional regression analyses. Although these regressions were similar to those just reported, we recoded gender using dummy coding. In one analysis gender was coded so that men received a value of 0, and in the other gender was coded so that women received a value of 0 (see Table 1).

As discussed earlier, when regression equations contain interaction terms, the regression coefficient for the predictor represents the relation between the predictor and outcome when the moderator has a value of 0. Thus, with gender dummy coded and men coded as 0, the regression coefficient for unhelpful social support represents the relation between unhelpful support and depression for men.
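The principle that the predictor's coefficient is the simple slope for whichever group is coded 0 follows directly from the regression equation: the slope of the outcome on the predictor at a given moderator code is the predictor coefficient plus the interaction coefficient times that code. A minimal sketch, using the rounded effects-coded coefficients from Table 1 and illustrative names, is given below. It recovers point estimates only; the standard errors needed for the significance tests come from rerunning the regression with dummy codes.

```python
# Rounded effects-coded coefficients from Table 1 (men = -1, women = 1).
B_SUPPORT = 0.32
B_INTERACTION = 0.20


def simple_slope(gender_code):
    """Slope of depression on z-scored support for the group with this code."""
    return B_SUPPORT + B_INTERACTION * gender_code


slope_men = simple_slope(-1)
slope_women = simple_slope(1)

# With dummy coding, the interaction coefficient is the difference between
# the two groups' slopes (twice the effects-coded interaction coefficient).
slope_difference = slope_women - slope_men

print(f"men:   {slope_men:.2f}")
print(f"women: {slope_women:.2f}")
print(f"difference: {slope_difference:.2f}")
```

These values agree with the dummy-coded rows of Table 1 to within rounding of the two-decimal inputs.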
With gender dummy coded and women coded as 0, the regression coefficient for unhelpful social support is the relation between unhelpful support and depression for women. When these two regressions were performed, there was a significant positive slope for women (B = 0.51, p < .0001) but not for men (B = 0.12, p = .20). As can be seen in Table 1, the 95% confidence interval for the simple slope for men included zero, which means that we could not reject the null hypothesis that this slope was zero. If we had not included gender as a moderator, we would have concluded that unhelpful social support had a small to medium-sized relation with depression (B = .32), which would have masked the fact that the relation was much stronger in women (B = .51) than in men (B = .12).¹² These analyses also illustrate how the coding of the gender variable changes the regression coefficients for the social support variable (but not the variance accounted for by the interaction term; see Cohen et al., 2003).

Now that we have found a significant interaction between gender and unhelpful support in predicting depression, what do we do? One possibility is to examine what accounts for the gender difference in the relation between unhelpful support and depression. That is, why is unhelpful support more related to depression for women than for men? Earlier we hypothesized that a possible reason is that relationships tend to be more important for women than for men (Cross & Madson, 1997).
Thus, in a future study we could assess whether differences in the importance of relationships mediate the interaction between gender and social support in predicting depression. This would be an example of mediated moderation (Baron & Kenny, 1986). That gender moderates the relations between unhelpful support and depression also may have implications for interventions (i.e., interventions that improve social relationships may be more helpful for women than for men).

MEDIATOR EFFECTS

We now turn to a description of testing mediator effects, using the same framework that we applied to the description of testing moderator effects. In this section, we first review the steps for establishing mediation and then describe issues to consider in designing the study, analyzing the data, and interpreting the results. As was the case with moderation, we also note issues about which there appears to be confusion in counseling psychology research, on the basis of a review of studies published in JCP in 2001 that reported mediational analyses. Of the 54 articles that appeared in 2001, 10 (19%) contained a test of mediation or indirect effects.¹³ As before, the mediational analyses described in the identified studies are discussed on a general level so that particular studies and authors are not singled out.
In the final section, we provide a step-by-step example to guide the reader through performing mediation analyses using multiple regression.

Guide to Testing Mediation Effects in Multiple Regression

According to MacKinnon, Lockwood, Hoffman, West, and Sheets (2002), the most common method for testing mediation in psychological research was developed by Kenny and his colleagues (Baron & Kenny, 1986; Judd & Kenny, 1981; Kenny, Kashy, & Bolger, 1998). According to this method, there are four steps (performed with three regression equations) in establishing that a variable (e.g., social support) mediates the relation between a predictor variable (e.g., counseling condition) and an outcome variable (e.g., well-being; see Figure 3A and Figure 3B). The first step is to show that there is a significant relation between the predictor and the outcome (see Path c in Figure 3A). The second step is to show that the predictor is related to the mediator (see Path a in Figure 3B). The third step is to show that the mediator (e.g., social support) is related to the outcome variable (e.g., well-being). This is Path b in Figure 3B, and it is estimated controlling for the effects of the predictor on the outcome. The final step is to show that the strength of the relation between the predictor and the outcome is significantly reduced when the mediator is added to the model (compare Path c in Figure 3A with Path c′ in Figure 3B). If social support is a complete mediator, the relation between counseling condition and well-being will not differ from zero after social support is included in the model. If social support is a partial mediator, which is more likely, the relation between counseling condition and well-being will be significantly smaller when social support is included but will still be greater than zero.

¹² Aiken and West (1991, pp. 14–22) described the procedures for testing simple slopes when both the predictor and moderator are continuous variables.

¹³ The terms mediated effects and indirect effects are typically used interchangeably. According to MacKinnon et al. (2002), mediation is the more common term in psychology, whereas indirect effect comes from the sociological literature.

Figure 3. Diagram of paths in mediation models.

Designing a Study to Test Mediator Effects

In this section, we discuss four issues to consider in designing studies to test mediation: (a) the relation between the predictor and outcome variable, (b) choosing mediator variables, (c) establishing causation, and (d) factors that affect the power of the test of mediation.

Predictor–Outcome Relation

As mentioned, according to the model popularized by Kenny and colleagues (Baron & Kenny, 1986; Judd & Kenny, 1981; Kenny et al., 1998), the first step in the process of testing mediation is to establish that there is a significant relation between the predictor and outcome variable.
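To make the causal steps concrete, the following sketch estimates the three regression equations on simulated data and recovers Paths c, a, b, and c′. Everything in it is an assumption made for illustration: the data-generating values, the sample size, the variable names, and the small least squares helpers, which stand in for a statistical package. It returns point estimates only and omits the significance tests that each step requires.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(u, v):
    """Sample covariance; cov(u, u) is the sample variance."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def slope_one(x, y):
    """Least squares slope of y on a single predictor x."""
    return cov(x, y) / cov(x, x)


def slopes_two(x, m, y):
    """Least squares slopes of y on x and m together (2 x 2 normal equations)."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    b_x = (smm * sxy - sxm * smy) / det
    b_m = (sxx * smy - sxm * sxy) / det
    return b_x, b_m


# Simulated data with hypothetical population values.
random.seed(0)
n = 500
x = [random.choice([0, 1]) for _ in range(n)]          # e.g., counseling condition
m = [0.5 * xi + random.gauss(0, 1) for xi in x]        # e.g., social support
y = [0.2 * xi + 0.4 * mi + random.gauss(0, 1) for xi, mi in zip(x, m)]

c = slope_one(x, y)                # Equation 1: outcome on predictor (Path c)
a = slope_one(x, m)                # Equation 2: mediator on predictor (Path a)
c_prime, b = slopes_two(x, m, y)   # Equation 3: outcome on predictor and mediator

print(f"c = {c:.3f}, a = {a:.3f}, b = {b:.3f}, c' = {c_prime:.3f}")
print(f"c' + a*b = {c_prime + a * b:.3f}")
```

With least squares estimates from a single sample, the total effect decomposes exactly as c = c′ + ab, so the drop from c to c′ equals the product of Paths a and b.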
That is, before one looks for variables that mediate an effect, there should be an effect to mediate. Therefore, in designing a mediational study, one generally should begin with predictor and outcome variables that are known to be significantly associated on the basis of prior research. As mentioned previously, the main purpose of mediational analyses is to examine why an association between a predictor and outcome exists.

There are, however, situations in which a researcher might want to look for evidence of mediation in the absence of a relation between a predictor and an outcome. In fact, Kenny et al. (1998) stated that this first step is not required (although a significant predictor–outcome relationship is implied if the predictor is related to the mediator and the mediator is related to the outcome). One example is a situation in which a treatment does not appear to be effective (i.e., no effect of predictor on outcome) because there are multiple mediators producing inconsistent effects (Collins, Graham, & Flaherty, 1998; MacKinnon, 2000; MacKinnon, Krull, & Lockwood, 2000). For example, suppose an evaluation of a rape prevention program for men showed no differences between an intervention and a control group on an outcome measure of attitudes toward women. It may be that the intervention made men more empathic toward women, which was associated with positive changes in attitudes toward women. However, the intervention might also have made men more defensive, which might be associated with negative changes in attitudes toward women.
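The arithmetic behind this scenario can be illustrated with made-up path coefficients. The numbers and names below are purely hypothetical and are not drawn from any study; the sketch assumes a linear model in which the total effect is the direct effect plus the sum of the indirect effects, each indirect effect being the product of its a and b paths.

```python
# Hypothetical path coefficients, chosen only to illustrate opposing mediators.
a_empathy, b_empathy = 0.40, 0.50          # intervention -> empathy -> attitudes
a_defensive, b_defensive = 0.50, -0.40     # intervention -> defensiveness -> attitudes
direct_effect = 0.0                        # effect not carried by either mediator

indirect_empathy = a_empathy * b_empathy
indirect_defensive = a_defensive * b_defensive
total_effect = direct_effect + indirect_empathy + indirect_defensive

print(f"indirect via empathy:       {indirect_empathy:+.2f}")
print(f"indirect via defensiveness: {indirect_defensive:+.2f}")
print(f"total effect:               {total_effect:+.2f}")
```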
The effects of these two mediators could cancel each other out, producing a nonsignificant intervention effect. In this case, it would be useful to perform mediational analyses in the absence of a predictor–outcome relation to identify these inconsistent mediators. Of course, to do such analyses these mediators would need to have been assessed, which often is not the case (MacKinnon, 1994). MacKinnon et al. (2001) provided an empirical example of an intervention with both positive and negative mediators.¹⁴

In a recent article, Shrout and Bolger (2002) recommended that inclusion of the first step in the Kenny model be based on whether the predictor is temporally distal or proximal to the outcome. Specifically, they recommended skipping the first step of the Kenny model in cases in which the predictor is distal to the outcome (such as in a long-term longitudinal study), because such studies often will lack power to detect the direct predictor–outcome relation. However, when the predictor is proximal to the outcome, or when theory suggests that the predictor–outcome relation is at least medium in size, they recommended retaining the first step in the Kenny model.

Choosing Mediator Variables

On a conceptual level, the proposed relations between the predictor and the mediator should be grounded in theory and clearly articulated. In other words, the hypothesis that the predictor is related to or causes the mediator should have a clear theoretical rationale (see Holmbeck, 1997, for examples in which this rationale is lacking).
Furthermore, given that the mediational model essentially is one in which the predictor causes the mediator, which in turn causes the outcome, the mediator ideally should be something that can be changed (MacKinnon et al., 2000).

¹⁴ A situation involving mediation but no significant predictor–outcome relation does not necessarily have to involve multiple mediators. This situation occurs more generally when the c′ path is opposite in sign to the ab path (Kenny et al., 1998). In this case, the mediator is a suppressor variable (see Cohen et al., 2003; MacKinnon et al., 2000; and Shrout & Bolger, 2002, for further discussion of suppression effects). Suppression occurs when the relation between a predictor and outcome becomes larger when the suppressor variable is included in the equation (as opposed to becoming smaller when a significant mediator is included in the equation).

Once potential mediators have been identified on theoretical grounds, there are practical issues to consider with regard to choosing specific mediators to test. In particular, the relations among the mediator, predictor, and outcome can affect the power of tests of mediation. For example, the power associated with the tests of the relations between the mediator and outcome (Path b in Figure 3B) and between the predictor and the outcome controlling for the mediator (Path c′ in Figure 3B) decreases as the relation between the predictor variable and the mediator increases (Kenny et al., 1998). That is, when more variance in the mediator is explained by the predictor, there is less variance in the mediator to contribute to the prediction of the outcome. Thus, as the predictor–mediator (Path a in Figure 3B) relation increases, a larger sample is needed to have the same amount of power to test the effects of Path b (mediator–outcome) and Path c′ (predictor–outcome controlling for mediator), as would be the case if the relation between the predictor and mediator was smaller. Kenny et al. provided a formula to determine the "effective sample size" given the correlation between the predictor and the mediator: $N(1 - r_{xm}^{2})$, where $N$ is the sample size and $r_{xm}$ is the correlation between the predictor and the mediator. For example, if your sample size is 900, and the predictor–mediator correlation is .30, the effective sample size is 819. However, if the predictor–mediator correlation is .70, the effective sample size is only 459. In other words, because of the high correlation between the predictor and mediator, power reduces to what it would be if your sample were 459 rather than 900 (thus, the sample size is effectively 459 rather than 900).
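The effective sample size formula is simple enough to check directly. The short sketch below reproduces the two worked values; the function name is illustrative and not taken from any package.

```python
def effective_sample_size(n, r_xm):
    """Effective N for testing Paths b and c', given the predictor-mediator correlation."""
    return n * (1 - r_xm ** 2)


for r in (0.30, 0.70):
    print(f"N = 900, r_xm = {r:.2f}: effective N = {effective_sample_size(900, r):.0f}")
```

Because the correlation is squared, the penalty grows quickly: a correlation of .30 costs 9% of the sample, whereas a correlation of .70 costs 49%.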
Hoyle and Kenny (1999) presented the results of a simulation study demonstrating the effect of the size of the relation between the predictor and the mediator on the power of tests of mediation.

Another factor to consider in choosing mediators (from among theoretically viable candidates) is the size of the relation between the mediator and outcome (Path b in Figure 3B) relative to the size of the relation between the predictor and mediator (Path a in Figure 3B). According to Kenny et al. (1998), the relation between the mediator and outcome (Path b) and between the predictor and the mediator (Path a) should be comparable in size. However, Hoyle and Kenny (1999) noted that the power of tests of mediation is greatest when the relation between the mediator and the outcome (Path b) exceeds the relation between the predictor and the mediator (Path a). Thus, in choosing mediators, it is important to choose variables that are likely to have similar relations with the predictor and outcome variable (Path a ≈ Path b), or somewhat stronger relations with the outcome than with the predictor (Path b > Path a), to maximize the power of the mediational test.
These points were rarely, if ever, addressed in the JCP studies we reviewed, although they sometimes were an issue (e.g., correlations between predictors and mediators greater than .6 or stronger relations between predictors and mediators than between mediators and outcomes).

Once theoretically based mediators have been identified that satisfy the criteria just described with regard to their relations with the predictor and outcome, another factor to consider is the reliability of the measure of the mediator. Specifically, with lower reliability, the effect of the mediator on the outcome variable (Path b in Figure 3B) is underestimated, and the effect of the predictor variable on the outcome variable (Path c′ in Figure 3B) is overestimated (Baron & Kenny, 1986; Judd & Kenny, 1981; Kenny et al., 1998). Thus, statistical analyses, such as multiple regression, that ignore measurement error underestimate mediation effects. Hoyle and Robinson (2003) have provided a formula for estimating the effects of unreliability on tests of mediation and recommend using a measure with a reliability of at least .90. They also describe four ways of modeling measurement error in SEM if such a highly reliable measure is not available and argue that the best approach is to use multiple measures and multiple measurement strategies. Most of the JCP mediation studies we reviewed used multiple regression or used SEM programs to conduct a path analysis with single indicator variables (versus a model with latent variables).
As a result, they did not take advantage of one of the primary reasons to use SEM (i.e., to model measurement error). This is important, because some mediators had either low (e.g., less than .70) or unreported reliability. None of the studies used multiple measurement strategies.

Establishing Causation

The process of mediation implies a causal chain; thus, definitions of mediation are almost always phrased in causal terms (see, e.g., Baron & Kenny, 1986; Hoyle & Smith, 1994; James & Brett, 1984; Judd & Kenny, 1981; Kenny et al., 1998; Kraemer et al., 2001). For example, Hoyle and Smith (1994) described a mediational hypothesis as "a secondary one that follows demonstration of an effect (assumed to be causal)" (p. 437; i.e., the predictor–outcome relation is causal). The mediator also is assumed to be caused by the predictor variable and to cause the outcome variable (Kenny et al., 1998). Consequently, the criteria for establishing causation need to be considered in study design.

The three primary criteria for establishing that one variable causes another are that (a) there is an association between the two variables (association), (b) the association is not spurious (isolation), and (c) the cause precedes the effect in time (direction; Hoyle & Smith, 1994; Menard, 1991).
Satisfaction of these criteria can be seen as falling along a continuum, with one end of the continuum defined by nonexperimental correlational studies that merely establish an association between two variables and the other defined by experiments with random assignment to conditions. According to Wegener and Fabrigar (2000), even using a nonexperimental design, one can move farther along the continuum by controlling for the effects of other variables (isolation) or by collecting longitudinal data (direction; see Hoyle & Smith, 1994; Menard, 1991). Hoyle and Robinson (2003) have argued that the best approach is the "replicative strategy" in which all measures are administered at more than one point in time, which would require at least three assessments to assess mediation in a longitudinal study (Collins et al., 1998; Farrell, 1994). This is preferred over the sequential strategy in which the predictor, mediator, and outcome are measured at different points in time, because the sequential design does not permit inferences of directionality.

With regard to the mediation studies published in JCP in 2001, all were nonexperimental, and little attention was paid to design features that would strengthen claims regarding causation.
For example, only one study was longitudinal (but used a sequential rather than a replicative strategy), and only one controlled for a third variable that might affect the relation between the predictor and the outcome. Nonetheless, most researchers used causal language in describing their hypotheses and results.

Power

In the section on moderation, we noted that tests of interactions often have low power. The same is true of tests of mediation. We previously reviewed factors that can decrease the power of tests of mediation (e.g., high correlation between the mediator and the predictor). MacKinnon et al. (2002) recently performed a simulation study in which they compared the power of different methods of testing mediation (see also Shrout & Bolger, 2002). The “causal steps” method described by Kenny (Baron & Kenny, 1986; Judd & Kenny, 1981; Kenny et al., 1998) was found to have adequate power only when sample sizes were large (greater than 500) or when the mediated effects were large. For example, the power to detect a medium effect with a sample size of 100 was only .28. The step requiring a significant effect of the predictor on the outcome (which we previously referred to as Step 1) led to the most Type II errors (i.e., lower power). Readers are encouraged to consult the MacKinnon et al. article for more information on alternative mediation tests. Hoyle and Kenny (1999) also performed a simulation study in which they examined the effects of several factors (e.g., reliability of the mediator) on the power of tests of mediation. Only samples of 200 had sufficient power (greater than .80). In designing studies, researchers need to estimate sample sizes a priori, using sources such as these, to ensure that the study has sufficient power. In the JCP studies reviewed, power was rarely mentioned, and most sample sizes were less than 200.

Analyzing the Data

Mediational analyses can be performed with either multiple regression or SEM. The logic of the analyses is the same in both cases. In general, SEM is considered the preferred method (Baron & Kenny, 1986; Hoyle & Smith, 1994; Judd & Kenny, 1981; Kenny et al., 1998).
Some of the advantages of SEM are that it can control for measurement error, provides information on the degree of fit of the entire model, and is much more flexible than regression. For example, you can include multiple predictor variables, multiple outcome variables, and multiple mediators¹⁵ in the model as well as other potential causes of the mediator and outcome, including longitudinal data (Baron & Kenny, 1986; Hoyle & Smith, 1994; Judd & Kenny, 1981; MacKinnon, 2000; Quintana & Maxwell, 1999; Wegener & Fabrigar, 2000). However, in research areas in which it may be difficult to recruit a sufficiently large sample to perform SEM analyses (e.g., at least 200; see Quintana & Maxwell, 1999), it may be necessary to use multiple regression (Holmbeck, 1997). Furthermore, according to MacKinnon (2000), regression is the most common method for testing mediation (see Hoyle & Kenny, 1999, for a simulation study comparing regression with SEM for testing mediation). Therefore, we first describe methods for testing mediation using regression and then describe methods using SEM.

As mentioned, the method outlined by Kenny (e.g., Baron & Kenny, 1986; Kenny et al., 1998) is the most commonly used approach in the psychological literature. Using multiple regression, this approach involves testing three equations. First, the outcome variable is regressed on the predictor to establish that there is an effect to mediate (see Path c in Figure 3A).
Second, the mediator is regressed on the predictor variable to establish Path a (see Figure 3B) in the mediational chain. In the third equation, the outcome variable is regressed on both the predictor and the mediator. This provides a test of whether the mediator is related to the outcome (Path b) as well as an estimate of the relation between the predictor and the outcome controlling for the mediator (Path c′). If the relation between the predictor and the outcome controlling for the mediator is zero, the data are consistent with a complete mediation model (i.e., the mediator completely accounts for the relation between the predictor and outcome). If the relation between the predictor and the outcome is significantly smaller when the mediator is in the equation (Path c′) than when the mediator is not in the equation (Path c), but still greater than zero, the data suggest partial mediation. However, it is not enough to show that the relation between the predictor and outcome is smaller or no longer is significant when the mediator is added to the model. Rather, one of several methods for testing the significance of the mediated effect should be used (see MacKinnon et al., 2002, for a comparison of several different methods and Shrout & Bolger, 2002, for an alternative bootstrapping procedure). MacKinnon et al.’s review of published studies indicated that the majority did not test the significance of the mediating variable effect. This also was true of the mediation studies published in JCP in 2001.

The method described by Kenny et al.
(1998) to test the significance of the mediated effect is as follows: Because the difference between the total effect of the predictor on the outcome (Path c in Figure 3A) and the direct effect of the predictor on the outcome (Path c′ in Figure 3B) is equal to the product of the paths from the predictor to the mediator (Path a) and from the mediator to the outcome (Path b), the significance of the difference between Paths c and c′ can be assessed by testing the significance of the product of Paths a and b. Specifically, the product of Paths a and b is divided by a standard error term. The mediated effect divided by its standard error yields a z score of the mediated effect. If the z score is greater than 1.96 in absolute value, the effect is significant at the .05 level. The error term used by Kenny and colleagues (Baron & Kenny, 1986; Kenny et al., 1998) is $\sqrt{b^2 s_a^2 + a^2 s_b^2 + s_a^2 s_b^2}$, where $a$ and $b$ are unstandardized regression coefficients and $s_a$ and $s_b$ are their standard errors. Note that this differs from the error term in Sobel’s (1982) test, which is the most commonly used standard error. Sobel’s test does not include the last term ($s_a^2 s_b^2$), which typically is small. These two methods performed very similarly in MacKinnon et al.’s (2002) simulation study.

Although we are focusing here on the use of multiple regression, as mentioned, there are various ways to test mediational models in SEM.
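
The two error terms discussed above can be compared with a short Python sketch. This is only an illustration under stated assumptions: the function name `mediated_effect_z` and the input values are hypothetical and do not come from any study reviewed here. Setting `include_last_term=False` drops the $s_a^2 s_b^2$ term, which gives the Sobel (1982) form.

```python
import math

def mediated_effect_z(a, s_a, b, s_b, include_last_term=True):
    """Return (z, se) for the mediated effect a*b.

    a, b     : unstandardized coefficients for Paths a and b
    s_a, s_b : their standard errors
    include_last_term : if True, use the three-term error described by
        Kenny and colleagues; if False, omit s_a^2 * s_b^2 (Sobel form).
    """
    variance = b**2 * s_a**2 + a**2 * s_b**2
    if include_last_term:
        variance += s_a**2 * s_b**2
    se = math.sqrt(variance)
    return (a * b) / se, se

# Hypothetical inputs chosen only for illustration.
z_three_term, se_three_term = mediated_effect_z(0.40, 0.10, 0.30, 0.08)
z_sobel, se_sobel = mediated_effect_z(0.40, 0.10, 0.30, 0.08,
                                      include_last_term=False)

print(f"three-term: se = {se_three_term:.4f}, z = {z_three_term:.2f}")
print(f"Sobel:      se = {se_sobel:.4f}, z = {z_sobel:.2f}")
```

Because the extra term is a product of two squared standard errors, it can only enlarge the error term slightly, so the three-term z score is never larger in magnitude than the Sobel z score for the same inputs.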
Holmbeck (1997) described a strategy for SEM that is virtually identical to that used with regression (i.e., testing the fit of the predictor–outcome model and the fit of the predictor–mediator–outcome model, as well as the predictor–mediator and mediator–outcome paths). These analyses provide tests of Steps 1 through 3 outlined previously. To test the significance of the mediated effect, the fit of the predictor–mediator–outcome model is compared with and without the direct path from the predictor to the outcome constrained to zero. A mediational model is supported if the model with the direct path between the predictor and outcome does not provide a better fit to the data (i.e., the direct path between the predictor and outcome is not significant). Hoyle and Smith (1994) described a somewhat simpler approach in which the predictor–outcome path is compared in models with and without the mediator. As in regression, if the predictor–outcome path is zero with the mediator in the model, there is evidence of complete mediation. The significance of the mediated effect also can be obtained in a single model by multiplying the coefficients for Paths a (predictor to mediator) and b (mediator to outcome). Tests of the significance of indirect effects are available in most SEM programs (see Brown, 1997, for a description of testing mediation models in LISREL). Few of the JCP studies reviewed that assessed mediated or indirect effects used the Kenny framework either in regression or SEM. Some compared models with and without direct paths from the predictor to the outcome, and some only reported the significance of the indirect effects.

¹⁵ Procedures for assessing multiple mediators in regression have been described by MacKinnon (2000). Cohen et al. (2003, pp. 460–467) also described methods for calculating total, direct, indirect, and spurious effects for multiple variables using multiple regression. In addition, several authors have provided detailed accounts of testing multiple mediator models in SEM (e.g., Brown, 1997; MacKinnon, 2000; MacKinnon et al., 2001).

Interpreting the Results

If a researcher has found support for all of the conditions for mediation mentioned earlier, what conclusions are appropriate? Next, we briefly describe three factors to consider when interpreting the results of mediation analyses.

Alternative Equivalent Models

One issue that must be acknowledged when interpreting the results of mediational analyses is that, even if the four conditions mentioned earlier are met, there are likely to be other models that are consistent with the data that also are correct (Kenny et al., 1998; Quintana & Maxwell, 1999). MacCallum, Wegener, Uchino, and Fabrigar (1993) provided a complete discussion of this issue, and we encourage readers to refer to their analysis.
Briefly, they showed that for any given model there generally are alternative models with different patterns of relations among variables that fit the data as well as the original model. According to their review of 53 published studies that used SEM, approximately 87% of the models had alternative equivalent models, with the average being 12. MacCallum et al. provided rules for calculating the number of alternative equivalent models, although some SEM programs (e.g., AMOS) have options that will generate all of the possible models from the observed variables and calculate the percentage of the possible models whose fit is better than, worse than, or comparable to the original model. Of course, some of these models can be rejected on the basis of the meaningfulness of the model, and MacCallum et al. reviewed design factors that affect the meaningfulness of alternative models. For example, paths to experimentally manipulated variables would not be meaningful (e.g., that counseling outcomes cause assignment to treatment condition), nor would paths that move backward in time in longitudinal studies. Thus, the number of alternative models is greater when the data are cross-sectional and correlational.

Even with experimental or longitudinal data, there are likely to be alternative models that fit the data, and researchers are encouraged to identify and test such models. However, most researchers in the studies reviewed by MacCallum et al.
(1993) and in the studies that we reviewed in JCP asserted the validity of their model on the basis of goodness-of-fit indexes without acknowledging or testing any alternatives. The existence of equivalent models “presents a serious challenge to the inferences typically made by researchers” (MacCallum et al., 1993, p. 196).

Omitted Variables

Another problem that may affect the interpretation of mediational analyses is omitted variables. Specifically, mediational analyses may yield biased estimates if variables that cause both the mediator and the outcome are not included in the model, because the association between the mediator and the outcome may be due to third variables that cause both (James & Brett, 1984; Kenny et al., 1998). Judd and Kenny (1981) provided a detailed example in which adding a variable that is related to both the mediator and the outcome substantially changes the results of the mediational analysis. Although this is a difficult problem to solve, there are ways to address it. For example, common causes of the mediator and the outcome, such as social desirability, can be included directly in the model. In addition, using different methods (e.g., self-reports and peer ratings) to measure the mediator and outcome will reduce the extent to which they are correlated because of common method variance.

Causation

We discussed causation under the section on study design, but this topic also is relevant to interpreting the results of mediation analyses.
All of the studies testing mediation that we reviewed in JCP were nonexperimental, as was true of most of the studies using SEM reviewed by MacKinnon et al. (2002). In the JCP studies, causal language often was used even though causal inferences generally cannot be made on the basis of nonexperimental data (Cohen et al., 2003; Hoyle & Smith, 1994; Kraemer et al., 2001). James and Brett (1984) recommended that researchers attend to all conditions necessary for establishing causation before conducting mediational tests and using these tests to support causal inferences. If one or more sources of specification error is viable (e.g., misspecification of causal direction or an unmeasured variable problem), exploratory procedures should be used and interpreted only in correlational terms (e.g., the correlation between the predictor and outcome is diminished if the mediator is controlled).¹⁶ However, in this case the mediator cannot be said to explain how the predictor and outcome are related, which essentially defeats the purpose of testing a mediational model. Others take a somewhat more liberal stance, arguing that, with correlational data, all that can be said is that the causal model is consistent with the data (Kraemer et al., 2001). In this case, it must be acknowledged that other models also are consistent with the data, as discussed previously.

Conclusions

There are many issues to consider when designing, conducting, and interpreting mediational analyses.
We acknowledge that it is not possible for every consideration mentioned to be addressed in every study. However, according to our review of mediational research published in JCP, there definitely is room for improvement. For example, virtually all of the mediational analyses in the studies we reviewed were performed with cross-sectional correlational data. Very few attempts were made to control for common causes of the mediator and the outcome. The direction of the relations among variables often was unclear. Nonetheless, authors typically discussed results using causal language. Authors sometimes acknowledged that no causal conclusions could be drawn, even though they used causal language. In most of the studies we reviewed, the authors concluded that their model fit the data without acknowledging or testing alternative models. However, the best evidence for mediation requires showing not only that the data are consistent with the proposed mediation model but also that other models are either theoretically implausible or inconsistent with the data (Smith, 2000). If a compelling argument cannot be made for the superiority of one model over another, additional research needs to be conducted to distinguish among the alternative models (MacCallum et al., 1993).

¹⁶ In this example, a mediator is very similar to a confounding variable, which is a variable that distorts the relation between two other variables (MacKinnon et al., 2000). For example, once the effect of age is controlled, income is no longer related to cancer prevalence. Unlike mediation, confounding does not imply causality.

Example: Testing Mediation Using Multiple Regression

To illustrate how mediator effects may be investigated with multiple regression, we again provide a step-by-step example. As mentioned, in this case we use actual data to illustrate issues that arise when using real, versus simulated, data.

Designing the Study

The data we are using to illustrate the process of conducting mediational analyses with multiple regression were collected by Patricia A. Frazier from 894 women who responded to a random-digit-dialing telephone survey regarding traumatic experiences and posttraumatic stress disorder (PTSD). Participants were asked whether they had experienced several traumatic events and, if so, to indicate which was their worst lifetime trauma.
These events had occurred an average of 10 years previously. One finding from this study was that individuals whose self-nominated worst event happened directly to them (e.g., sexual assault) reported more current symptoms of PTSD than those whose worst events did not happen directly to them (e.g., life-threatening illness of a close friend or family member). Although this may seem obvious, the Diagnostic and Statistical Manual of Mental Disorders (4th ed.; American Psychiatric Association, 1994) does not distinguish between directly and indirectly experienced events in the stressor criterion for PTSD. Because event type (directly vs. indirectly experienced) was associated with PTSD symptoms (i.e., significant predictor–outcome relationship), examining mediators of this relationship can help us to understand why directly experienced events are more likely to lead to PTSD than are indirectly experienced events.

The mediator we chose to examine was self-blame. We chose self-blame as a potential mediator because previous theory and research suggest that it is one of the strongest correlates of posttraumatic distress (e.g., Weaver & Clum, 1995). It also seemed that individuals would be more likely to blame themselves for events that happened directly to them (e.g., an accident or a sexual assault) than for events that happened to others (e.g., life-threatening illness of a close friend). Other possible mediators were rejected because, although they might be associated with higher levels of PTSD symptoms, they were unlikely to have been caused by the experience of a “direct” trauma.
For example, individuals who have experienced more lifetime traumas report more symptoms of PTSD, but it seemed unlikely that experiencing a direct trauma would cause one to experience more lifetime traumas. In addition, unlike past traumas, self-blame is a factor that can be changed. Thus, the mediational hypothesis we tested was that individuals will blame themselves more for directly experienced events (Path a), and individuals who engage in more self-blame will report more PTSD symptoms (Path b). Finally, we hypothesized that once the relation between self-blame and PTSD symptoms was accounted for, there would be a weaker relation between event type (directly vs. indirectly experienced events) and PTSD (i.e., Path c′ will be smaller than Path c). Thus, self-blame was hypothesized to be a partial (vs. complete) mediator. Given our sample size (N = 894), we had sufficient power to detect medium to large mediated effects (MacKinnon et al., 2002).

Analyzing the Data

Table 2 contains the analyses necessary to examine this mediational hypothesis. Following the steps outlined earlier for testing mediation, we first established that event type (the predictor) was related to PTSD symptoms (the outcome) by regressing PTSD symptoms on the event-type variable (Step 1).
The unstandardized regression coefficient (B = 1.32) associated with the effect of event type on number of PTSD symptoms was significant (p < .0001). Thus, Path c was significant, and the requirement for mediation in Step 1 was met. To establish that event type was related to self-blame (the hypothesized mediator), we regressed self-blame on the event type variable (Step 2). The unstandardized regression coefficient (B = 0.50) associated with this relation also was significant at the p < .0001 level, and thus the condition for Step 2 was met (Path a was significant). To test whether self-blame was related to PTSD symptoms, we regressed PTSD symptoms simultaneously on both self-blame and the event type variable (Step 3). The coefficient associated with the relation between self-blame and PTSD (controlling for event type) also was significant (B = 0.95, p < .0001). Thus, the condition for Step 3 was met (Path b was significant). This third regression equation also provided an estimate of Path c′, the relation between event type and PTSD, controlling for self-blame. When that path is zero, there is complete mediation. However, Path c′ was 0.86 and still significant (p < .001), although it was smaller than Path c (which was 1.32).

Table 2
Testing Mediator Effects Using Multiple Regression

Testing steps in mediation model                      B      SE B    95% CI        β
Testing Step 1 (Path c)
  Outcome: current PTSD symptoms
  Predictor: event type (direct vs. indirect)ᵃ        1.32   0.22    0.90, 1.75    .21**
Testing Step 2 (Path a)
  Outcome: self-blame
  Predictor: event type                               0.50   0.04    0.41, 0.58    .38**
Testing Step 3 (Paths b and c′)
  Outcome: current PTSD symptoms
  Mediator: self-blame (Path b)                       0.95   0.18    0.60, 1.29    .19**
  Predictor: event type                               0.86   0.23    0.41, 1.31    .13**

Note. CI = confidence interval; PTSD = posttraumatic stress disorder.
ᵃ 0 = indirectly experienced trauma, 1 = directly experienced trauma.
** p < .001.

There are several ways to assess whether this drop from 1.32 to 0.86 (i.e., from c to c′) is significant. Because c − c′ is equal to the product of Paths a and b, the significance of the difference between c and c′ can be estimated by testing the significance of the product of Paths a and b. Specifically, you divide the product of Paths a and b by a standard error term.
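
The three regression steps and the identity between c − c′ and the product of Paths a and b can be sketched in Python. This is a minimal illustration on simulated data, not a reanalysis of the survey: the helper `ols`, the sample size, and the generating coefficients are all assumptions made for the example, and NumPy's least-squares routine stands in for a statistical package.

```python
import numpy as np

def ols(y, *predictors):
    """Return OLS coefficients [intercept, b1, b2, ...] for y on the predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulated data (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
n = 500
x = rng.integers(0, 2, size=n).astype(float)  # dichotomous predictor (0/1)
m = 0.5 * x + rng.normal(size=n)              # mediator depends on predictor
y = 0.9 * m + 0.8 * x + rng.normal(size=n)    # outcome depends on both

c = ols(y, x)[1]              # Step 1: outcome on predictor (Path c)
a = ols(m, x)[1]              # Step 2: mediator on predictor (Path a)
_, c_prime, b = ols(y, x, m)  # Step 3: outcome on predictor and mediator

print(f"c = {c:.3f}, a = {a:.3f}, b = {b:.3f}, c' = {c_prime:.3f}")
print(f"c - c' = {c - c_prime:.6f}, a*b = {a * b:.6f}")
```

When all three equations are estimated by ordinary least squares on the same cases, c − c′ and ab agree up to floating-point error. In published tables the two can differ slightly because the reported coefficients are rounded (in Table 2, 1.32 − 0.86 = 0.46 versus 0.50 × 0.95 = 0.475).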
Although there are several different ways to calculate this standard error term, we used the error term used by Kenny and colleagues (Baron & Kenny, 1986; Kenny et al., 1998) described earlier: $\sqrt{b^2 s_a^2 + a^2 s_b^2 + s_a^2 s_b^2}$, where $a$ and $b$ are unstandardized regression coefficients and $s_a$ and $s_b$ are their standard errors. We used this term because it is likely to be more familiar to readers through Kenny’s writings. To reiterate, the mediated effect divided by its standard error yields a z score of the mediated effect. If the z score is greater than 1.96 in absolute value, the effect is significant at the .05 level. In our case, we multiplied the unstandardized regression coefficients for Path a (0.50) and Path b (0.95) and divided by the square root of (0.90)(0.002) + (0.25)(0.03) + (0.002)(0.03), which yielded 0.475/0.097 = 4.90. Thus, self-blame was a significant mediator even though the c′ path was significant.¹⁷ Shrout and Bolger (2002) also recommended calculating the confidence interval around the estimate of the indirect effect. The formula for calculating a 95% confidence interval is $ab \pm s_{ab}\, z_{.975}$, where $ab$ is the product of Paths a and b, $z_{.975}$ is equal to the constant 1.96, and $s_{ab}$ is the standard error term calculated earlier. For our example, the 95% confidence interval would be 0.475 ± 0.097(1.96) = 0.29 to 0.67.
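
The arithmetic above can be checked with a few lines of Python. This sketch takes the coefficients and standard errors from Table 2 and carries them at full precision instead of rounding the squared terms first, so the results differ slightly from the hand calculation (a z score of roughly 4.85 rather than 4.90, and a lower confidence limit near 0.28 rather than 0.29). The variable names are illustrative only.

```python
import math

# Values reported in Table 2 (unstandardized coefficients and standard errors).
a, s_a = 0.50, 0.04   # Path a: event type -> self-blame
b, s_b = 0.95, 0.18   # Path b: self-blame -> PTSD symptoms, controlling for event type

ab = a * b  # estimate of the mediated (indirect) effect

# Three-term error described by Kenny and colleagues.
se_ab = math.sqrt(b**2 * s_a**2 + a**2 * s_b**2 + s_a**2 * s_b**2)

z = ab / se_ab
ci_low = ab - 1.96 * se_ab
ci_high = ab + 1.96 * se_ab

print(f"ab = {ab:.3f}, se = {se_ab:.3f}, z = {z:.2f}")
print(f"95% CI: {ci_low:.2f} to {ci_high:.2f}")
```

The conclusion is the same either way: the z score is well above 1.96 and the interval excludes zero.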
This confidence interval does not include zero, which is consistent with the conclusion that there is mediation (i.e., the indirect effect is not zero).

Another way to describe the amount of mediation is in terms of the proportion of the total effect that is mediated, which is defined by ab/c (Shrout & Bolger, 2002). Using the unstandardized regression coefficients from our example, we get 0.475/1.32 = .36. Thus, about 36% of the total effect of event type on PTSD symptoms is mediated by self-blame. However, a sample size of at least 500 is needed for accurate point and variance estimates of the proportion of total effect mediated (MacKinnon, Warsi, & Dwyer, 1995). Also, it is important to note that this is just a way of describing the amount of mediation rather than a test of the significance of the mediated effect.

Interpreting the Results

What can we conclude from this test of our hypothesis that self-blame partially mediates the relation between type of trauma experienced and PTSD symptoms (i.e., that directly experienced events result in more PTSD because they lead to more self-blame, which in turn leads to more PTSD symptoms)? In terms of causation, a strong argument can be made that the traumatic event (the predictor) preceded both self-blame (the mediator) and PTSD (the outcome). However, it could be the case that individuals who are suffering from more PTSD symptoms are more likely to blame themselves (i.e., that the outcome causes the mediator). In fact, when we tested this alternative model, PTSD also was a significant mediator of the relation between event type and self-blame. Thus, there are alternative models that are consistent with the data.
We also did not control for other factors that may be related to or cause both self-blame and PTSD, such as the personality trait neuroticism. Thus, all we can say at this point is that our data are consistent with models in which self-blame causes PTSD and PTSD causes self-blame. We also must acknowledge that the mediational relations we found might not have been evident if other variables that cause both self-blame and PTSD had been included in the model.

How does this example compare with the design considerations mentioned before? First, we began by establishing that there was a significant predictor–outcome relation. We next established a theoretical rationale for why the predictor variable would be related to the mediator self-blame and chose a mediator that potentially is alterable. Ideally, these decisions regarding potential mediator variables are made before data collection. The Path a relationship between the predictor and the mediator (β = .38) was not so high that multicollinearity would be a problem. However, the Path a relation between the predictor and mediator (β = .38) was larger than the Path b relation between the mediator and the outcome (β = .19); our power would have been greater if Path b were equal to or larger than Path a. In addition, our measure of self-blame was not without measurement error, which reduces the power of the test of the mediated effect.
With regard to the fourth step, even though the relation between the predictor and outcome remained significant after the mediator had been controlled, a test of the mediated effect revealed that there was significant mediation. Nonetheless, given measurement error in the mediator and the fact that Path a was larger than Path b, both of which reduce the power of the test of mediation, we may have underestimated the extent to which self-blame mediated the relation between event type and PTSD.

After finding data consistent with a mediational model, what comes next? First, given that there are alternative models that may fit the data equally well, it is important to conduct additional studies that can help rule out these alternative models. Experimentally manipulating the predictor or the outcome would help to rule out alternative models in which the mediator causes the predictor or the outcome causes the mediator. However, experimental manipulation would not be ethical in cases like our example. As described earlier, alternative models also can be tested in nonexperimental studies that incorporate design features that might rule out alternative models (e.g., by collecting longitudinal data, including other mediators, measuring common causes of the mediator and outcome, and using multiple measurement strategies). If plausible alternative models are rejected, a mediator might suggest areas of intervention (Baron & Kenny, 1986). For example, if we find stronger evidence that self-blame causes PTSD (rather than the other way around), we might want to develop an intervention to decrease self-blame and thereby reduce PTSD symptoms.

¹⁷ An online calculator for this test is available that provides tests of mediation using three different error terms (Preacher & Leonardelli, 2003). In our example, the mediated effect is significant regardless of which error term is used.

CONCLUSION

In summary, we have offered counseling researchers step-by-step guides to testing mediator and moderator effects that can be used both in planning their own research and in evaluating published research and have provided an example of each type of analysis. We hope that these guides will increase the extent to which counseling researchers move beyond the testing of direct effects and include mediators and moderators in their analyses and improve the quality of those tests when they are conducted.
Toward these ends, the Appendix contains checklists to use in designing and evaluating tests of moderator and mediator effects.

ADDITIONAL RESOURCES

Although we have presented additional resources throughout the article, we want to highlight a few of these resources. As mentioned previously, it is important for researchers to consult these primary sources (and new sources as they emerge) to gain a better understanding of the underlying statistical and conceptual issues in testing moderation and mediation. Helpful sources that cover both mediation and moderation are the classic article by Baron and Kenny (1986) and the new regression textbook by Cohen et al. (2003). Particularly helpful sources for further information on moderation include Aiken and West (1991), Jaccard et al. (1990), and West et al. (1996). Useful sources for further information on mediation include Kenny et al. (1998), Shrout and Bolger (2002), and MacKinnon et al. (2002). Finally, MacKinnon (2003) and Kenny (2003) both have Web sites to which one can submit questions regarding mediation. A user-friendly guide to testing interactions using multiple regression also can be found on the World Wide Web (Preacher & Rucker, 2003).

References

Aguinis, H. (1995). Statistical power problems with moderated multiple regression in management research. Journal of Management Research, 21, 1141–1158.
Aguinis, H., Boik, R. J., & Pierce, C. A. (2001). A generalized solution for approximating the power to detect effects of categorical moderator variables using multiple regression. Organizational Research Methods, 4, 291–323.
Aguinis, H., Bommer, W. H., & Pierce, C. A. (1996). Improving the estimation of moderating effects by using computer-administered questionnaires. Educational and Psychological Measurement, 56, 1043–1047.
Aguinis, H., Petersen, S. A., & Pierce, C. A. (1999). Appraisal of the homogeneity of error variance assumption and alternatives to multiple regression for estimating moderating effects of categorical variables. Organizational Research Methods, 2, 315–339.
Aguinis, H., & Pierce, C. A. (1998). Statistical power computations for detecting dichotomous moderator variables with moderated multiple regression. Educational and Psychological Measurement, 58, 668–676.
Aguinis, H., & Stone-Romero, E. F. (1997). Methodological artifacts in moderated multiple regression and their effects on statistical power. Journal of Applied Psychology, 82, 192–206.
Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.
Alexander, R. A., & DeShon, R. P. (1994). Effect of error variance heterogeneity on the power of tests for regression slope differences. Psychological Bulletin, 115, 308–314.
American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.
Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182.
Bissonnette, V., Ickes, W., Bernstein, I., & Knowles, E. (1990). Personality moderating variables: A warning about statistical artifact and a comparison of analytic techniques. Journal of Personality, 58, 567–587.
Bollen, K. A., & Paxton, P. (1998). Interactions of latent variables in structural equation models. Structural Equation Modeling, 5, 267–293.
Brown, R. L. (1997). Assessing specific mediational effects in complex theoretical models. Structural Equation Modeling, 4, 142–156.
Busemeyer, J., & Jones, L. R. (1983). Analysis of multiplicative causal rules when the causal variables are measured with error. Psychological Bulletin, 93, 549–562.
Chaplin, W. F. (1991). The next generation in moderation research in personality psychology. Journal of Personality, 59, 143–178.
Cohen, J. (1983). The cost of dichotomization. Applied Psychological Measurement, 7, 249–253.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155–159.
Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Mahwah, NJ: Erlbaum.
Collins, L. M., Graham, J. W., & Flaherty, B. P. (1998). An alternative framework for defining mediation. Multivariate Behavioral Research, 33, 295–312.
Corning, A. F. (2002). Self-esteem as a moderator between perceived discrimination and psychological distress among women. Journal of Counseling Psychology, 49, 117–126.
Cronbach, L. J. (1987). Statistical tests for moderator variables: Flaws in analyses recently proposed. Psychological Bulletin, 102, 414–417.
Cross, S. E., & Madson, L. (1997). Models of the self: Self-construals and gender. Psychological Bulletin, 122, 5–37.
DeShon, R. P., & Alexander, R. A. (1996). Alternative procedures for testing regression slope homogeneity when group error variances are unequal. Psychological Methods, 1, 261–277.
Dunlap, W. P., & Kemery, E. R. (1987). Failure to detect moderating effects: Is multicollinearity the problem? Psychological Bulletin, 102, 418–420.
Farrell, A. D. (1994). Structural equation modeling with longitudinal data: Strategies for examining group differences and reciprocal relationships. Journal of Consulting and Clinical Psychology, 62, 477–487.
Frazier, P., Tix, A., & Barnett, C. L. (2003). The relational context of social support. Personality and Social Psychology Bulletin, 29, 1113–1146.
Friedrich, R. J. (1982). In defense of multiplicative terms in multiple regression equations. American Journal of Political Science, 26, 797–833.
Grissom, R. (2000). Heterogeneity of variance in clinical data. Journal of Consulting and Clinical Psychology, 68, 155–165.
Holmbeck, G. N. (1997). Toward terminological, conceptual, and statistical clarity in the study of mediators and moderators: Examples from the child-clinical and pediatric psychology literatures. Journal of Consulting and Clinical Psychology, 65, 599–610.
Hoyle, R. H., & Kenny, D. A. (1999). Sample size, reliability, and tests of statistical mediation. In R. Hoyle (Ed.), Statistical strategies for small sample research (pp. 195–222). Thousand Oaks, CA: Sage.
Hoyle, R. H., & Robinson, J. I. (2003). Mediated and moderated effects in social psychological research: Measurement, design, and analysis issues. In C. Sansone, C. Morf, & A. T. Panter (Eds.), Handbook of methods in social psychology. Thousand Oaks, CA: Sage.
Hoyle, R. H., & Smith, G. T. (1994). Formulating clinical research hypotheses as structural models: A conceptual overview. Journal of Consulting and Clinical Psychology, 62, 429–440.
Jaccard, J., Turrisi, R., & Wan, C. K. (1990). Interaction effects in multiple regression. Newbury Park, CA: Sage.
Jaccard, J., & Wan, C. K. (1995). Measurement error in the analysis of interaction effects between continuous predictors using multiple regression: Multiple indicator and structural equation approaches. Psychological Bulletin, 117, 348–357.
Jaccard, J., & Wan, C. K. (1996). LISREL approaches to interaction effects in multiple regression. Thousand Oaks, CA: Sage.
James, L. R., & Brett, J. M. (1984). Mediators, moderators, and tests for mediation. Journal of Applied Psychology, 69, 307–321.
Judd, C. M., & Kenny, D. A. (1981). Process analysis: Estimating mediation in treatment evaluations. Evaluation Review, 5, 602–619.
Judd, C. M., McClelland, G. H., & Culhane, S. E. (1995). Data analysis: Continuing issues in the everyday analysis of psychological data. Annual Review of Psychology, 46, 433–465.
Kenny, D. (2003). Mediation. Retrieved November 10, 2003, from http://users.rcn.com/dakenny/mediate.htm
Kenny, D. A., & Judd, C. M. (1984). Estimating the linear and interactive effects of latent variables. Psychological Bulletin, 105, 361–373.
Kenny, D. A., Kashy, D. A., & Bolger, N. (1998). Data analysis in social psychology. In D. T. Gilbert, S. T. Fiske, & G. Lindzey (Eds.), The handbook of social psychology (4th ed., pp. 233–265). New York: Oxford University Press.
Kraemer, H. C., Stice, E., Kazdin, A., Offord, D., & Kupfer, D. (2001). How do risk factors work together? Mediators, moderators, and independent, overlapping, and proxy risk factors. American Journal of Psychiatry, 158, 848–856.
Lakey, B., & Drew, J. B. (1997). A social-cognitive perspective on social support. In G. R. Pierce, B. Lakey, I. G. Sarason, & B. R. Sarason (Eds.), Sourcebook of social support and personality (pp. 107–140). New York: Plenum Press.
Lee, R. M., Draper, M., & Lee, S. (2001). Social connectedness, dysfunctional interpersonal behaviors, and psychological distress: Testing a mediator model. Journal of Counseling Psychology, 48, 310–318.
Lubinski, D., & Humphreys, L. G. (1990). Assessing spurious “moderator effects”: Illustrated substantively with the hypothesized (“synergistic”) relation between spatial and mathematical ability. Psychological Bulletin, 107, 385–393.
MacCallum, R. C., & Mar, C. M. (1995). Distinguishing between moderator and quadratic effects in multiple regression. Psychological Bulletin, 118, 405–421.
MacCallum, R. C., Wegener, D. T., Uchino, B. N., & Fabrigar, L. R. (1993). The problem of equivalent models in applications of covariance structure analysis. Psychological Bulletin, 114, 185–199.
MacCallum, R. C., Zhang, S., Preacher, K. J., & Rucker, D. D. (2002). On the practice of dichotomization of quantitative variables. Psychological Methods, 7, 19–40.
MacKinnon, D. P. (1994). Analysis of mediating variables in prevention and intervention research. In A. Cazares & L. A. Beatty (Eds.), Scientific methods for prevention intervention research (NIDA Research Monograph 139, DHHS Publication No. 94-3631, pp. 127–153). Washington, DC: U.S. Government Printing Office.
MacKinnon, D. P. (2000). Contrasts in multiple mediator models. In J. S. Rose, L. Chassin, C. C. Presson, & S. J. Sherman (Eds.), Multivariate applications in substance use research: New methods for new questions (pp. 141–160). Mahwah, NJ: Erlbaum.
MacKinnon, D. (2003). Mediation. Retrieved November 10, 2003, from http://www.public.asu.edu/davidpm/ripl/mediate.htm
MacKinnon, D. P., & Dwyer, J. H. (1993). Estimating mediated effects in prevention studies. Evaluation Review, 17, 144–158.
MacKinnon, D. P., Goldberg, L., Clarke, G. N., Elliot, D. L., Cheong, J., Lapin, A., et al. (2001). Mediating mechanisms in a program to reduce intentions to use anabolic steroids and improve exercise self-efficacy and dietary behavior. Prevention Science, 2, 15–27.
MacKinnon, D. P., Krull, J. L., & Lockwood, C. (2000). Mediation, confounding, and suppression: Different names for the same effect. Prevention Science, 1, 173–181.
MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G., & Sheets, V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychological Methods, 7, 83–104.
MacKinnon, D. P., Warsi, G., & Dwyer, J. H. (1995). A simulation study of mediated effect measures. Multivariate Behavioral Research, 30, 41–62.
Marsh, H. W. (2002, April). Structural equation models of latent interactions: Evaluation of alternative strategies. Paper presented at the meeting of the American Educational Research Association, New Orleans, LA.
Mason, C. A., Tu, S., & Cauce, A. M. (1996). Assessing moderator variables: Two computer simulation studies. Educational and Psychological Measurement, 56, 45–62.
Maxwell, S. E., & Delaney, H. D. (1993). Bivariate median splits and spurious statistical significance. Psychological Bulletin, 113, 181–190.
McClelland, G. H., & Judd, C. M. (1993). Statistical difficulties of detecting interactions and moderator effects. Psychological Bulletin, 114, 376–390.
Menard, S. (1991). Longitudinal research: Quantitative applications in the social sciences. Newbury Park, CA: Sage.
Moulder, B. C., & Algina, J. (2002). Comparison of methods for estimating and testing latent variable interactions. Structural Equation Modeling, 9, 1–19.
Norcross, J. (2001). Purposes, processes, and products of the task force on empirically supported therapy relationships. Psychotherapy, 38, 345–356.
Overton, R. C. (2001). Moderated multiple regression for interactions involving categorical variables: A statistical control for heterogeneous variance across two groups. Psychological Methods, 6, 218–233.
Ping, R. A., Jr. (1996). Latent variable interaction and quadratic effect estimation: A two-step technique using structural equation analysis. Psychological Bulletin, 119, 166–175.
Preacher, K., & Rucker, D. (2003). A primer on interaction effects in multiple linear regression. Retrieved November 10, 2003, from http://www.unc.edu/preacher/lcamlm/interactions.htm
Preacher, K., & Leonardelli, G. (2003). Calculation for the Sobel test: An interactive calculation tool for mediation tests. Retrieved November 10, 2003, from http://www.unc.edu/preacher/sobel/sobel.htm
Quintana, S. M., & Maxwell, S. E. (1999). Implications of recent developments in structural equation modeling for counseling psychology. The Counseling Psychologist, 27, 485–527.
Russell, C. J., & Bobko, P. (1992). Moderated regression analysis and Likert scales: Too coarse for comfort. Journal of Applied Psychology, 77, 336–342.
Schumacker, R., & Marcoulides, G. (Eds.). (1998). Interaction and nonlinear effects in structural equation modeling. Mahwah, NJ: Erlbaum.
Shrout, P. E., & Bolger, N. (2002). Mediation in experimental and nonexperimental studies: New procedures and recommendations. Psychological Methods, 7, 422–445.
Smith, E. R. (2000). Research design. In H. T. Reis & C. M. Judd (Eds.), Handbook of research methods in social and personality psychology (pp. 17–39). New York: Cambridge University Press.
Sobel, M. E. (1982). Asymptotic confidence intervals for indirect effects in structural equation models. In S. Leinhardt (Ed.), Sociological methodology 1982 (pp. 290–312). Washington, DC: American Sociological Association.
Stone-Romero, E. F., Alliger, G., & Aguinis, H. (1994). Type II error problems in the use of moderated multiple regression for the detection of moderating effects of dichotomous variables. Journal of Management, 20, 167–178.
Stone-Romero, E. F., & Anderson, L. E. (1994). Techniques for detecting moderating effects: Relative statistical power of multiple regression and the comparison of subgroup-based correlation coefficients. Journal of Applied Psychology, 79, 354–359.
Weaver, T., & Clum, G. (1995). Psychological distress associated with interpersonal violence: A meta-analysis. Clinical Psychology Review, 15, 115–140.
Wegener, D., & Fabrigar, L. (2000). Analysis and design for nonexperimental data addressing causal and noncausal hypotheses. In H. T. Reis & C. M. Judd (Eds.), Handbook of research methods in social and personality psychology (pp. 412–450). New York: Cambridge University Press.
West, S. G., Aiken, L. S., & Krull, J. L. (1996). Experimental personality designs: Analyzing categorical by continuous variable interactions. Journal of Personality, 64, 1–49.

Appendix

Checklists for Evaluating Moderation and Mediation Analyses

Checklist for Evaluating Moderator Analyses Using Multiple Regression

Was a strong theoretical rationale for the interaction provided? Was the specific form of the interaction specified?
Was power calculated a priori? If power was low, was this mentioned as a limitation?
Was the size of the interaction considered a priori to determine needed sample size?
Was the overall effect size considered a priori to determine needed sample size?
If there was a categorical predictor or moderator, were the sample sizes for each group relatively equal?
Was the assumption of homogeneous error variance checked if categorical variables were used? If the assumption was violated, were alternative tests used?
Were continuous variables sufficiently reliable (e.g., above .80)?
Were continuous variables normally distributed (i.e., no range restriction)?
Was the outcome variable sensitive enough to capture the interaction?
Were regression procedures used whether variables were categorical or continuous?
Were continuous variables kept continuous?
If there were categorical variables, was the coding scheme used appropriate for the research questions?
Were continuous variables centered or standardized?
Was the interaction term created correctly (i.e., by multiplying the predictor and moderator variables)?
Was the equation structured correctly (i.e., predictor and moderator entered before the interaction term)?
Were first-order effects interpreted correctly given the coding system used?
Were unstandardized (rather than standardized) coefficients interpreted?
Was the significance of the interaction term assessed appropriately (i.e., by examining the change in R² associated with the interaction term)?
Was the interaction plotted if significant?
Were the simple slopes compared?
If covariates were used, were interactions with other terms in the model assessed?
If multiple moderators were tested, was Type I error addressed?
Did the interpretation of the results reflect the actual effect size of the interaction?

Checklist for Evaluating Mediation Analyses Using Multiple Regression

Was the predictor significantly related to the outcome? If not, was there a convincing rationale for examining mediation?
Was there a theoretical rationale for the hypothesis that the predictor causes the mediator? Was the mediator something that can be changed?
What is the “effective sample size” given the correlation between the predictor and mediator? That is, was the relation between the predictor and mediator so high as to compromise power?
Was the relation between the mediator and the outcome (Path b) greater than or equal to the relation between the predictor and the mediator (Path a)?
Were the mediators adequately reliable (e.g., .90)?
Was unreliability in the mediators (e.g., .70) addressed through tests that estimate the effects of unreliability or the use of SEM?
To what extent did the design of the study enable causal inferences?
Was power mentioned either as an a priori consideration or as a limitation?
Were all four steps in establishing mediation addressed in the statistical analyses?
Was the significance of the mediation effect formally tested?
Were alternative equivalent models acknowledged or tested?
Were variables that seem likely to cause both the mediator and the outcome included in analyses, or were multiple measurement methods used?
Did the study design allow for the type of causal language used in the interpretation of results?

Received March 18, 2003
Revision received July 1, 2003
Accepted July 3, 2003
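
As an illustration of the mediation checklist item on formally testing the significance of the mediated effect, the following minimal Python sketch computes a z test of the product of Paths a and b under three error terms. We assume here that the three error terms are the Sobel, Aroian, and Goodman variants; the function name, argument names, and coefficient values are hypothetical and are not taken from the example data in this article. The inputs are unstandardized regression coefficients and their standard errors.

```python
import math


def mediated_effect_tests(a, se_a, b, se_b):
    """z tests of the mediated effect a*b under three error terms.

    a, se_a: unstandardized Path a coefficient (predictor -> mediator) and its SE.
    b, se_b: unstandardized Path b coefficient (mediator -> outcome, with the
             predictor controlled) and its SE.
    Returns a dict mapping the test name to (z, two-tailed p), or to None
    when the variance estimate is not positive (possible for Goodman).
    """
    ab = a * b
    first_order = b**2 * se_a**2 + a**2 * se_b**2  # Sobel variance
    second_order = se_a**2 * se_b**2               # extra term in the variants
    variances = {
        "sobel": first_order,
        "aroian": first_order + second_order,
        "goodman": first_order - second_order,
    }
    results = {}
    for name, var in variances.items():
        if var <= 0:
            results[name] = None
            continue
        z = ab / math.sqrt(var)
        # Two-tailed p value from the standard normal distribution.
        p = math.erfc(abs(z) / math.sqrt(2))
        results[name] = (z, p)
    return results


if __name__ == "__main__":
    # Hypothetical coefficients, for illustration only.
    for name, res in mediated_effect_tests(0.5, 0.1, 0.4, 0.1).items():
        print(name, res)
```

Because the three versions differ only in the second-order term, they tend to agree when both standard errors are small relative to the coefficients; a researcher might report all three when they lead to different conclusions.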
