EUROPEAN JOURNAL OF WORK AND ORGANIZATIONAL PSYCHOLOGY
2005, 14 (1), 23–41

Evaluating organizational stress-management interventions using adapted study designs

Raymond Randall
Department of Psychology, City University, London, and Institute of Work, Health and Organizations, University of Nottingham, UK

Amanda Griffiths and Tom Cox
Institute of Work, Health and Organizations, University of Nottingham, UK
The evaluation of organizational stress management interventions has proved challenging for researchers and practitioners alike. Traditionally, researcher-designed quasi-experiments have been regarded as the method for evaluating such interventions. However, relatively few such studies have been satisfactorily completed in organizations, and many of those that have did not adequately take account of intervention processes. This article presents an approach to evaluation that can help to overcome these problems. Two empirical studies are presented that demonstrate that measurement of the intervention process can be used to adapt and shape the design of the evaluation. In both studies, process evaluation incorporating the measurement of intervention exposure was used to partition participant samples (into intervention and control groups). This approach has the potential to enable and strengthen quantitative outcome evaluation in situations where controlled quasi-experimentation is not possible.
Correspondence should be addressed to Dr Raymond Randall, Department of Psychology, City University, Northampton Square, London, EC1V 0HB, UK. Email: r.randall@city.ac.uk

The authors would like to thank the UK Health and Safety Executive, The Royal College of Nursing, and UNISON for their support for the work presented in this article. The views and opinions expressed are those of the authors and not those of any other individual or organization.

© 2005 Psychology Press Ltd
http://www.tandf.co.uk/journals/pp/1359432X.html DOI: 10.1080/13594320444000209

Organizational-level stress management interventions are designed to deal with the sources of the problem by changing the design, management, and organization of work (Cox, Griffiths, & Rial-Gonzalez, 2000b; Semmer, 2003). Organizational-level stress prevention is either implicitly or explicitly endorsed by a number of European Governments (Griffiths, 2003; Griffiths, Cox, & Barlow, 1996; Health and Safety Commission, 1999; Kompier, de Gier, Smulders, & Draaisma, 1994) and by the European Commission (1989, Article 6:2). Because it targets the causes of work stress, such a "risk assessment – risk reduction" strategy (Cox, Griffiths, Barlow, Randall, Thomson, & Rial-Gonzalez, 2000a) should be the most effective in the
long-term (Cooper, Liukkonen, & Cartwright, 1996; Cox, 1993; Cox, Griffiths, & Randall, 2002a, 2002b; Ivancevich, Matteson, Freedman, & Phillips, 1990; Murphy, 1996; van der Hek & Plomp, 1997). However, two problems mean that this argument is, at present, difficult to justify. First, for some time there has been a dearth of adequate evaluation studies of the effectiveness of such interventions (Briner & Reynolds, 1999; Cox, 1993; Parkes & Sparkes, 1998; Reynolds, 1997; Semmer, 2003). Second, many evaluations are limited by the undermeasurement of intervention processes (Cox et al., 2000b; Griffiths, 1999; Kompier & Kristensen, 2000; Murphy, 1996; Parkes & Sparkes, 1998; Semmer, 2003). The aim of this article is to describe a modified approach to evaluation that helps to address these two important, interrelated problems.
TRADITIONAL EVALUATION STRATEGIES

Traditionally, quasi-experiments have been used to evaluate interventions for work-related stress because the constraints of the organizational setting and the nature of the interventions do not support the conditions required for a "true" experiment (Campbell, 1957; Cook & Campbell, 1979; Parkes & Sparkes, 1998). However, much of the existing stress management evaluation research literature has implied that two features of the "true" experiment have to be retained in order to provide a robust enough evaluation (e.g., Briner & Reynolds, 1999; Murphy, 1996; Parkes & Sparkes, 1998). The first point is that fixed (stable) study designs (based around the controlled or predictable manipulation of exposure) should be used; the second is that outcome evaluation should be paramount, since it has often been (erroneously) assumed that strong quasi-experimental designs make process evaluation redundant (Cook & Shadish, 1994).
However, the complexity and instability of organizations tends to make it difficult (or even impracticable) to establish, and then adequately control, the delivery of interventions to achieve even the simplest of quasi-experimental study designs (Griffiths, 1999; Kompier & Kristensen, 2000; Mikkelsen, Saksvik, & Landsbergis, 2000). Moreover, the process of implementing interventions can modify intended exposure patterns (by stopping interventions reaching their intended participants and vice versa) and cannot be ignored in outcome evaluation (Griffiths, 1999).
This situation presents serious problems for summative evaluation strategies (i.e., those that focus on the outcome evaluation, often at the expense of process evaluation) that rely upon fixed or predictable exposure patterns (Colarelli, 1998; Griffiths, 1999; Hartley, 2002; Heaney, Israel, Schurman, Baker, House, & Hugentobler, 1993). In unpredictable or uncontrolled settings such an approach (1) raises the risk of Type III error (erroneously concluding an intervention is ineffective when it is actually its implementation that is faulty; Dobson & Cook, 1980) and (2) limits explanatory yield (e.g., inconsistent intervention effects remain difficult to explain; see Cox et al., 2000b; Parkes & Sparkes, 1998). Moreover, controlled or predictable intervention exposure patterns occur so infrequently that there is a need for alternative ways of managing quantitative evaluation¹ that are viable in the face of unpredictable and uncontrollable exposure patterns (Colarelli, 1998; Kompier, Aust, van den Berg, & Siegrist, 2000a). In summary, the identification of causal relationships may be hindered unless study designs are adapted to reflect true, but uncontrollable and unpredictable, patterns of intervention exposure.
ADAPTED STUDY DESIGNS AS AN EVALUATION STRATEGY

Applied social scientists (such as those evaluating public health promotion or large-scale community education programmes) have achieved good results by being flexible in their application of the principles of study design (Fitzgerald & Rasheed, 1998; Harachi, Abbot, Catalano, Haggerty, & Fleming, 1999; Lipsey & Cordray, 2000). When, for example, working in community settings they have adapted study designs, through the use of process evaluation, to reflect actual intervention exposure patterns (Lipsey, 1996; Lipsey & Cordray, 2000). On-going or post hoc measures of intervention exposure (i.e., process evaluation: see Kompier & Kristensen, 2000; Yin, 1994, 1995; Yin & Kaftarian, 1997) have been used to identify or adapt the evaluation design so that the evaluation can "work backward from the target clientele and what they actually receive/experience, not forward from the intervention activities and what the intervention agents purportedly deliver" (Lipsey, 1996, p. 301).
Given the sometimes insurmountable difficulties associated with intentionally introducing and controlling intervention exposure, this flexible approach to evaluation offers a practical means of evaluating stress management interventions. Data on exposure to interventions can be obtained through an intervention process evaluation (i.e., questioning participants about their experiences and triangulating those data with documentary information and by interviewing those involved in planning
and implementing interventions; Griffiths, 1999; Nytro, Saksvik, Mikkelsen, Bohle, & Quinlan, 2000; Saksvik, Nytro, Dahl-Jorgensen, & Mikkelsen, 2002). This process evaluation can then be used to adapt outcome evaluation by using it to determine whether each participant is more appropriately placed in an intervention/exposed or a control/not exposed group. Measured exposure patterns can thus be exploited as an evaluation design variable. This approach is different to that used in "natural experiments", where exposure patterns are predictable and controlled (see Jackson, 1983). Rather, it is a constructive use of the manipulation check: The study design is adapted to reflect actual exposure patterns.

¹The authors recognize that qualitative methods also offer viable alternative approaches to evaluation in chaotic organizational settings (see Kompier et al., 2000a).
Treating uncontrolled and unpredictable exposure patterns as a natural manipulation may help to expand the size of the "pool" of organizational intervention evaluation research and make informative evaluation possible in chaotic organizational settings.
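The core of this adapted-design step can be sketched in a few lines of code. The sketch below is purely illustrative (the participant records and field names are hypothetical, not drawn from the studies reported here): group membership is derived from a post hoc exposure measure rather than from the intended allocation.

```python
def partition_by_exposure(participants):
    """Assign each participant to the 'exposed' or 'not exposed' group
    using the measured (post hoc) exposure flag, not the intended
    allocation -- the adapted-design step described above."""
    exposed = [p for p in participants if p["exposed"] == 1]
    not_exposed = [p for p in participants if p["exposed"] == 0]
    return exposed, not_exposed

# Hypothetical records: a dichotomous exposure self-report plus
# pre- and post-intervention well-being (exhaustion) scores.
sample = [
    {"id": 1, "exposed": 1, "pre": 21, "post": 17},
    {"id": 2, "exposed": 0, "pre": 19, "post": 22},
    {"id": 3, "exposed": 1, "pre": 24, "post": 20},
]
exposed_group, not_exposed_group = partition_by_exposure(sample)
```

The partitioned groups, rather than any planned allocation, then feed the outcome analysis.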
THE PRESENT STUDY

This article presents two empirical studies drawn from the authors' research on risk management and work-related stress (Cox et al., 2000a; Cox, Randall, & Griffiths, 2002b). Together they illustrate that, through a simple exploration of the intervention processes, exposure patterns can be identified and used to adapt the final study design and analysis. This achieves two things: It measures actual exposure to the intervention to allow a valid evaluation of its effectiveness (thus controlling Type III error), and it permits informative evaluation where exposure patterns cannot be planned or tightly controlled. Both studies examine, ad hoc, the actual "organizational penetration" (Cox et al., 2000a) of an intervention into a participant group. This strategy enables evaluation without the need to (1) make possibly untenable assumptions about stable intervention exposure patterns, (2) control exposure to an intervention, or (3) rely on events or organizational structures to coincidentally support predictable and stable exposure patterns. Both studies have one central hypothesis:

The intervention effect (improved well-being) will only be apparent when measures of actual intervention exposure are used to partition the participant group prior to analysis (i.e., actual intervention exposure is a moderator of change).
METHOD

Design

In both studies, the interventions were designed by stakeholders from within the participating organizations, through the feedback of risk assessment data and discussions facilitated by the research team. These discussions were based on an initial problem analysis carried out using a risk assessment questionnaire survey of working conditions. Both studies began with a simple and traditional pre (Time 1) – post (Time 2) longitudinal intervention design. Because of the constraints operating within the organizations involved, it was not possible to allocate participants to "intervention" and "control" groups: each intervention was intended to reach all study participants. In both studies, participants were asked to report on their awareness of (Study 1) or their involvement in (Study 2) the intervention. In both studies the dependent variable was self-reported well-being (measured in terms of levels of exhaustion). Time 2 measures of well-being and intervention exposure were taken 18 months after the Time 1 measures of well-being. The evaluation design was adapted by partitioning the participant groups according to their reported exposure to the intervention (using a Measured Exposure × Time interaction term) in a repeated measures analysis of covariance (ANCOVA).
Participants and interventions

Study 1: Railway staff. Thirty-seven station managers from a railway transport company provided data at both Time 1 and Time 2. This represented a response rate of approximately 50% (based on the number of supervisors providing both Time 1 and Time 2 data). Response rates for Times 1 and 2, taken separately, were higher (68% and 64%). All were male, with an average age of 40 years (SD = 8.0) and average tenure of 14 years (SD = 7.1) at Time 1. The supervisors worked at different-sized stations: Some stations were larger and busier than others. Inspection of company records indicated that the supervisors returning questionnaires came from a representative sample of stations and that they did not differ from the whole station manager population in terms of average age or length of service.

Some months before Time 1, a central part of the station managers' role had been changed. Because of budgetary constraints, their responsibility for managing the repair of faulty station equipment (including reporting faults and authorizing and managing repairs²) had been removed. At Time 1 senior managers, not supervisors, managed equipment repair. However, when the risk assessment data collected at Time 1 were reported, the organization interpreted these as indicating that the removal of roles and responsibilities had not been well received by staff. In an attempt to improve staff satisfaction and well-being, responsibility for managing the repair of
faulty station equipment was returned to station supervisors (i.e., supervisors were instructed to resume reporting faults and instigating their resolution). The intention was that this change be communicated to all station supervisors, through a number of levels of management, via two media: (1) written memos from senior management delivered through established communication routes, and (2) verbal communications in a variety of forums (e.g., individual and team meetings).

²This does not refer to track and trains, but rather to equipment found in the station building. Inspection and repair of track equipment was carried out by other specialist members of staff.
Study 2: Hospital staff. Participants were 31 senior paediatric nursing staff with significant managerial and administrative responsibilities in addition to a specialist clinical workload. They worked in a large urban hospital. A response rate of 52% was achieved (70% at Time 1 and 66% at Time 2, respectively). All participants were female, with the majority (56%) aged between 36 and 45 years, and most (also 56%) having worked in the hospital for more than 11 years. The nurses worked in 15 different wards, each with its own specialty, including oncology, orthopaedics, and outpatients. Comparisons between the demographic data obtained from the risk assessment questionnaire (age, length of service, and size of ward worked in) and the hospital's records indicated that the nurses completing questionnaires were representative of the whole sample.
The rationale driving the intervention in this study was relatively straightforward. The risk assessment identified that there were few computing facilities on the wards at Time 1. There were a handful of computers shared between the 15 wards. As a consequence, access to these facilities was erratic. Staff needed to use computers for many aspects of their administrative and managerial work, and were often unable to progress tasks because of a lack of access to them. Further, it was well recognized that communication within such a large and diverse department (comprising 15 different wards) was difficult: It was felt that providing staff with access to intranet and email facilities would significantly improve the flow of information within the department. The agreed intervention plan was to introduce fully functional computing facilities (an internet-ready computer with email, word-processing, and spreadsheet capabilities) in each ward over the 6 months after the Time 1 measures. Each participant's involvement in (exposure to) the intervention was determined purely by the progress that had been made on the installation of new computer technology.
Measures

In both studies, data on the demographic variables age, gender, length of service, and work location were gathered by self-report (age and length of service were measured using categorical items to protect participant anonymity). A correlate of the emotional experience of work stress (work-related well-being) was measured using the exhaustion scale of the General Well-Being Questionnaire (GWBQ; Cox & Griffiths, 1995; Cox, Thirlaway, Gotts, & Cox, 1983). This measure was used as the dependent variable to examine the likely impact of the intervention on work-related well-being. The exhaustion scale is a 12-item self-report measure of nonspecific symptoms of general malaise relating to fatigue, cognitive confusion, and emotional irritability. It has been shown to be sensitive to the fluctuations in well-being associated with the emotional experience of stress at work (Cox & Gotts, 1987; Cox et al., 1983). Participants recorded their experience of these symptoms using a 5-point frequency scale from 0 (never) to 4 (always), with the time window of measurement set as the preceding 6 months: The higher the participant's score on the questionnaire, the "poorer" their well-being. The scale was found to be reliable in both studies (Cronbach's alphas: preintervention = .82 (Study 1) and .85 (Study 2); postintervention = .89 (Study 1) and .83 (Study 2)). Examination of the standard errors for skewness and kurtosis showed that scores were normally distributed at both measurement points. Existing data show that for employees in managerial posts the normative score on this measure is approximately 17 (Cox & Gotts, 1987; Cox et al., 2000). Preintervention, both participant groups appeared more exhausted than the normative group (mean Study 1 = 20.3, SD = 8.7; mean Study 2 = 18.9, SD = 7.5). These high levels were, in part, justification for both risk management projects.
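The scoring and reliability checks described above can be reproduced with a short sketch. This is illustrative only (the item responses below are invented, and the GWBQ items are not reproduced here): each participant's exhaustion score is the sum of their 0–4 item responses, and Cronbach's alpha is computed from item and total-score variances.

```python
from statistics import variance

def scale_score(responses):
    """Total score for one participant: the sum of their 0-4 item
    responses (for the 12-item exhaustion scale, range 0-48)."""
    return sum(responses)

def cronbach_alpha(data):
    """Cronbach's alpha for rows of item responses (one row per
    participant): k/(k-1) * (1 - sum of item variances / total variance)."""
    k = len(data[0])
    item_vars = [variance([row[i] for row in data]) for i in range(k)]
    total_var = variance([sum(row) for row in data])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: 5 participants, 2 items shown for brevity
# (the real exhaustion scale has 12 items).
data = [[0, 1], [2, 2], [4, 3], [1, 1], [3, 3]]
alpha = cronbach_alpha(data)
```

With real 12-item data the same functions yield the 0–48 totals and the alphas reported in the text.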
Measures of intervention exposure were taken at Time 2 and were designed to tap into the active ingredient of each intervention, i.e., the aspect of exposure hypothesized to be the driver of change (Kompier, Cooper, & Geurts, 2000b) in exhaustion scores. In Study 1, the active ingredient was awareness of the new guidelines and procedures for fault reporting. It was predicted that being aware of the new guidelines would make a difference. Participants were asked to indicate their awareness of (exposure to) the intervention through a single dichotomous item: "Indicate whether or not you are aware of: the return of fault reporting to your control" (0 = "No", 1 = "Yes"). To test their reliability and validity, these data were triangulated. Senior managers (n = 3) and a sample of the managers responsible for implementing the intervention (n = 6) were asked to comment on its implementation. Records of written communications were also examined for evidence of implementation.
In Study 2, the "active ingredient" was involvement in the programme of updating computer equipment in the wards (i.e., a measure of whether each participant had used new computer equipment recently installed in their ward). Participants were asked to indicate their involvement in the intervention through a single dichotomous item: "Indicate whether or not you have been involved in: the use of new computer equipment in your ward" (0 = "No", 1 = "Yes"). These data were triangulated. Senior managers (n = 4) were asked to identify wards which had received new computer equipment. Documents detailing the progress of the installation programme were also examined.
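One way to quantify this kind of triangulation (an illustration only, not a procedure reported by the authors; the codings below are invented) is to compute the agreement between participants' dichotomous self-reports and an external record, for example via Cohen's kappa:

```python
def cohen_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two dichotomous codings of the
    same participants (0 = not exposed, 1 = exposed)."""
    n = len(coder_a)
    observed = sum(x == y for x, y in zip(coder_a, coder_b)) / n
    p_a, p_b = sum(coder_a) / n, sum(coder_b) / n
    # Chance agreement: both say "yes" plus both say "no".
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)

# Hypothetical codings: self-reported involvement vs. installation records.
self_report = [1, 1, 0, 0, 1, 0, 1, 0]
ward_record = [1, 1, 0, 0, 1, 0, 0, 0]
kappa = cohen_kappa(self_report, ward_record)
```

High agreement would support using the self-report item as the exposure measure; low agreement would suggest the partition should lean on the documentary record instead.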
Analysis

A two-stage analytical procedure was used in both studies. First, the initial pre–post design (assuming 100% exposure) was used. Each intervention was evaluated through inspection of descriptive data and the use of paired-sample t-tests and repeated measures ANOVAs to examine changes in well-being over time. Covariates were not considered, to allow for more liberal testing of intervention effects and thus reduce the chance of Type II error. Traditionally this design would be used to test for the impact of the intervention assuming 100% exposure.

In the second stage of analysis, exposure to the intervention was included as a design variable. A repeated measures ANCOVA was used with one between-subjects variable ("exposed" to the intervention or "not exposed" to the intervention) and one within-subjects variable (time) with two measurement levels (Time 1 and Time 2). Covariates (age and length of service) were included because of their potential influence over growth in the dependent variable and to control for nonequivalence in the between-group portion of the design, reducing the chance of Type I error (Cook & Campbell, 1979; Tabachnick & Fidell, 2001). The impact of exposure to the intervention was assessed by examining the significance of the two-way interaction term Intervention Exposure × Time. Interaction effects were then explored in two ways. First, paired-sample t-tests were used to test for changes in exhaustion scores within each group. Second, comparisons between the exposed and not exposed groups were also made at both Time 1 and Time 2 using ANOVAs.
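In a 2 (group) × 2 (time) design of this kind, the Exposure × Time interaction amounts to comparing the two groups' mean pre-to-post change. A minimal sketch under hypothetical data (the scores are invented, and the full ANCOVA with age and length-of-service covariates is not reproduced here):

```python
from statistics import mean

def interaction_contrast(exposed, not_exposed):
    """Exposure x Time interaction contrast: the difference between the
    groups' mean pre-to-post change in exhaustion scores."""
    change_exposed = mean(post - pre for pre, post in exposed)
    change_not = mean(post - pre for pre, post in not_exposed)
    return change_exposed - change_not

# Hypothetical (pre, post) exhaustion scores per participant.
exposed = [(20, 16), (22, 19), (18, 16)]
not_exposed = [(19, 21), (21, 24), (20, 21)]
# Negative contrast = the exposed group improved relative to the
# not-exposed group (lower scores mean better well-being).
contrast = interaction_contrast(exposed, not_exposed)
```

A significance test of this contrast (via the ANCOVA interaction term, as in the studies) then determines whether measured exposure moderated change.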
RESULTS

Study 1

Exposure to the intervention. The measure of awareness of the intervention showed that most (25 supervisors) reported that they had been made aware that they had regained control of fault reporting and repair authorization. However, a significant proportion (32%) were not aware of the change. There were no significant differences between the "aware/exposed to the intervention" and the "not aware/not exposed to the intervention" groups in terms of demographic details (age, length of service, the size of station they were based at), nor their exhaustion scores at Time 1, F(1, 36) = 1.27, p > .05. When triangulated with other methods of process evaluation, this finding appeared robust. Stakeholder analysis indicated that some senior managers had resisted informing staff of the intervention because of the budgetary constraints placed on particular groups of stations. This was supported by two other pieces of data: (1) There was no evidence of the intervention in communication records for some stations, and (2) those who were unaware of the intervention shared the same communication routes.
Evaluation of the intervention. Using the traditional nonadaptive pre–post study design, there was no significant change in exhaustion scores, t = 0.64, p > .05; F(1, 36) = 0.24, p > .05, over the intervention period (see Table 1). This would lead naturally to the conclusion that the intervention was ineffective in improving well-being.

TABLE 1
Changes in exhaustion score (Study 1, whole sample)

                    Preintervention    Postintervention    Mean
                    (Time 1)           (Time 2)            change    t       F
Exhaustion score    20.3 (SD = 8.7)    18.9 (SD = 9.4)     –1.4      0.64*   0.24*

N = 37. *p > .05.

Preanalysis checks confirmed the suitability of the data for repeated measures ANCOVA analysis (see Table 2): The usual significance level of p < .05 could be applied to the testing of effects. The results of the repeated measures analysis of covariance are presented in Table 2. After controlling for variance in the dependent variable accounted for by age, F(1, 33) = 6.13, p < .05, the test of within-subjects effects revealed one significant interaction: Exposure to the Intervention × Time, F(1, 32) = 6.83, p = .01; eta-squared = .17. None of the other within-subjects effects (the main effect of time and the other interaction effects) were significant, all Fs ≤ 3.24, ps ≥ .08. This interaction (with adjusted means) is shown in Figure 1. None of the other between-subjects effects were significant.

TABLE 2
Repeated measures analysis of covariance (Study 1)

                                      Type III sum of squares¹    F
Within-subjects effects
Time (pre–post intervention)           67.26                      2.22
Time × Length of service               49.03                      1.62
Time × Age                             97.99                      3.24
Time × Exposure to intervention       206.79                      6.83**
Between-subjects effects
Age                                   566.21                      6.13*
Length of service                       1.79                      0.02
Awareness of the intervention         352.47                      3.82

N = 37. *p < .05; **p ≤ .01.
¹Used here to take into account the discrepancy in sample size between the group exposed to the intervention and the group not exposed to the intervention.
Box's M statistic was nonsignificant, F(3, 10402) = 0.68, p > .05, and Levene's test of the equality of error variance was nonsignificant: for preintervention exhaustion scores, F(1, 35) = 0.19, p > .05; for postintervention exhaustion scores, F(1, 35) = 0.05, p > .05.
Figure 1. Interaction effect (Study 1: Railway staff).

TABLE 3
Descriptive statistics for exhaustion scores exploring the interaction term (Study 1)

                         Exposed/aware group (n = 25)         Not exposed/not aware group (n = 12)
                         Preintervention    Postintervention   Preintervention    Postintervention
                         (Time 1)           (Time 2)           (Time 1)           (Time 2)
Mean exhaustion scores   19.3               16.3               22.8               25.3
(range 0–48)             (SD = 8.2)         (SD = 9.1)         (SD = 8.1)         (SD = 7.6)

Table 3 shows the changes in exhaustion scores for the exposed and not exposed groups separately. Exploration of the interaction term indicated that, from similar preintervention exhaustion scores, F(1, 36) = 1.27, p > .05, the scores of the two groups diverged to result in a significant difference in postintervention exhaustion scores, F(1, 36) = 10.3, p < .01. This divergence was attributable to a significant drop (3.0 scale points) in exhaustion scores in the group exposed to the intervention, t = 2.13, p < .05; F(1, 26) = 5.23, p < .05, alongside a nonsignificant average rise of 2.5 scale points in the group not exposed to the intervention, t = –1.53, p > .05; F(1, 11) = .61, p > .05. These results indicated that the group exposed to the intervention experienced an improvement in well-being, while well-being in the group not exposed to the intervention remained relatively stable. This is a very different conclusion from that reached when the measure of intervention exposure was not considered in the traditional nonadaptive analysis.
Study 2

Exposure to the intervention. There was an approximate 50:50 split among the group of paediatric nurses in relation to the intervention: Fifteen reported having used new computer facilities on their ward, while sixteen reported not having access to new computer facilities on their own ward. As in Study 1, there were no significant differences between the "involved in/exposed to the intervention" and the "not involved in/not exposed to the intervention" groups in terms of demographic details or preintervention exhaustion scores. This finding also appeared robust when triangulated. Senior managers reported that progress on installing new computers had been slow because of a lack of specialist computer staff within the hospital. Only around 60% of the computers had been fully installed and were operational.
Evaluation of the intervention. Table 4 shows that, using the traditional nonadaptive pre–post study design, there was no significant change in exhaustion scores over the intervention period, t = 0.32, p = .75; F(1, 30) = 0.03, p > .85. As in Study 1, this would lead to the conclusion that the intervention was ineffective.
TABLE 4
Changes in exhaustion score (Study 2, whole sample)

                    Preintervention    Postintervention    Mean
                    (Time 1)           (Time 2)            change    t       F
Exhaustion score    18.9 (SD = 7.5)    19.5 (SD = 6.3)     0.6       0.32*   0.03*

N = 31. *p > .05.
TABLE 5
Repeated measures analysis of covariance (Study 2)

                                      Type III sum of squares    F
Within-subjects effects
Time (pre–post intervention)            0.20                     0.01
Time × Length of service                0.79                     0.04
Time × Age                              0.03                     0.03
Time × Involvement in intervention    100.78                     4.83*
Between-subjects effects
Length of service                       0.68                     0.01
Age                                     2.95                     0.04
Involvement in the intervention         4.73                     0.06

N = 31. *p < .05.
Box's M statistic, F(3, 177953) = 0.48, p > .05, and Levene's test of the equality of error variance were nonsignificant: for preintervention exhaustion scores, F(1, 29) = 0.39, p > .05; for postintervention exhaustion scores, F(1, 29) = 0.01, p > .05.
Preanalysis checks showed that the data were suitable for repeated measures ANCOVA. The results of this analysis are presented in Table 5.
No significant adjustments needed to be made to the dependent variable before testing the Exposure × Time interaction term. This interaction was significant, F(1, 27) = 4.83, p = .04, eta squared = .16 (see Table 5). None of the other within-subjects effects (the main effect of time and the other interaction effects) were significant, all Fs < 0.06, ps > .81. The interaction showed a crossover effect and is shown (with adjusted means) in Figure 2.
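When there are no covariates, the Time × Group interaction in a two-group pre-post design is algebraically equivalent to an independent-samples comparison of the change scores (post minus pre), with F = t². A minimal sketch with hypothetical change scores (stdlib only, illustrating the equivalence rather than reproducing the covariate-adjusted analysis in Table 5):

```python
import math

def independent_t(x, y):
    """Pooled-variance independent-samples t-test; returns (t, df)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)
    ssy = sum((v - my) ** 2 for v in y)
    sp2 = (ssx + ssy) / (nx + ny - 2)  # pooled variance
    t = (mx - my) / math.sqrt(sp2 * (1 / nx + 1 / ny))
    return t, nx + ny - 2

# Hypothetical change scores: exposed group improves, control worsens
exposed_change = [-3, -4, -2, -5, -1, -3]
control_change = [2, 1, 3, 0, 2, 4]
t, df = independent_t(exposed_change, control_change)
F = t ** 2  # the Time x Group interaction F in the 2 x 2 mixed design
```

The crossover pattern in the hypothetical data (one group falling while the other rises) is what makes the interaction term sensitive even when neither within-group change is significant on its own.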
EVALUATION OF INTERVENTIONS 35

Figure 2. Interaction effect (Study 2: Nurses).

TABLE 6
Descriptive statistics for exhaustion scores exploring the interaction term (Study 2)

                      Exposed/involved group (n = 15)      Not exposed/not involved group (n = 16)
                      Preintervention   Postintervention   Preintervention   Postintervention
                      (Time 1)          (Time 2)           (Time 1)          (Time 2)
Mean exhaustion       20.5              17.4               18.7              20.8
scores (range 0-48)   (SD = 7.8)        (SD = 7.2)         (SD = 6.4)        (SD = 5.8)

Table 6 shows the changes in exhaustion scores for the exposed and not exposed groups separately. Exploration of the interaction term indicated that there was no significant difference between the two groups' exhaustion scores preintervention, F(1, 30) = 0.43, p = .52. After the intervention, the scores for the group not involved in the intervention had risen, though not significantly, t = -1.5, p = .15; F(1, 15) = 1.5, p = .24, while the mean exhaustion score for the group involved in the intervention had dropped, though again not significantly, t = 1.74, p = .10; F(1, 14) = 3.10, p = .10. After the intervention the group involved in the intervention reported lower exhaustion scores than the group not involved in the intervention, but this difference only approached significance, F(1, 30) = 2.24, p = .14.
Taken together, however, the effect of the changes within the two groups was significant: Interaction terms are more sensitive than separate within-groups analyses (Tabachnick & Fidell, 2001). As in Study 1, these results suggested that exposure to the intervention impacted on participants' well-being: This was only apparent when measures of exposure were used to adapt the design of the study during analysis.
DISCUSSION

The significant Measured Exposure × Time interaction effects in both studies indicated that exposure to the intervention predicted changes in exhaustion scores over time. The central hypothesis of this article was confirmed: Nonadaptive designs underestimated the impact of the intervention on well-being, with significant change only becoming apparent when adapted study designs were used. Both studies indicated that Type III error (Dobson & Cook, 1980; Harachi et al., 1999) may be minimized by using the results of a robust process evaluation (in this study one that centred on the triangulation of exposure data) to adapt outcome evaluation. This protection against Type III error is particularly important given that the psychological components of work design may exert a relatively modest influence over general well-being in the short term (Zapf, Dormann, & Frese, 1996). Moreover, in both studies measures of intervention exposure identified hidden and "unintended" between-groups designs that facilitated much-needed and significant improvements in methodological adequacy (see Beehr & O'Hara, 1987; Murphy, 1996). In Study 1, the between-group differences found at Time 2 reflected a worsening of the situation for the "not aware" group co-occurring with stability in the "aware" group, suggesting that the return of responsibilities to them protected supervisors from the effects of problems associated with not being able to report faults. This "protective effect" has been observed in other intervention studies (e.g., Terra, 1995). The pattern of change in Study 2 indicated a significant intervention effect when the small changes in the intervention and control groups co-occurred.
Clearly, and for good reasons, the methodological adequacy of both studies reflected the constraints placed on them by the research setting. Like almost all stress management intervention evaluation studies they do not play exactly by the methodological rules (Kompier et al., 2000a). However, measuring and capitalizing on uncontrolled and unpredictable exposure patterns did extend the established principles of the "natural experiment". Indeed, in the majority of situations where complete control over intervention exposure is not possible, this approach offers a practical and informative method of evaluation. Of course, it should only be used when rigorous process evaluation yields a strong case for adapting the study design: The methodology should not be used as a justification for "fishing" in data for significant results (Yin, 1994). The small sample sizes provided by the majority of organizations (as was the case in both of the studies presented here) make theory-driven analysis and preanalysis checks on the data particularly important.
Naturally, this approach is not without its problems. Unpredictable exposure patterns create an extreme form of nonequivalent study design and may be affected by selection biases (Campbell & Stanley, 1963; Cook & Campbell, 1979). In both of the studies presented here, intervention and control groups were comparable across a range of demographic variables, and appeared well-matched at Time 1 on the measure of the dependent variable. However, when this is not the case, possible bias can be controlled for by the analysis of covariates, or by using "blocking" designs that match cases on an ad hoc basis (Cook & Campbell, 1979; Cook & Shadish, 1994).
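One simple way to build such a blocking design is to pair each intervention case with the available control whose pretest score is closest. The sketch below is illustrative only; the participant IDs and scores are hypothetical, and this greedy heuristic is one of several possible matching procedures, not the one prescribed by Cook and Campbell:

```python
def match_on_pretest(intervention, control):
    """Greedy 1:1 matching on pretest score; returns list of (case_id, control_id) pairs."""
    pairs = []
    available = dict(control)  # control id -> pretest score
    for case_id, score in sorted(intervention, key=lambda p: p[1]):
        if not available:
            break
        # Pick the still-unmatched control closest on the pretest measure
        best = min(available, key=lambda c: abs(available[c] - score))
        pairs.append((case_id, best))
        del available[best]
    return pairs

pairs = match_on_pretest(
    intervention=[("i1", 18), ("i2", 25), ("i3", 21)],
    control=[("c1", 24), ("c2", 17), ("c3", 30), ("c4", 20)],
)
```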
Further, the list of active ingredients (Kompier & Kristensen, 2000) that are indicative of intervention exposure across a range of different types of intervention is not yet established. Measuring involvement may be appropriate for interventions that require the participant to actively engage in the intervention for it to work (e.g., a training intervention, team-building intervention, staff consultation process, etc.). Measuring awareness may be more appropriate for interventions that have a more passive mechanism (e.g., the receipt of information, changes in guidelines, or a redefinition of roles and responsibilities). Further work is needed to establish the modus operandi of a variety of organizational interventions in order to ensure that appropriate measures of intervention exposure are used to adapt study designs.
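The partitioning step itself is straightforward once the appropriate exposure measure has been chosen. A hedged sketch (item names, scale, and cut-off are hypothetical, chosen for illustration rather than taken from the studies):

```python
def partition_by_exposure(participants, item, cutoff):
    """Split a sample on one self-reported exposure item; returns (exposed, not_exposed) id lists."""
    exposed = [p["id"] for p in participants if p[item] >= cutoff]
    not_exposed = [p["id"] for p in participants if p[item] < cutoff]
    return exposed, not_exposed

sample = [
    {"id": "n01", "involvement": 4, "awareness": 5},
    {"id": "n02", "involvement": 1, "awareness": 2},
    {"id": "n03", "involvement": 3, "awareness": 4},
    {"id": "n04", "involvement": 0, "awareness": 1},
]
# An active intervention (e.g., training) would be partitioned on involvement;
# a passive one (e.g., receipt of information) on awareness.
exposed, not_exposed = partition_by_exposure(sample, "involvement", cutoff=3)
```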
Triangulating the self-report data using process evaluation helped to guard against threats to the validity of the self-report intervention exposure measure, and greatly enhanced understanding of the intervention effects. In Study 1, it appeared that staff who were managed by particular senior managers had not received the intervention. In Study 2, exposure to the intervention was determined by the speed at which the hospital's computer technicians could install and make fully operational the computer equipment. These findings indicated that reported exposure to the intervention was driven by implementation factors rather than by any individual differences (e.g., negative affect: Brief, Burke, George, Robinson, & Webster, 1988; Watson & Clark, 1984) or methodological artifacts (e.g., common method variance; Spector, 1994). Even in controlled exposure studies such data about the change process should be gathered to enhance the explanatory yield of outcome evaluation (Cook & Shadish, 1994).
As has been recommended, semistructured interviews were used alongside the questionnaire surveys to identify the mechanisms of change (Griffiths, 1999; Kompier et al., 2000b; Yin, 1995). Analysis of the data from the 12 interviews carried out with participants in Study 1 indicated that the intervention increased participation in decision making, improved control over the management of the station environment, and enhanced control over the allocation of work in the station they managed. In the 14 interviews carried out with paediatric nurses a wide variety of mechanisms were identified. These included: an increase in uninterrupted time to deal with administrative work and with tasks requiring concentrated thought; being better able to meet project deadlines; having more control over the management of one's own time; and a reduction in the amount of work completed at home. Almost all of the changes mentioned have been shown elsewhere to have an impact on work-related well-being (Bond & Bunce, 2001; Heaney & Goetzel, 1997; Jackson, 1983; Kompier et al., 2000a; Kompier & Cooper, 1999; Landsbergis & Vivona-Vaughan, 1995; Mikkelsen et al., 2000; Parker & Wall, 1998; Parkes & Sparkes, 1998; Schaubroeck, Ganster, Sime, & Ditman, 1993). The variety of mechanisms mentioned in participants' discourses also indicates that the same intervention may operate through multiple or different mechanisms, a finding that merits further investigation (Meijman, Mulder, & Cremer, 1992; Randall, 2002).
In conclusion, adapted designs built around process evaluation appeared to offer a means of strengthening the evaluation of stress management interventions in complex and unpredictable environments. Combining process and outcome evaluation in this way offers a rigorous quantitative evaluation of outcomes in situations hostile to study designs built around controlled exposure patterns. In time, this has the potential to enable a larger and more informative evaluation research literature to emerge.
REFERENCES

Beehr, T. A., & O'Hara, K. (1987). Methodological designs for the evaluation of occupational stress interventions. In S. V. Kasl & C. L. Cooper (Eds.), Stress and health: Issues in research methodology (pp. 79 – 112). Chichester, UK: Wiley.
Bond, F., & Bunce, D. (2001). Job control mediates change in a work reorganization intervention for stress reduction. Journal of Occupational Health Psychology, 6, 290 – 302.
Brief, A. P., Burke, M. J., George, J. M., Robinson, B. S., & Webster, J. (1988). Should negative affectivity remain an unmeasured variable in the study of job stress? Journal of Applied Psychology, 73, 193 – 198.
Briner, R., & Reynolds, S. (1999). The costs, benefits and limitations of organizational level stress interventions. Journal of Organizational Behavior, 20, 647 – 664.
Campbell, D. T. (1957). Factors relevant to the validity of experiments in social settings. Psychological Bulletin, 54, 297 – 312.
Campbell, D. T., & Stanley, J. C. (1963). Experimental and quasi-experimental designs for research. Chicago: Rand-McNally.
Colarelli, S. M. (1998). Psychological interventions in organizations: An evolutionary perspective. American Psychologist, 53, 1044 – 1056.
Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Chicago: Rand McNally.
Cook, T. D., & Shadish, W. R. (1994). Social experiments: Some developments over the last fifteen years. Annual Review of Psychology, 45, 545 – 580.
Cooper, C. L., Liukkonen, P., & Cartwright, S. (1996). Stress prevention in the workplace: Assessing the costs and benefits to organisations. Dublin, Ireland: European Foundation for the Improvement of Living and Working Conditions.
Cox, T. (1993). Stress research and stress management: Putting theory to work. Sudbury, UK: HSE Books.
Cox, T., & Gotts, G. (1987). The General Well-Being Questionnaire manual. Nottingham, UK: Department of Psychology, University of Nottingham.
Cox, T., & Griffiths, A. J. (1995). The nature and measurement of work stress: Theory and practice. In J. Wilson & N. Corlett (Eds.), The evaluation of human work: A practical ergonomics methodology. London: Taylor & Francis.
Cox, T., Griffiths, A. J., Barlow, C. A., Randall, R. J., Thomson, L. E., & Rial-Gonzalez, E. (2000a). Organisational interventions for work stress. Sudbury, UK: HSE Books.
Cox, T., Griffiths, A. J., & Randall, R. (2002a). The assessment of psychosocial hazards at work. In M. J. Schabracq, J. A. M. Winnubst, & C. L. Cooper (Eds.), Handbook of work and health psychology (pp. 191 – 207). Chichester, UK: Wiley.
Cox, T., Griffiths, A. J., & Rial-Gonzalez, E. (2000b). Research on work-related stress. Luxembourg: Office for Official Publications of the European Communities.
Cox, T., Randall, R., & Griffiths, A. (2002b). Interventions to control stress at work in hospital staff. Sudbury, UK: HSE Books.
Cox, T., Thirlaway, M., Gotts, G., & Cox, S. (1983). The nature and assessment of general well-being. Journal of Psychosomatic Research, 27, 353 – 359.
Dobson, L. D., & Cook, T. J. (1980). Avoiding Type III error in program evaluation: Results from a field experiment. Evaluation and Program Planning, 3, 269 – 276.
European Commission. (1989). Council framework directive on the introduction of measures to encourage improvements in the safety and health of workers at work 89/391/EEC. Official Journal of the European Communities, 32(L183), 1 – 8.
Fitzgerald, J., & Rasheed, J. M. (1998). Salvaging an evaluation from the swampy lowland. Evaluation and Program Planning, 21, 199 – 209.
Griffiths, A. (2003). Actions at the workplace to prevent work stress. Science in Parliament, 60, 12 – 13.
Griffiths, A. J. (1999). Organizational interventions: Facing the limits of the natural science paradigm. Scandinavian Journal of Work, Environment and Health, 25, 589 – 596.
Griffiths, A. J., Cox, T., & Barlow, C. A. (1996). Employers' responsibilities for the assessment and control of work-related stress: A European perspective. Health and Hygiene, 17, 62 – 70.
Harachi, T. W., Abbot, R. D., Catalano, R. F., Haggerty, K. P., & Fleming, C. B. (1999). Opening the black box: Using process evaluation measures to assess implementation and theory building. American Journal of Community Psychology, 27, 711 – 731.
Hartley, J. (2002). Organizational change and development. In P. Warr (Ed.), Psychology at work (pp. 399 – 425). London: Penguin.
Health and Safety Commission. (1999). Management of health and safety regulations. London: Her Majesty's Stationery Office.
Heaney, C. A., & Goetzel, R. Z. (1997). A review of health-related outcomes of multi-component worksite health promotion programs. American Journal of Health Promotion, 11, 290 – 308.
Heaney, C. A., Israel, B. A., Schurman, S. J., Baker, E. A., House, J. S., & Hugentobler, M. (1993). Industrial relations, worksite stress reduction, and employee well-being: A participatory action research investigation. Journal of Organizational Behavior, 14, 495 – 510.
Ivancevich, J. M., Matteson, M. T., Freedman, S. M., & Phillips, J. S. (1990). Worksite stress management interventions. American Psychologist, 45, 252 – 261.
Jackson, S. (1983). Participation in decision-making as a strategy for reducing job-related strain. Journal of Applied Psychology, 68, 3 – 19.
Kompier, M., de Gier, E., Smulders, P., & Draaisma, D. (1994). Regulations, policies and practices concerning work stress in five European countries. Work and Stress, 8, 296 – 318.
Kompier, M. A. J., Aust, B., van den Berg, A., & Siegrist, J. (2000a). Stress prevention in bus drivers: Evaluation of 13 natural experiments. Journal of Occupational Health Psychology, 5, 11 – 31.
Kompier, M. A. J., & Cooper, C. L. (Eds.). (1999). Preventing stress, improving productivity: European case studies in the workplace. London: Routledge.
Kompier, M. A. J., Cooper, C. L., & Geurts, S. A. E. (2000b). A multiple case study approach to work stress prevention in Europe. European Journal of Work and Organizational Psychology, 9, 371 – 400.
Kompier, M. A. J., & Kristensen, T. (2000). Organisational work stress interventions in a theoretical, methodological and practical context. In J. Dunham (Ed.), Stress in the workplace: Past, present and future. London: Whurr Publishers.
Landsbergis, P. A., & Vivona-Vaughan, E. (1995). Evaluation of an occupational stress intervention in a public agency. Journal of Organizational Behavior, 16, 29 – 48.
Lipsey, M. W. (1996). Key issues in intervention research: A programme evaluation perspective. American Journal of Industrial Medicine, 29, 298 – 302.
Lipsey, M. W., & Cordray, D. S. (2000). Evaluation methods for social intervention. Annual Review of Psychology, 51, 345 – 375.
Meijman, T., Mulder, G., & Cremer, R. (1992). Workload of driving examiners: A psychosocial field study. In H. Kragt (Ed.), Enhancing industrial performance: Experiences of integrating the human factor. London: Taylor & Francis.
Mikkelsen, A., Saksvik, P. O., & Landsbergis, P. (2000). The impact of a participatory organizational intervention on job stress in community health care institutions. Work and Stress, 14, 156 – 170.
Murphy, L. R. (1996). Stress management in work settings: A critical review of health effects. American Journal of Health Promotion, 11, 112 – 135.
Nytro, K., Saksvik, P. O., Mikkelsen, A., Bohle, P., & Quinlan, M. (2000). An appraisal of key factors in the implementation of occupational stress interventions. Work and Stress, 14, 213 – 225.
Parker, S., & Wall, T. (1998). Job and work design: Organizing work to promote well-being and effectiveness. Thousand Oaks, CA: Sage.
Parkes, K. R., & Sparkes, T. J. (1998). Organizational interventions to reduce work stress: Are they effective? A review of the literature. Sudbury, UK: HSE Books.
Randall, R. J. (2002). Organisational interventions to manage work-related stress: Using organisational reality to permit and enhance evaluation. Unpublished PhD thesis, University of Nottingham, UK.
Reynolds, S. (1997). Psychological well-being at work: Is prevention better than cure? Journal of Psychosomatic Research, 43, 93 – 102.
Saksvik, P. O., Nytro, K., Dahl-Jorgensen, C., & Mikkelsen, A. (2002). A process evaluation of individual and organizational occupational stress and health interventions. Work and Stress, 16, 37 – 57.
Schaubroeck, J., Ganster, D. C., Sime, W. E., & Ditman, D. (1993). A field experiment testing supervisory role clarification. Personnel Psychology, 46, 1 – 25.
Semmer, N. (2003). Job stress interventions and the organization of work. In J. Quick & L. Tetrick (Eds.), A handbook of occupational health psychology (pp. 325 – 353). Washington, DC: American Psychological Association.
Spector, P. E. (1994). Using self-report questionnaires in OB research: A comment on the use of a controversial method. Journal of Organizational Behavior, 15, 385 – 392.
Tabachnick, B., & Fidell, L. (2001). Using multivariate statistics (3rd ed.). New York: HarperCollins.
Terra, N. (1995). The prevention of job stress by redesigning jobs and implementing self-regulating teams. In L. R. Murphy, J. J. Hurrell, S. L. Sauter, & G. P. Keita (Eds.), Job stress interventions (pp. 265 – 281). Washington, DC: American Psychological Association.
Van der Hek, H., & Plomp, H. N. (1997). Occupational stress management programmes: A practical overview of published effect studies. Occupational Medicine, 47, 133 – 141.
Watson, D., & Clark, L. A. (1984). Negative affectivity: The disposition to experience aversive emotional states. Psychological Bulletin, 96, 465 – 490.
Yin, R. K. (1994). Case study research: Design and methods (2nd ed.). Thousand Oaks, CA: Sage.
Yin, R. K. (1995). New methods for evaluating programs in NSF's Division of Research, Evaluation and Dissemination. In J. A. Frechtling (Ed.), Footprints: Strategies for non-traditional program evaluation (pp. 25 – 36). Arlington, VA: National Science Foundation.
Yin, R. K., & Kaftarian, S. J. (1997). Introduction: Challenges of community-based program outcome evaluations. Evaluation and Program Planning, 20, 293 – 297.
Zapf, D., Dormann, C., & Frese, M. (1996). Longitudinal studies in organizational stress research: A review of the literature with reference to methodological issues. Journal of Occupational Health Psychology, 1, 145 – 169.
Manuscript received December 2003
Revised manuscript received July 2004