Repeated Measures ANOVA

Repeated-Measures Analysis of Variance

Types of Within-Subject Designs 

• True “Within-Subjects” Design
  – Each subject is measured under each treatment condition
  – E.g., effects of amount of background noise on a memory task (e.g., no background noise, 10 dB, 20 dB)
• Repeated Measures Design
  – Each subject is measured at two or more points in time
  – E.g., effects of exercise on heart rate (at rest, after 1 min on a treadmill, after 5 min on a treadmill)

Types of Within-Subject Designs (cont.)

• Profile Analysis
  – Scores on different tests (DVs), which are comparably scaled, are compared
  – E.g., comparing scores on scales of the MMPI
• Matched Subjects Designs
  – A type of within-subject design where, instead of having subjects participate in all levels of the IV, different subjects, who are matched a priori on relevant variables, are compared
  – E.g., do subjects differ in reading speed under different types of lighting, after matching subjects on baseline reading speed

The Logical Background for a Repeated-Measures ANOVA

• The repeated measures analysis of variance applies to research situations using within-subject designs
  – Including repeated measures designs, profile analysis and matched subject designs
• Some of the logic and formulae for the repeated measures ANOVA are identical to the independent measures ANOVA
  – However, the repeated measures ANOVA includes a second stage of analysis in which variability due to individual differences is removed from the error term

Individual Differences Between Subjects

• The repeated measures design automatically eliminates individual differences from the between treatments variability because the same subjects are used in every condition
• Further, individual differences (which can be quantified) are eliminated from the denominator of the F test
• The result is a test statistic similar to the independent measures F ratio but with all individual differences removed

Comparing Independent-Measures and Repeated-Measures ANOVA

• When individual differences are removed from the denominator of the F test, the result is a more ‘powerful’ test of the research hypothesis (i.e., that the means differ)
  – In other words, the denominator is smaller
• This advantage can be very important in situations where large individual differences would otherwise obscure the treatment effect in an independent-measures study
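
To make the smaller denominator concrete, the sketch below computes both F ratios from one set of scores. The scores are hypothetical, and treating the same numbers as if they came from independent groups is done only for illustration; the function name `anova_f_ratios` is an assumption of this example.

```python
def anova_f_ratios(scores):
    """scores: one row per subject, one column per treatment condition.

    Returns (F_independent, F_repeated) for the same numbers, so that only
    the denominator (error term) differs between the two ratios.
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of treatment conditions
    N = n * k              # total number of scores
    grand_mean = sum(sum(row) for row in scores) / N
    treat_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_between_treat = n * sum((m - grand_mean) ** 2 for m in treat_means)
    ss_between_subj = k * sum((m - grand_mean) ** 2 for m in subj_means)
    ss_within = ss_total - ss_between_treat      # independent-measures error
    ss_error = ss_within - ss_between_subj       # individual differences removed

    ms_between = ss_between_treat / (k - 1)
    f_independent = ms_between / (ss_within / (N - k))
    f_repeated = ms_between / (ss_error / ((k - 1) * (n - 1)))
    return f_independent, f_repeated


# Hypothetical data: four subjects measured under two conditions.
scores = [[10, 40], [20, 50], [50, 100], [15, 40]]
f_ind, f_rep = anova_f_ratios(scores)
print(f"independent-measures F = {f_ind:.2f}, repeated-measures F = {f_rep:.2f}")
```

The numerator is identical in both ratios; the repeated-measures F is larger here only because the subject-to-subject differences have been taken out of the error term.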

Variability in a Within-Subjects / Repeated Measures Design

• Subjects: variability due to consistent differences between the subjects
  – Not usually an important effect to analyze since it only shows that subjects differ on the DV
• Treatment: variability due to differences between the levels of the IV
• Error: interaction between the subjects and the levels of the IV
  – Or in G&W terms, the overall within-treatments variability minus the between-subjects variability
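
These sources can be written as a two-stage partition of the total sum of squares. The subscript labels below are chosen for this illustration and follow the G&W description of the analysis:

```latex
\begin{align*}
SS_{\text{total}} &= SS_{\text{between treatments}} + SS_{\text{within treatments}} \\
SS_{\text{within treatments}} &= SS_{\text{between subjects}} + SS_{\text{error}}
\end{align*}
```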

Why is the Treatment X Subjects Interaction an Appropriate Error Term?

• The more consistently subjects' scores change across the levels of the IV, the smaller the error (and the more confidence that the treatment was effective)

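[Figure: two line charts, one line per subject (s1 to s4), with scores plotted at two levels of the IV (x-axis labels "1st Qtr" and "2nd Qtr").]

Panel 1 (y-axis 0 to 100): every subject's score increases

Subject   1st Qtr   2nd Qtr
s1        10        40
s2        20        50
s3        50        100
s4        15        40

Panel 2 (y-axis 0 to 80): subjects change by different amounts and in different directions

Subject   1st Qtr   2nd Qtr
s1        10        40
s2        20        30
s3        50        30
s4        15        80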
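
The point of the two panels can be checked numerically. The sketch below computes the treatment-by-subjects interaction sum of squares for each set of scores from the figure, using the residual left after removing each subject's mean and each treatment's mean; the function name `interaction_ss` is an assumption of this example.

```python
def interaction_ss(scores):
    """Sum of squared interaction residuals.

    scores: one row per subject, one column per level of the IV.
    Each residual is score - subject mean - treatment mean + grand mean.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    subj = [sum(row) / k for row in scores]
    treat = [sum(row[j] for row in scores) / n for j in range(k)]
    return sum(
        (scores[i][j] - subj[i] - treat[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )


panel_1 = [[10, 40], [20, 50], [50, 100], [15, 40]]   # consistent increases
panel_2 = [[10, 40], [20, 30], [50, 30], [15, 80]]    # inconsistent changes

ss_error_1 = interaction_ss(panel_1)
ss_error_2 = interaction_ss(panel_2)
print(ss_error_1, ss_error_2)
```

The error term for the second panel is roughly ten times that of the first, even though the individual scores are of similar size.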

Understanding the F-ratio

Null and Alternate Hypotheses

• The null hypothesis is that the means are all equal
  – H₀: μ₁ = μ₂ = ... = μₖ
  – For example, with three groups: H₀: μ₁ = μ₂ = μ₃
• The alternative hypothesis is that at least one of the means is different from another
  – Again, H₁: μ₁ ≠ μ₂ ≠ ... ≠ μₖ would not be an acceptable way to write the alternate hypothesis, since H₁ requires only that at least one pair of means differ

Computation of the F ratio 

Total Variability and Degrees of Freedom
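
A standard computational form for the total variability, in G&W-style notation that is assumed here (X = an individual score, G = the grand total of all scores, N = the total number of scores):

```latex
\[
SS_{\text{total}} = \sum X^2 - \frac{G^2}{N},
\qquad
df_{\text{total}} = N - 1
\]
```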


Error Variability and Degrees of Freedom
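
A standard way to reach the error term in two stages, in G&W-style notation that is assumed here (P = the total of one subject's scores, k = the number of treatment conditions, n = the number of subjects, G = the grand total, N = the total number of scores):

```latex
\begin{align*}
SS_{\text{within treatments}} &= \sum SS_{\text{inside each treatment}},
  & df_{\text{within treatments}} &= N - k \\
SS_{\text{between subjects}} &= \sum \frac{P^2}{k} - \frac{G^2}{N},
  & df_{\text{between subjects}} &= n - 1 \\
SS_{\text{error}} &= SS_{\text{within treatments}} - SS_{\text{between subjects}},
  & df_{\text{error}} &= (N - k) - (n - 1)
\end{align*}
```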


Between Treatment Variability and Degrees of Freedom
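
A standard computational form for the between treatment variability, in G&W-style notation that is assumed here (T = the total of the scores in one treatment condition, n = the number of scores in each condition, k = the number of conditions, G = the grand total, N = the total number of scores):

```latex
\[
SS_{\text{between treatments}} = \sum \frac{T^2}{n} - \frac{G^2}{N},
\qquad
df_{\text{between treatments}} = k - 1
\]
```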


Mean Squares and F test 

• Note that H₀ is rejected if:
  – F ≥ F crit (α, df between treatments, df error)
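
The whole computation, from totals to the decision, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the data are hypothetical, the function name `rm_anova` and its return keys are invented for the example, the critical value is looked up in an F table by the caller, and H₀ is taken to be rejected when F is at or above that value.

```python
def rm_anova(scores, f_crit):
    """One-way repeated-measures ANOVA from computational formulas.

    scores: one row per subject, one column per treatment condition.
    f_crit: critical F for (df_between, df_error), supplied from a table.
    """
    n, k = len(scores), len(scores[0])
    N = n * k
    G = sum(map(sum, scores))                                # grand total
    T = [sum(row[j] for row in scores) for j in range(k)]    # treatment totals
    P = [sum(row) for row in scores]                         # subject totals
    correction = G ** 2 / N

    ss_total = sum(x ** 2 for row in scores for x in row) - correction
    ss_between = sum(t ** 2 for t in T) / n - correction
    ss_subjects = sum(p ** 2 for p in P) / k - correction
    ss_error = ss_total - ss_between - ss_subjects

    df_between = k - 1
    df_error = (N - k) - (n - 1)
    ms_between = ss_between / df_between
    ms_error = ss_error / df_error
    f_ratio = ms_between / ms_error
    return {
        "ss_between": ss_between, "ss_subjects": ss_subjects,
        "ss_error": ss_error, "df_between": df_between,
        "df_error": df_error, "F": f_ratio,
        "reject_h0": f_ratio >= f_crit,
    }


# Hypothetical data: four subjects, three treatment conditions.
data = [[3, 4, 6], [0, 3, 3], [2, 1, 4], [0, 1, 3]]
# 5.14 is the tabled critical value for alpha = .05 with df = (2, 6).
result = rm_anova(data, f_crit=5.14)
print(result)
```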

Measuring Effect Size for the Repeated-Measures Analysis of Variance

• In addition to determining whether the mean differences are significant with a hypothesis test, it is also recommended that you determine the size of the mean differences by computing a measure of effect size
  – The common technique for measuring effect size for an analysis of variance is to compute the percentage of variance that is accounted for by the treatment effects

Measuring Effect Size for the Repeated-Measures Analysis of Variance (cont.)

• In the context of ANOVA this percentage is identified as η²
  – Before computing η², however, it is customary to remove variability due to individual differences between the subjects
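
Written out, removing the between-subjects variability before computing the percentage gives the following form (the subscript labels are assumptions of this illustration):

```latex
\[
\eta^2
= \frac{SS_{\text{between treatments}}}{SS_{\text{total}} - SS_{\text{between subjects}}}
= \frac{SS_{\text{between treatments}}}{SS_{\text{between treatments}} + SS_{\text{error}}}
\]
```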

Assumptions of the One-Way Within-Subjects Design

• Subjects are randomly and independently selected
• Scores in each treatment condition are normally distributed
• Homogeneity of Variances & Covariances (Compound Symmetry) / Sphericity

Sphericity

• Sphericity is violated when the variances of the differences between pairs of treatment levels are unequal
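
A quick way to look at this is to compute the variance of the difference scores for every pair of treatment levels and compare them. The sketch below does that for hypothetical data; the function name `difference_variances` is an assumption of this example, and it is a descriptive check rather than a formal test of sphericity.

```python
from itertools import combinations
from statistics import variance


def difference_variances(scores):
    """Sample variance of the difference scores for each pair of conditions.

    scores: one row per subject, one column per treatment condition.
    Returns a dict keyed by (column_a, column_b).
    """
    k = len(scores[0])
    return {
        (a, b): variance([row[a] - row[b] for row in scores])
        for a, b in combinations(range(k), 2)
    }


# Hypothetical data: five subjects, three treatment conditions.
data = [[8, 10, 12], [6, 9, 15], [7, 9, 10], [5, 8, 16], [9, 11, 12]]
diff_vars = difference_variances(data)
for pair, v in diff_vars.items():
    print(pair, round(v, 2))
```

In these made-up scores the three difference variances are far apart, which is the pattern that signals a sphericity problem.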

Notes on the Sphericity Assumption

• Like the homogeneity of variance assumption for between subjects designs, sphericity is commonly violated
• Violations of the assumption of sphericity can severely bias the F statistic
  – More specifically, when the sphericity assumption is violated the F test becomes too liberal, or in other words, the probability of a Type I error becomes much larger than α

Options for Counteracting the Effects of Violations of Sphericity

• Adjusted df tests
  – Reduce the number of numerator and denominator degrees of freedom to increase the size of the critical value (or, equivalently, increase the size of the p-value) and thus reduce the number of Type I errors
  – Greenhouse-Geisser
    – Calculates the degree to which the sphericity assumption is violated and then adjusts the degrees of freedom accordingly
    – The Greenhouse-Geisser adjustment to the ANOVA F test is included as routine output in SPSS
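
For readers who want to see what the adjustment does, here is a minimal sketch of the usual Greenhouse-Geisser epsilon estimate, computed from the double-centered covariance matrix of the treatment conditions. The data are hypothetical, the function name is an assumption of this example, and the output has not been matched against SPSS.

```python
def greenhouse_geisser_epsilon(scores):
    """Estimate epsilon; 1.0 means no violation, 1/(k-1) is the lower bound.

    scores: one row per subject, one column per treatment condition.
    """
    n, k = len(scores), len(scores[0])
    means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Sample covariance matrix of the k treatment conditions.
    cov = [
        [
            sum((row[a] - means[a]) * (row[b] - means[b]) for row in scores) / (n - 1)
            for b in range(k)
        ]
        for a in range(k)
    ]
    # Double-center: subtract row and column means, add back the grand mean.
    row_means = [sum(r) / k for r in cov]
    grand = sum(row_means) / k
    centered = [
        [cov[a][b] - row_means[a] - row_means[b] + grand for b in range(k)]
        for a in range(k)
    ]
    trace = sum(centered[a][a] for a in range(k))
    sum_sq = sum(c ** 2 for r in centered for c in r)
    return trace ** 2 / ((k - 1) * sum_sq)


# Hypothetical data: five subjects, three treatment conditions.
data = [[8, 10, 12], [6, 9, 15], [7, 9, 10], [5, 8, 16], [9, 11, 12]]
eps = greenhouse_geisser_epsilon(data)
n, k = len(data), len(data[0])
adj_df_between = eps * (k - 1)
adj_df_error = eps * (k - 1) * (n - 1)
print(round(eps, 3), round(adj_df_between, 2), round(adj_df_error, 2))
```

Both degrees of freedom are multiplied by the same epsilon, so the F ratio itself is unchanged; only the reference distribution used to judge it becomes more conservative.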

Psych Grads Example

• Dr. White would like to know if the “social life” of York Psychology students changes as a function of the number of years they have been in the program (α = .01)
• H₀: μ(First Year) = μ(Second Year) = μ(Third Year)
• H₁: At least one of the means is different from another

Example

Example: Error SS and MS

Example: Between Treatment SS and MS

Example: F test and Conclusions

• F crit (α = .01, df between treatments = 2, df error = 16) = 6.23
• Therefore, since F > F crit, we reject the null hypothesis and conclude that there is a significant difference among 1st, 2nd and 3rd year students in terms of the quality of their social life

Effect Size

• Therefore, .763 (or 76.3%) of the variability in social life scores can be attributed to the year of the program
  – This would definitely be a large effect
  – But ... this is made up data so don’t try too hard to make sense of the results
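
The effect size and the F ratio are tied together algebraically, which gives a quick consistency check. The sketch below assumes that .763 is η² computed after removing the between-subjects variability, and that the degrees of freedom are 2 and 16 as in the F test; the function name is an assumption of this example, and the implied F is approximate because .763 is a rounded value.

```python
def f_from_eta_squared(eta_sq, df_between, df_error):
    """F implied by eta squared = SS_between / (SS_between + SS_error)."""
    # SS_between / SS_error = eta_sq / (1 - eta_sq); dividing each SS by its
    # df turns that ratio of sums of squares into a ratio of mean squares.
    return (eta_sq / (1 - eta_sq)) * (df_error / df_between)


f_implied = f_from_eta_squared(0.763, df_between=2, df_error=16)
print(round(f_implied, 2))
```

Under those assumptions the implied F is well above the critical value of 6.23, in line with the decision to reject the null hypothesis.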

Post Hoc Tests

• Somehow Gravetter & Wallnau forgot to mention how you are supposed to know exactly where the differences among the conditions exist
• In fact, it is very simple because we just use multiple paired t-tests to determine exactly which conditions differ
  – The reason we use a “separate” test (specifically, a separate error term) for each pairwise comparison is that the problems with sphericity can be very severe, but they do not affect “two group” comparisons
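
Each pairwise comparison is an ordinary paired t-test on the difference scores for the two conditions involved, so it uses only the error from that pair. A minimal sketch with hypothetical scores (the function name `paired_t` is an assumption of this example; the p-value would come from a t table or statistical software):

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for two conditions.

    x, y: scores for the same subjects, in the same order.
    """
    d = [b - a for a, b in zip(x, y)]           # difference score per subject
    n = len(d)
    t = mean(d) / (stdev(d) / sqrt(n))          # mean difference / its standard error
    return t, n - 1


# Hypothetical scores for five subjects under two conditions.
condition_a = [3, 5, 4, 6, 2]
condition_b = [5, 6, 7, 8, 4]
t_stat, df = paired_t(condition_a, condition_b)
print(round(t_stat, 2), df)
```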

Post Hoc Tests (cont.)

• 1st year vs 2nd year
  – t(8) = 1.00, p = .347
  – Social lives do not differ between 1st and 2nd year students
• 1st year vs 3rd year
  – t(8) = 7.23, p < .001
  – Social lives are better in 3rd year than in 1st year
• 2nd year vs 3rd year
  – t(8) = 6.93, p < .001
  – Social lives are better in 3rd year than in 2nd year
