
Chapter 6:
ANALYSIS OF VARIANCE

Upon completion of this chapter, you should be able to:

define the OneWay ANOVA
explain the logic of the OneWay ANOVA
compute the OneWay ANOVA using a formula
identify the assumptions for using the OneWay ANOVA
compute the OneWay ANOVA using SPSS
interpret the OneWay ANOVA output using an online calculator

CHAPTER OVERVIEW

About the Oneway ANOVA test
The logic of the Oneway ANOVA
Computing the F-test
Assumptions for using the Oneway ANOVA
Example: Using SPSS to compute the Oneway ANOVA

Chapter 1: Introduction
Chapter 2: Descriptive Statistics
Chapter 3: The Normal Distribution
Chapter 4: Hypothesis Testing
Chapter 5: T-test
Chapter 6: Oneway Analysis of Variance
Chapter 7: Correlation
Chapter 8: Chi-Square

This chapter introduces you to the Oneway ANOVA, beginning with its logic and the formula behind the statistical tool. You do not need to memorise the formula, but rather understand what it means. The tool is often used to test the differences between two or more means. You should give due consideration to the assumptions underlying the use of the statistical tool. SPSS provides a convenient and quick way to compute the F-statistic, which will tell you whether the differences between the means are significant.



About the ANOVA Test

In educational research, we are often involved in finding out whether there are differences between groups. For example, is there a difference between males and females, or between rural and urban students, and so forth? As we discussed in Chapter 5, the t-test is used to compare the means of two groups, such as comparing outcomes between a control group and a treatment group in an experimental study. Suppose you are interested in comparing the means of three groups (i.e. k = 3) rather than two.

You might be tempted to use multiple t-tests and compare the means separately; i.e. you compare the means of Groups 1 and 2, followed by Groups 1 and 3, and so forth. What is the danger of doing this? Running multiple t-tests increases the likelihood of committing a Type 1 error (i.e. claiming that two means are not equal when in fact they are equal). In other words, you reject a null hypothesis when it is TRUE. On a practical level, using the t-test to compare many means is also a cumbersome process in terms of the calculations involved.
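To get a feel for how quickly this risk grows, here is a rough sketch (not from the original text) of the familywise Type 1 error rate when several independent t-tests are each run at alpha = 0.05; the exact figure depends on how the comparisons are correlated.

# Rough illustration: familywise Type 1 error for m independent tests at alpha = 0.05.
# With k = 3 groups there are m = 3 pairwise t-tests; with k = 4 groups, m = 6.
alpha = 0.05
for m in (1, 3, 6):
    familywise = 1 - (1 - alpha) ** m
    print(m, round(familywise, 3))   # familywise error grows from 0.05 to about 0.143 and 0.265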

EXAMPLE:

Let us look at an example which shows the results of a study on Attitude Towards Homework among students of varying ability levels. Subjects were divided into three groups: High Ability, Average Ability and Low Ability. The total sample size is 505 students. To compare the three means you need a special class of statistical techniques called the OneWay Analysis of Variance or OneWay ANOVA, which we will discuss here.

ATTITUDES TOWARDS HOMEWORK AMONG 14 YEAR OLD STUDENTS

Group             N     Mean    Std. Dev.   Std. Error   95 Pct Conf. Int for Mean
High ability      220   13.03   3.17        0.12         12.79 to 13.27
Average ability   212   11.99   2.93        0.11         11.77 to 12.21
Low ability       73    9.54    3.50        0.40         8.73 to 10.36

Interpretation of the Table Above

What do the three Means tell you? High ability students have the highest mean (13.03), while low ability students have the lowest mean (9.54). Average ability students fall in the middle, with a mean of 11.99.

What do the three Standard Deviations tell you? Note that the standard deviations for high ability (3.17) and average ability (2.93) students are fairly close, while low ability students have a somewhat bigger standard deviation of 3.50.

What do the three Standard Errors tell you? Refer to the table and you will notice that there is a column called 'standard error'. What is the standard error? The standard error is a measure of how much the sample mean would vary if you were to take repeated samples from the same population. Since the first two groups contain more than 200 students each, the standard error of the mean for each of these groups is fairly small: it is 0.12 for high ability students and 0.11 for average ability students. However, the standard error for the low ability group is comparatively high at 0.40. Why? The smaller number of low ability students (n = 73) and the larger standard deviation explain why the standard error is larger.

What does '95 Pct Conf. Int for Mean' mean? The last column displays the 'confidence interval'. What is the confidence interval? It is the range which is likely to contain the true population value or mean. If you take repeated samples of 14 year old students from the same population of 14 year old students in the country and calculate an interval for each sample, about 95% of those intervals should include the unknown population value or mean. For example, you can be 95% confident that, in the population, the mean of high ability students is somewhere between 12.79 and 13.27. Similarly, you can be 95% confident that, in the population, the mean of low ability students is somewhere between 8.73 and 10.36.

You will notice that the confidence interval is wider for low ability students (a width of 1.63) compared to the confidence interval for high ability students (a width of 0.48). Why? This is due to the larger standard error (0.40) obtained for low ability students. Since the confidence interval depends on the standard error of the mean, the confidence interval for low ability students is wider than for high ability students. So, the larger the standard error, the wider the confidence interval. Makes sense, doesn't it!
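As a quick check, a 95% confidence interval can be approximated as the mean plus or minus 1.96 standard errors. Here is a minimal sketch using the high ability figures from the table above (the published intervals may have been computed with the t distribution, so the last digits can differ slightly):

# Approximate 95% confidence interval: mean +/- 1.96 * standard error
mean, se = 13.03, 0.12                      # high ability group, from the table above
lower, upper = mean - 1.96 * se, mean + 1.96 * se
print(round(lower, 2), round(upper, 2))     # about 12.79 and 13.27, matching the table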

At the heart of ANOVA is the concept of VARIANCE. What is variance? Most of you would say it is the standard deviation squared! Yes, that is correct. The focus is on two types of variance:

Between-Group Variance: if there are THREE groups, it is the variance between the means of the three groups.
Within-Group Variance: if there are 30 subjects in each group, it is the variance of the scores of the subjects within that group.

The F-value is a ratio of the Between-Group Variance and the Within-Group Variance.



If the F-value is significant, it tells us that the population means are probably NOT ALL EQUAL and you reject the null hypothesis. Next, you have to locate where the significance lies, or which of the means are significantly different. You have to use post-hoc analysis to determine this.

LEARNING ACTIVITY

a) When would you use the Oneway ANOVA and not the t-test to compare means?
b) What is the standard error? Why does the standard error vary?
c) Explain "95 Pct Conf. Int for Mean".

The Logic of the ONEWAY ANOVA

A researcher was interested in finding out whether there are differences in creative thinking among 12 year old students from different socio-economic backgrounds. Creative thinking was measured using the Torrance Test of Creative Thinking, consisting of 5 items, while socio-economic status (SES) was measured using household income. Socio-economic status or SES was divided into 3 groups (High, Middle and Low). The null hypothesis generated is that all three groups will have the same mean score on the creativity test. In formula terms, if we use the symbol μ [pronounced 'mew'] to represent the average score, the null hypothesis is expressed through the following notation:

The NULL HYPOTHESIS is represented as follows:

Ho: μ1 = μ2 = μ3

(Figure: bar chart showing the same mean of 4.00 for the High SES, Middle SES and Low SES groups under the null hypothesis)



The null hypothesis states that the mean creativity score of the High SES, Middle SES and Low SES students is the same; i.e. it is equal to 4.00.

To test the null hypothesis, the Oneway Analysis of Variance is used. The Oneway ANOVA is a statistical technique used to test the null hypothesis that several population means are equal. The word 'variance' is used because the technique examines the variability in the samples; in other words, how much the scores of individual students vary from the mean. Based on this variability or variance, it determines whether there is reason to believe that the population means are not equal. In our example, does creativity vary between the three groups of 12 year old students?

The ALTERNATIVE HYPOTHESIS is represented as follows:

Ha: μ1 ≠ μ2 ≠ μ3

(Figure: bar chart of the sample means: High SES 4.12, Middle SES 4.37, Low SES 3.85; overall mean 4.00)

The alternative hypothesis states that there is a difference between the three groups of students (see the Figure above). However, the alternative hypothesis does not state which groups differ from one another. It just says that the means of the groups are not all the same; at least one of the groups differs from the others.

Are the means really different? We need to figure out whether the observed differences in the sample means can be attributed simply to the natural variability among sample means, or whether there is reason to believe that the three groups of students have different means in the population. In other words, are the differences due to chance, or is there a 'real' difference?



LEARNING ACTIVITY

a) What does the word 'variance' mean in the Oneway Analysis of Variance?
b) Why would you use the Oneway ANOVA rather than the t-test?

BETWEEN GROUP AND WITHIN GROUP VARIANCE

As mentioned earlier, the researcher was interested in determining whether there were differences in creativity between students from different socio-economic backgrounds; i.e. High SES, Middle SES and Low SES. To determine if there are significant differences between the three means, you have to compute the F-ratio or F-test. To compute the F-ratio you have to use two types of variances:

Between Group Variance, or the variability between the group means.
Within Group Variance, or the variability of the observations (or scores) within a group (around that group's mean).

a) Between Group Variance

The figure above presents the results of the study. Let us look more closely at the two types of variability or variance. Note that each of the three groups has a mean, which is also known as the sample mean.

The High SES group has a mean of 4.12 for the creativity test.
The Middle SES group has a mean of 4.37 for the creativity test.
The Low SES group has a mean of 3.85 for the creativity test.

Between group variance is a measure of how much these group means vary around the overall mean of the sample.

b) Within Group Variance

Within group variance or variability is a measure of how much the observations or scores within a group vary. It is simply the variance of the observations or scores within a group or sample, and it is used to estimate the variance within a group in the population. Remember, ANOVA requires the assumption that all of the groups have the same variance in the population. Since you do not know if all of the groups have the same mean, you cannot just calculate the variance for all of the cases together. You must calculate the variance for each of the groups individually and then combine these into an "average" variance.

Within group variance for the example reflects the fact that the 313 students within the High SES group have different scores, the 297 students within the Middle SES group have different scores, and the 340 students within the Low SES group also have different scores. Among the three groups, there is slightly greater variability or variance among Low SES subjects (SD = 1.31) compared to High SES subjects, with an SD of 1.28.

LEARNING ACTIVITY

a) What do you mean by "between group variance"?
b) What do you mean by "within group variance"?

Computing the F-Test

The F-test or F-ratio is a measure of how different the means are relative to the variability or variance within each sample. The larger the F-value, the greater the likelihood that the differences between the means are due to something other than chance alone; i.e. there are real effects, or the means are significantly different from one another. Below is the summarised formula for computing the F-statistic or F-ratio:

F = Between Mean Squares / Within Mean Squares

Based on the study of the relationship between creativity and the socio-economic status of subjects (results summarised below), computation of the F-statistic is as follows:

          High SES   Middle SES   Low SES
Mean =    4.12       4.37         3.85
SD =      1.28       1.30         1.31
n =       313        297          340



STEPS FOR COMPUTING THE F STATISTIC OR F-RATIO:

Step 1:
Computation of the Between Sum of Squares (BSS)

The first step is to calculate the variation between groups by comparing the mean of each SES group with the mean of the overall sample (the mean score on the test for all students in this sample is 4.00).

BSS = n1(x̄1 – x̄)² + n2(x̄2 – x̄)² + n3(x̄3 – x̄)², where x̄1, x̄2 and x̄3 are the group means and x̄ is the overall mean

This measure of between-group variance is referred to as the "Between Sum of Squares" (or BSS). It is calculated by adding up, across the 3 groups, the squared difference between each group's mean and the overall mean (4.00), multiplied by the number of cases (i.e. n) in that group.

Between Sum of Squares (BSS)
= No. of students in Group 1 x (Mean of Group 1 – Overall Mean)² + No. of students in Group 2 x (Mean of Group 2 – Overall Mean)² + No. of students in Group 3 x (Mean of Group 3 – Overall Mean)²
= 313 (4.12 – 4.00)² + 297 (4.37 – 4.00)² + 340 (3.85 – 4.00)²
= 4.51 + 40.66 + 7.65
= 52.82

Degrees of freedom:

This sum of squares has a number of degrees of freedom equal to the number of groups minus 1. In this case, dfB = (3 - 1) = 2.

Step 2:
Computation of the Between Mean Squares (BMS)

Between Mean Squares (BMS) = BSS / dfB = 52.82 / 2 = 26.41

Divide the BSS figure (52.82) by the number of degrees of freedom (2) to get our estimate of the variation between groups, referred to as the "Between Mean Squares".



Step 3:
Computation of the Within Sum of Squares (WSS)

To measure the variation within groups, we find the sum of the squared deviations between the scores on the Torrance creativity test and the group mean, calculating a separate measure for each group and then summing the group values. This sum is referred to as the "Within Sum of Squares" (or WSS). Using the group standard deviations, it is computed as:

WSS = (n1 – 1) SD1² + (n2 – 1) SD2² + (n3 – 1) SD3²

Within Sum of Squares (WSS)
= (No. of students in Group 1 minus 1) x SD of Group 1² + (No. of students in Group 2 minus 1) x SD of Group 2² + (No. of students in Group 3 minus 1) x SD of Group 3²
= (313 – 1) 1.28² + (297 – 1) 1.30² + (340 – 1) 1.31²
= 511.18 + 500.24 + 581.76
= 1593.18

Degrees of freedom:

As in Step 1, we need to adjust the WSS to transform it into an estimate of the population variance, an adjustment that involves the number of degrees of freedom within groups. To calculate this, we take the number of cases in the total sample (N = 950) minus the number of groups (k = 3), i.e. dfW = 950 - 3 = 947.

Step 4:
Computation of the Within Mean Squares (WMS)

Divide the WSS figure (1593.18) by the degrees of freedom (N - k = 947) to get an estimate of the variation within groups, referred to as the "Within Mean Squares":

Within Mean Squares (WMS) = WSS / dfW = 1593.18 / 947 = 1.68



Step 5:
Computation of the F test statistic

This calculation is relatively straightforward. Simply divide the Between Mean Squares (BMS), the value obtained in Step 2, by the Within Mean Squares (WMS), the value calculated in Step 4.

F = Between Mean Squares / Within Mean Squares = 26.41 / 1.68 = 15.7
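If you want to check this hand computation with software, here is a minimal sketch that reproduces Steps 1 to 5 from the group summary statistics. It assumes Python with scipy is available; SPSS output for a similar analysis is shown later in the chapter.

from scipy import stats

# (n, mean, SD) for the High, Middle and Low SES groups, from the table above
groups = [(313, 4.12, 1.28), (297, 4.37, 1.30), (340, 3.85, 1.31)]
grand_mean = 4.00                    # overall sample mean as given in the text
N = sum(n for n, _, _ in groups)     # 950
k = len(groups)                      # 3

bss = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)   # Between Sum of Squares
wss = sum((n - 1) * sd ** 2 for n, _, sd in groups)          # Within Sum of Squares
bms = bss / (k - 1)                  # Between Mean Squares
wms = wss / (N - k)                  # Within Mean Squares
F = bms / wms
p = stats.f.sf(F, k - 1, N - k)      # right-tail p-value from the F distribution
print(round(F, 1), p)                # F is about 15.7; p is far below 0.05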

Step 6:
To Reject or Not to Reject the Null Hypothesis

To determine if the F statistic is sufficiently large to reject the null hypothesis, you have to determine the critical value for the F statistic by referring to the F distribution. There are two degrees of freedom:

k - 1, which is the numerator df [i.e. 3 groups minus 1 = 3 - 1 = 2]
N - k, which is the denominator df [i.e. number of subjects minus number of groups = 950 - 3 = 947]

The critical value is 3.070, read at 2 df (numerator) by 120 df (denominator). The distribution provided in most textbooks has a maximum of 120 df; you use the 120 df row for any denominator df exceeding 120.

Extract from Table of Critical Values for the F Distribution

df2     df1 = 1   df1 = 2   df1 = 3   df1 = 4
96      3.940     3.091     2.699     2.466
97      3.939     3.090     2.698     2.465
98      3.938     3.089     2.697     2.465
99      3.937     3.088     2.696     2.464
100     3.936     3.087     2.696     2.463
120     3.920     3.070     2.680     2.450
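If statistical software is available, the exact critical value for 2 and 947 degrees of freedom can be looked up directly rather than read from the 120 df row. A minimal sketch, assuming scipy:

from scipy import stats
print(stats.f.ppf(0.95, 2, 947))   # about 3.00, slightly below the 120-df table value of 3.070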



Finally, compare the F-statistic (15.7) with the critical value of 3.07. At p = 0.05, the F-statistic is larger (>) than the critical value, and hence there is strong evidence to reject the null hypothesis, indicating that there is a significant difference in creativity among the three groups of students. While the F-statistic assesses the null hypothesis of equal means, it does not address the question of which means are different. For example, all three groups may differ significantly, or two may be equal but differ from the third. To establish which of the 3 groups are different, you have to follow up with post hoc comparisons or tests.

Step 7:
Post Hoc Comparisons or Tests

There are many techniques available for post hoc comparisons, including the following:

Least significant difference (LSD)
Duncan
Dunnett
Tukey's honestly significant difference (HSD)
Scheffe

Tukey's HSD

          Mean1   Mean2   Mean3
Mean1
Mean2
Mean3     *

Tukey's HSD runs a series of post hoc tests, which are like a series of t-tests. However, the post hoc tests are more stringent than the regular t-tests. The HSD indicates how large an observed difference must be for the multiple comparison procedure to call it significant: any absolute difference between means has to exceed the value of the HSD to be statistically significant.

Most statistical programmes will give you an output in the form of a table like the one shown above, with the group means listed as a matrix. An asterisk (*) indicates which pairs of means are significantly different.

Note that only the mean of Group 3 is significantly different from the mean of Group 1. In other words, High SES subjects (M = 4.12) scored significantly higher on creativity than Low SES subjects (M = 3.85). There was no significant difference between High SES and Middle SES subjects, nor was there a significant difference between Middle SES and Low SES subjects.
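For reference, the same kind of pairwise comparison can be produced outside SPSS with the statsmodels library in Python. The sketch below uses a handful of made-up creativity scores purely to illustrate the call; it is not the study's raw data, and it assumes statsmodels and numpy are installed.

import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical toy scores, only to show how the function is called
scores = np.array([4, 5, 3, 6, 5, 4, 5, 6, 4, 3, 2, 4, 3, 5, 4])
ses = np.array(['high'] * 5 + ['middle'] * 5 + ['low'] * 5)

result = pairwise_tukeyhsd(endog=scores, groups=ses, alpha=0.05)
print(result)   # table of pairwise mean differences, adjusted p-values and reject decisions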



Assumptions for Using the ONEWAY ANOVA

Just like all statistical tools, certain assumptions have to be met before the tool can be used. The following are the assumptions for using the One-Way ANOVA:

A) Independent Observations or Subjects:

Are the observations in each of the groups independent? This means that the data must be independent. In other words, a particular subject should belong to only one group. If there are three groups, they should be made up of separate individuals so that the data are truly independent.

If the same subjects are tested twice, such as in the case of a pretest and posttest design, you should instead use the Repeated Measures One-Way ANOVA [not discussed in this course].

B) Simple Random Samples:

The samples taken from the population under consideration are randomly selected (refer to Chapter 1 for random selection techniques).

C) Normal Populations:

For each population, the variable under consideration is normally distributed (refer to Chapter 2 for techniques to determine normality of a distribution). In other words, to use the OneWay ANOVA you have to ensure that the distribution for each of the groups is normal. The analysis of variance is robust if each of the distributions is symmetric, or if all the distributions are skewed in the same direction. This assumption can be tested by running several normality tests.

1. Normality Tests Using Skewness

Independent variable group             Statistic   Std. Error
Group 1                    Mean        43.82       2.20
                           Skewness    .973        .491
                           Kurtosis    .341        .953
Group 2                    Mean        60.14       2.71
                           Skewness    -.235       .597
                           Kurtosis    -1.066      1.154
Group 3                    Mean        64.75       3.61
                           Skewness    -.407       .564
                           Kurtosis    -1.289      1.091



Refer to the table above, which shows the mean, skewness and kurtosis for each of the three groups. The skewness and kurtosis values indicate that the scores in Group 2 and Group 3 are normally distributed, while there is some positive skewness in Group 1.

2. Normality Tests Using the Kolmogorov-Smirnov Statistic

Tests of Normality

                       Kolmogorov-Smirnov (a)        Shapiro-Wilk
Independent variable   Statistic   df   Sig.         Statistic   df   Sig.
group
Group 1                .206        22   .016         .912        22   .055
Group 2                .166        14   .200(*)      .940        14   .442
Group 3                .151        16   .200(*)      .900        16   .084

(*) This is a lower bound of the true significance.
a. Lilliefors Significance Correction

The Shapiro-Wilk normality tests indicate that the scores are normally distributed in each of the three conditions. The Kolmogorov-Smirnov statistic is significant for Group 1, but that statistic is more appropriate for larger sample sizes.
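Outside SPSS, the same kind of normality check can be run per group with scipy. A minimal sketch with hypothetical scores (not the data behind the table above), assuming scipy is installed:

from scipy import stats

group1 = [43, 41, 48, 52, 39, 44, 47, 40]           # hypothetical scores for one group
print(stats.skew(group1), stats.kurtosis(group1))   # sample skewness and excess kurtosis
print(stats.shapiro(group1))                        # Shapiro-Wilk statistic and p-value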

D) Homogeneity of Variance:

Test of Homogeneity of Variances
Dependent variable

Levene Statistic   df1   df2   Sig.
2.284              2     49    .113

Just as you did with the t-test, Levene's test of homogeneity of variance is used for the OneWay ANOVA, and it is shown in the table above. The p-value, which is 0.113, is greater than the alpha of 0.05. Hence, it can be concluded that the variances are homogeneous, which is reported as Levene (2, 49) = 2.28, p = .113.
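Levene's test is also available in scipy as stats.levene. A minimal sketch with hypothetical group scores (not the data behind the table above), assuming scipy is installed:

from scipy import stats

g1 = [12, 15, 11, 14, 13]      # hypothetical scores, one list per group
g2 = [10, 14, 12, 16, 11]
g3 = [13, 9, 15, 12, 14]
stat, p = stats.levene(g1, g2, g3)
print(stat, p)   # p > 0.05 would suggest the equal-variance assumption is reasonable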



LEARNING ACTIVITY

a) When would you use the Oneway ANOVA and not the t-test to compare means?
b) What are the assumptions that must be met when using ANOVA?

EXAMPLE: Using SPSS to Compute the Oneway ANOVA

In the COPs study in 2006, a team of researchers administered an Inductive Reasoning Test to a sample of 946 eighteen year old Malaysians. One of the independent variables examined was socio-economic status (SES). There were FOUR SES groups: very high SES, high SES, middle SES and low SES. The researchers were interested in answering this research question:

Research Question:
Is there a significant difference in inductive reasoning ability between adolescents of different socio-economic status?

Null Hypothesis: Ho: μ1 = μ2 = μ3 = μ4
Alternative Hypothesis: Ha: μ1 ≠ μ2 ≠ μ3 ≠ μ4



Procedure for the Oneway ANOVA with Post-hoc Analysis Using SPSS

1. Select the Analyze menu.
2. Click on Compare Means and then One-Way ANOVA... to open the One-Way ANOVA dialogue box.
3. Select the dependent variable (i.e. inductive reasoning) and click on the arrow button to move the variable into the Dependent List: box.
4. Select the independent variable (i.e. SES) and click on the arrow button to move the variable into the Factor: box.
5. Click on the Options... command pushbutton to open the One-Way ANOVA: Options sub-dialogue box.
6. Click on the check boxes for Descriptive and Homogeneity-of-variance.
7. Click on Continue.
8. Click on the Post Hoc... command pushbutton to open the One-Way ANOVA: Post Hoc Multiple Comparisons sub-dialogue box. You will notice that a number of multiple comparison options are available. In this example you will use the Tukey's HSD multiple comparison test.
9. Click on the check box for Tukey.
10. Click on Continue and then OK.

#1. Testing for Homogeneity of Variance

Before you conduct the Oneway ANOVA, you have to make sure that your data meet the relevant assumptions for using the Oneway ANOVA. Let's first look at the test of homogeneity of variances, since satisfying this assumption is necessary for interpreting ANOVA results.

Levene's test for homogeneity of variances assesses whether the population variances for the groups are significantly different from each other. The null hypothesis states that the population variances are equal:

Ho: σ1² = σ2² = σ3² = σ4²

Levene's Test for Homogeneity of Variance

F      df1   df2   Sig.
.383   3     942   .765

The table above is the SPSS output for Levene's test. Note that the Levene F statistic has a value of 0.383 and a p-value of 0.765. Since p is greater than α = 0.05 (i.e. 0.765 > 0.05), we do not reject the null hypothesis. Hence, we can conclude that the data do not violate the homogeneity-of-variance assumption.

#2. Means and Standard Deviations

Another SPSS output is the "Descriptives" table, which presents the means and standard deviations of each group (see table below). You will notice that the means are not all the same. This relatively simple conclusion, however, actually raises more questions. See if you can answer these questions:

DESCRIPTIVES: INDUCTIVE

                N     Mean     Std. Deviation   Std. Error
Low SES         258   8.0194   3.4228           .2131
Middle SES      304   8.4967   3.2680           .1874
High SES        214   8.7850   3.3735           .2306
Very High SES   170   8.8706   3.3382           .2560

Is 'middle SES' (M = 8.49) different from 'low SES' (M = 8.01)?
Is 'high SES' (M = 8.78) different from 'middle SES' (M = 8.49)?
Is 'very high SES' (M = 8.87) different from 'low SES' (M = 8.01)?

As you may have realised, just by looking at the 'Descriptives' table, the group means alone cannot tell us decisively whether significant differences exist. What is the next step?

#3. Significant Differences

Having established that the assumption of homogeneity of variance has been met, and having computed the means and standard deviations of each of the four groups, the next step is to determine whether SES influences inductive reasoning. You are seeking to establish whether the four means are 'equal'.

                 Sum of squares   df    Mean squares   F       Sig.
Between groups   100.334          3     33.445         3.021   .029
Within groups    10430.165        942   11.072
Total            10530.499        945

What does the table above indicate?

The 'Between groups' row shows that the df is 3 (i.e. k - 1 = 4 - 1 = 3) and the mean square is 33.445.
The 'Within groups' row shows that the df is 942 (N - k = 946 - 4 = 942) and the mean square is 11.072.
If you divide 33.445 by 11.072 you will get the F value of 3.021, which is significant at p = 0.029.
Since 0.029 is less than α = 0.05, we can reject the null hypothesis and accept the alternative hypothesis. You can conclude that there is a significant difference in inductive reasoning between the four SES groups. But which groups differ?

#4. Multiple Comparisons

Having obtained a significant result, you can go further and determine, using a post-hoc test, where the significance lies. There are many different kinds of post-hoc tests that examine which means are different from each other. One commonly used procedure is Tukey's Honestly Significant Difference (HSD) test. The Tukey test compares all pairs of group means, and the results are shown in the 'Multiple Comparisons' table below.



Dependent Variable: INDUCTIV
Tukey HSD

(I) SES   (J) SES   Mean Difference (I-J)   Std. Error   Sig.
1.00      2.00      -.4773                  .2817        .326
          3.00      -.7657                  .3077        .062
          4.00      -.8512                  .3287        .047
2.00      1.00      .4773                   .2817        .326
          3.00      -.2883                  .2969        .766
          4.00      -.3739                  .3187        .644
3.00      1.00      .7657                   .3077        .062
          2.00      .2883                   .2969        .766
          4.00      -8.554E-02              .3419        .995
4.00      1.00      .8512                   .3287        .047
          2.00      .3739                   .3187        .644
          3.00      8.554E-02               .3419        .995

1 = Low SES
2 = Middle SES
3 = High SES
4 = Very High SES

Note that each mean is compared with every other mean twice, so the results are essentially repeated in the table. Interpreting the table reveals that there is a significant difference ONLY between 'Low SES' subjects (Mean = 8.01) and 'Very High SES' subjects (Mean = 8.87), at p = 0.047; i.e. Very High SES subjects scored significantly higher on inductive reasoning than Low SES subjects. There are no significant differences between the other groups.



LEARNING ACTIVITY

A study was conducted to determine the effectiveness of the collaborative method in teaching primary school mathematics among pupils of varying ability levels. The performance of 18 pupils on a mathematics posttest is presented below:

Low Ability Pupils   Middle Ability Pupils   High Ability Pupils
45                   55                      59
58                   42                      54
61                   41                      62
59                   48                      57
49                   36                      48
63                   44                      65

Compute a OneWay ANOVA and, based on the output, answer the following questions (a short software check is sketched after the questions):

a) Comment on the mean and standard deviation for the three groups.
b) Is there a significant difference in mathematics performance between students of different ability levels?
c) What is the p value?
d) What is the F ratio or F value?
e) Interpret the Tukey HSD.
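If you would like to check your hand or SPSS computation, a minimal sketch using the scores above and scipy (assumed available):

from scipy import stats

low = [45, 58, 61, 59, 49, 63]
middle = [55, 42, 41, 48, 36, 44]
high = [59, 54, 62, 57, 48, 65]
print(stats.f_oneway(low, middle, high))   # F statistic and p-value for the one-way ANOVA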



LEARNING ACTIVITY

A researcher conducted a study to assess the level of knowledge possessed by university students regarding their rights and responsibilities as citizens. Students completed a standardised test, and each student's major was also recorded. The data, in terms of percent correct, are recorded below for 32 students. Compute the OneWay ANOVA test for the data provided below.

Education   Management   Social Science   Computer Science
62          7            42               80
81          49           52               57
75          63           31               87
58          68           80               64
67          39           22               28
48          79           71               29
26          40           68               62
36          15           76               45

Based on the output, answer the following questions:

1. What is your computed answer?
2. What would be the null hypothesis in this study?
3. What would be the alternate hypothesis?
4. What probability level did you choose and why?
5. What were your degrees of freedom?
6. Is there a significant difference between the four groups?
7. Interpret your answer.
8. If you have made an error, would it be a Type I or a Type II error? Explain your answer.

--------000----------
