12.01.2015 Views

RESEARCH METHOD COHEN ok


REGRESSION ANALYSIS

we know or assume values of the other variable(s)’ (Cohen and Holliday 1996: 88). It is a way of modelling the relationship between variables. We concern ourselves here with simple linear regression and multiple regression (see http://www.routledge.com/textbooks/9780415368780 – Chapter 24, file SPSS Manual 24.7).

Simple linear regression

In simple linear regression the model includes one explanatory variable (the independent variable) and one explained variable (the dependent variable) (see http://www.routledge.com/textbooks/9780415368780 – Chapter 24, file 24.15.ppt). For example, we may wish to see the effect of hours of study on levels of achievement in an examination, to be able to see how much improvement will be made to an examination mark by a given number of hours of study. Hours of study is the independent variable and level of achievement is the dependent variable. In the example in Box 24.29 the independent variable is placed on the vertical axis and the dependent variable on the horizontal axis, although the more usual convention is the reverse (independent variable on the horizontal axis, dependent variable on the vertical axis). In that example we have taken 50 cases of hours of study and student performance, and have constructed a scatterplot to show the distributions (SPSS performs this function at the click of two or three keys). We have also constructed a line of best fit (SPSS will do this easily) to indicate the relationship between the two variables. The line of best fit is the closest straight line that can be constructed to take account of variance in the scores; it strives to have the same number of cases above it and below it and to make each point as close to the line as possible. One can see, for example, that some scores are very close to the line and others are some distance away. There is a formula for its calculation, but we do not explore that here.
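For readers who wish to see how a line of best fit might be calculated, the following is a minimal sketch in Python. It assumes the usual least-squares criterion (the line that minimises the sum of squared distances between the observed and predicted values of the dependent variable), which is our assumption rather than a formula given in this section. The function name `fit_line` and the five data pairs are hypothetical; they are not the 50 cases in Box 24.29.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the x values must not all be equal")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data, invented for illustration only.
hours = [1, 2, 3, 4, 5]        # independent variable: hours of study
marks = [30, 41, 58, 70, 76]   # dependent variable: level of achievement

intercept, slope = fit_line(hours, marks)
print(f"marks = {intercept:.2f} + {slope:.2f} * hours")
```

Here the dependent variable (marks) is regressed on the independent variable (hours), so the slope can be read as the expected gain in marks for each additional hour of study.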

Box 24.29: A scatterplot with the regression line (vertical axis: Hours of study; horizontal axis: Level of achievement)

One can observe that the greater the number of hours spent in studying, generally the greater is the level of achievement. This is akin to correlation. The line of best fit indicates not only that there is a positive relationship, but also that the relationship is strong (the slope of the line is quite steep). However, where regression departs from correlation is that regression provides an exact prediction of the value – the amount – of one variable when one knows the value of the other. One could read off the level of achievement, for example, if one were to study for two hours (43 marks out of 80) or for four hours (72 marks out of 80), though this takes no account of variance. To help here, scatterplots (e.g. in SPSS) can include grid lines (see Box 24.30).
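The reading-off described here can also be done arithmetically. The sketch below assumes that the two readings quoted above (two hours giving 43 marks, four hours giving 72 marks) lie exactly on the regression line, and recovers the equation of the line from them. The helper names `line_through` and `predict_marks` are hypothetical, and the result is only as accurate as the readings taken from the plot.

```python
def line_through(p1, p2):
    """Return (intercept, slope) of the straight line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return intercept, slope


# Readings quoted in the text, assumed to lie exactly on the line:
# (hours of study, marks out of 80).
intercept, slope = line_through((2.0, 43.0), (4.0, 72.0))


def predict_marks(hours):
    """Predicted marks for a given number of hours, ignoring variance."""
    return intercept + slope * hours


print(slope)               # 14.5 marks per additional hour of study
print(intercept)           # 14.0
print(predict_marks(3.0))  # 57.5
```

Under these assumptions each additional hour of study is associated with 14.5 more marks, so three hours of study would be read off as 57.5 marks out of 80.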

It is dangerous to predict outside the limits of the line; simple regression is to be used only to calculate values within the limits of the actual line, and not beyond it. One can observe, also, that though it is possible to construct a straight line of best fit (SPSS does this automatically), some of the data points lie close to the line and some lie a long way from the line. The distances of the data points from the line are termed the residuals, and these would have to be commented on in any analysis (there is a statistical calculation to address this but we do not go into it here).
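Both points can be illustrated with a short sketch in Python. It assumes that a residual is the observed value of the dependent variable minus the value predicted by the line, and it treats the smallest and largest observed hours as the limits of the line; both are our assumptions. The data, the coefficients (18.7 and 12.1, the least-squares fit to these invented data) and the function names are hypothetical.

```python
def residuals(xs, ys, intercept, slope):
    """Observed minus predicted value of the dependent variable for each case."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def predict_within_range(x, xs, intercept, slope):
    """Predict only inside the observed range of x; refuse to extrapolate."""
    lo, hi = min(xs), max(xs)
    if not lo <= x <= hi:
        raise ValueError(f"{x} lies outside the observed range {lo} to {hi}")
    return intercept + slope * x


# Hypothetical data and fitted line, invented for illustration only.
hours = [1, 2, 3, 4, 5]
marks = [30, 41, 58, 70, 76]
intercept, slope = 18.7, 12.1

res = residuals(hours, marks, intercept, slope)
print([round(r, 1) for r in res])  # [-0.8, -1.9, 3.0, 2.9, -3.2]

try:
    predict_within_range(8, hours, intercept, slope)
except ValueError as err:
    print("refused:", err)
```

A negative residual means the case falls below the line (fewer marks than predicted) and a positive residual means it falls above; the request for eight hours is refused because no case in these data studied for that long.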

Where the line strikes the vertical axis is named the intercept. We return to this later, but at this stage we note that the line does not go through the origin but starts a little way up the vertical
