6. SIMPLE LINEAR REGRESSION & CORRELATION
CONTENT
o 8.1 Introduction
o 8.2 Linear Correlation and Simple Linear Line
o 8.3 The Least Square Regression Line
o 8.4 Estimation and Prediction
o 8.5 Inferences in Correlation
o 8.6 Hypothesis Testing for the Slope of Regression Line
OBJECTIVES
o To see what the data look like and how the variables relate to each other.
o To find a mathematical equation that relates a dependent variable y to an independent variable x, and to use it to estimate a new value of y.
o To measure the strength of the linear relationship between the two variables x and y.
8.1 INTRODUCTION
o Suppose you wish to investigate the relationship between a dependent variable (y) and an independent variable (x).
  - Independent variable (x): the variable that is controlled.
  - Dependent variable (y): the response variable.
  - In other words, the value of y depends on the value of x.
Example A
Suppose you wish to investigate the relationship between the number of hours students spent studying for an examination and the marks they achieved.

Student               A   B   C   D   E   F   G   H
Number of hours (x)   5   8   9  10  10  12  13  15
Final mark (y)       49  60  55  72  65  80  82  85
The number of hours a student spends studying for an examination (x, the independent variable) affects the mark the student achieves (y, the dependent variable).
Other examples
  - The weight at the end of a spring (x) and the length of the spring (y)
  - A student's mark in a Statistics test (x) and the mark in a Programming test (y)
  - The diameter of the stem of a plant (x) and the average length of the leaves of the plant (y)
8.2 LINEAR CORRELATION AND SIMPLE LINEAR LINE
o When pairs of values are plotted, a scatter diagram is produced.
[Figure: scatter plot of yield of hay (y) versus amount of water (x)]
o Exercise: Plot a scatter diagram for Example A.
o Linear correlation
  - If the points on the scatter diagram appear to lie near a straight line (the simple regression line), you would say that there is a linear correlation between x and y.
o Exercise: From the scatter diagram for Example A, is there any correlation between x and y?
[Figures: positive linear correlation; negative linear correlation; no correlation (no linear relationship between x and y)]
8.3 THE LEAST SQUARE REGRESSION LINE
o The method of least squares is a mathematical way of fitting the regression line.
o The line of best fit must pass through the means of both sets of data, i.e. the point (\bar{x}, \bar{y}).
[Figure: scatter plot of yield of hay (y) versus amount of water (x), with the least squares regression line of y on x passing through (\bar{x}, \bar{y})]
o Exercise: Find and draw the regression line for Example A.
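As a rough illustration of the exercise, the sketch below fits the line y = a + bx to the Example A data. The slides do not state the slope and intercept formulas, so it assumes the standard least squares results b = S_xy / S_xx and a = \bar{y} - b\bar{x}; the names `hours`, `marks` and `least_squares_line` are chosen here for illustration only.

```python
# Example A data: hours studied (x) and final mark (y) for students A to H.
hours = [5, 8, 9, 10, 10, 12, 13, 15]
marks = [49, 60, 55, 72, 65, 80, 82, 85]


def least_squares_line(x, y):
    """Return (a, b) for the least squares line y = a + b*x.

    Assumes the standard formulas b = S_xy / S_xx and a = y_bar - b * x_bar.
    """
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    # Corrected sum of squares of x and corrected sum of products.
    s_xx = sum(v * v for v in x) - sum_x ** 2 / n
    s_xy = sum(u * v for u, v in zip(x, y)) - sum_x * sum_y / n
    b = s_xy / s_xx                    # slope
    a = sum_y / n - b * (sum_x / n)    # intercept: line passes through the means
    return a, b


a, b = least_squares_line(hours, marks)
print(f"y = {a:.4f} + {b:.4f} x")
```

Because the intercept is computed as \bar{y} - b\bar{x}, the fitted line passes through the point of means by construction.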
8.4 ESTIMATION AND PREDICTION
o The regression line of y on x is used:
o When x is the independent (controlled) variable and you want to
  - estimate y for a given value of x, or
  - estimate x for a given value of y.
o When neither variable is controlled and you want to estimate y for a given value of x.
Example
o Using Example A, find:
  - the estimate of y when x = 10
  - the estimate of x when y = 75
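The two estimates can be sketched as follows. The intercept and slope are copied from the Excel output reproduced later in these notes. Since x (hours studied) is treated as the controlled variable, the same line of y on x is rearranged to estimate x. The constant and function names are illustrative assumptions.

```python
# Coefficients of the regression line of y on x for Example A,
# taken from the Excel summary output in these notes.
INTERCEPT = 26.89259259
SLOPE = 4.059259259


def estimate_y(x):
    """Estimate the mark y for a given number of study hours x."""
    return INTERCEPT + SLOPE * x


def estimate_x(y):
    """Estimate the hours x for a given mark y by rearranging y = a + b*x.

    Appropriate here because x is the controlled variable, so the
    line of y on x is used in both directions.
    """
    return (y - INTERCEPT) / SLOPE


y_at_10 = estimate_y(10)
x_at_75 = estimate_x(75)
print(f"Estimated y when x = 10: {y_at_10:.2f}")
print(f"Estimated x when y = 75: {x_at_75:.2f}")
```

Both x = 10 and y = 75 lie within the range of the observed data, so these are interpolations; estimates outside that range would be less reliable.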
8.5 INFERENCES IN CORRELATION
o The product moment correlation coefficient, r, is a numerical value between -1 and 1 inclusive which indicates how closely the points are scattered about a straight line.

    -1 \le r \le 1

  - r = 1 indicates perfect positive linear correlation
  - r = -1 indicates perfect negative linear correlation
  - r = 0 indicates no linear correlation

    r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}

  where

    S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}

    S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}

    S_{xy} = \sum xy - \frac{\sum x \sum y}{n}
o The nearer the value of r is to 1 or -1, the closer the points on the scatter diagram are to the regression line.
  - r nearer to 1 indicates strong positive linear correlation.
  - r nearer to -1 indicates strong negative linear correlation.
o Exercise: Calculate the correlation coefficient r for Example A.
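One possible way to check the exercise is sketched below. It computes r for Example A directly from the S_xx, S_yy and S_xy formulas of this section; the variable and function names are illustrative only.

```python
import math

# Example A data: hours studied (x) and final mark (y).
hours = [5, 8, 9, 10, 10, 12, 13, 15]
marks = [49, 60, 55, 72, 65, 80, 82, 85]


def product_moment_r(x, y):
    """Product moment correlation coefficient r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    s_xx = sum(v * v for v in x) - sum_x ** 2 / n
    s_yy = sum(v * v for v in y) - sum_y ** 2 / n
    s_xy = sum(u * v for u, v in zip(x, y)) - sum_x * sum_y / n
    return s_xy / math.sqrt(s_xx * s_yy)


r = product_moment_r(hours, marks)
print(f"r = {r:.4f}")
```

The result agrees with the "Multiple R" value in the Excel output in these notes and is close to 1, which suggests a strong positive linear correlation between hours studied and marks.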
8.6 HYPOTHESIS TESTING FOR THE SLOPE OF REGRESSION LINE
o To test the linear relationship between x and y.
o x and y have a linear relationship if the slope \beta \ne 0.
o Test the hypothesis H_0: \beta = 0 against H_1: \beta \ne 0 with the test statistic

    t = \frac{b - \beta}{\sqrt{\mathrm{Var}(b)}} \sim t_{n-2}

  where b is the slope estimated from the sample and

    \mathrm{Var}(b) = \frac{\left( \frac{S_{yy} - b S_{xy}}{n - 2} \right)}{S_{xx}}

o If H_0 is rejected, x and y have a linear relationship.
o Exercise: Test the linearity between x and y for Example A at \alpha = 0.05.
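The test for Example A might be carried out as in the sketch below, which evaluates the statistic under H_0 (\beta = 0). The Python standard library has no t distribution, so the two-sided 5% critical value for 6 degrees of freedom (2.447) is hard-coded from standard t tables; that constant and all names here are assumptions made for illustration.

```python
import math

# Example A data: hours studied (x) and final mark (y).
hours = [5, 8, 9, 10, 10, 12, 13, 15]
marks = [49, 60, 55, 72, 65, 80, 82, 85]

# Two-sided critical value t_{0.025, 6}, read from standard t tables
# (assumption: alpha = 0.05 and n - 2 = 6 degrees of freedom).
T_CRITICAL = 2.447


def slope_t_statistic(x, y):
    """Return t = b / sqrt(Var(b)) for testing H0: beta = 0."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    s_xx = sum(v * v for v in x) - sum_x ** 2 / n
    s_yy = sum(v * v for v in y) - sum_y ** 2 / n
    s_xy = sum(u * v for u, v in zip(x, y)) - sum_x * sum_y / n
    b = s_xy / s_xx
    # Var(b) = residual mean square divided by S_xx.
    var_b = ((s_yy - b * s_xy) / (n - 2)) / s_xx
    return b / math.sqrt(var_b)


t_stat = slope_t_statistic(hours, marks)
reject_h0 = abs(t_stat) > T_CRITICAL
print(f"t = {t_stat:.4f}, reject H0: {reject_h0}")
```

The statistic matches the "t Stat" reported for X Variable 1 in the Excel output, and since it exceeds the critical value, H_0 is rejected at the 5% level.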
LINEAR REGRESSION USING EXCEL
o Type in the data.
o Create a scatter plot.
[Figure: Excel scatter plot of the data (Series1)]
o Click Data Analysis and choose Regression.
o Enter the X and Y data.
Excel output
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.944799501
R Square             0.892646097
Adjusted R Square    0.87475378
Standard Error       4.72163395
Observations         8

ANOVA
             df   SS            MS         F          Significance F
Regression   1    1112.237037   1112.237   49.88991   0.000403286
Residual     6    133.762963    22.29383
Total        7    1246

               Coefficients   Standard Error   t Stat     P-value    Lower 95%     Upper 95%     Lower 95.0%   Upper 95.0%
Intercept      26.89259259    6.122634851      4.392323   0.004606   11.91104484   41.87414034   11.91104484   41.87414034
X Variable 1   4.059259259    0.574698983      7.063279   0.000403   2.65302151    5.465497009   2.65302151    5.465497009
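To see where the headline figures in this output come from, the following sketch recomputes the sums of squares, R Square, Adjusted R Square, Standard Error and F from the Example A data. It assumes the usual decomposition for simple linear regression (regression SS = b S_xy, total SS = S_yy); the function name and dictionary keys are illustrative, not Excel's.

```python
import math

# Example A data: hours studied (x) and final mark (y).
hours = [5, 8, 9, 10, 10, 12, 13, 15]
marks = [49, 60, 55, 72, 65, 80, 82, 85]


def regression_summary(x, y):
    """Recompute selected quantities of the summary output for one predictor."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    s_xx = sum(v * v for v in x) - sum_x ** 2 / n
    s_yy = sum(v * v for v in y) - sum_y ** 2 / n
    s_xy = sum(u * v for u, v in zip(x, y)) - sum_x * sum_y / n
    b = s_xy / s_xx
    ss_total = s_yy                  # total sum of squares
    ss_regression = b * s_xy         # explained by the fitted line
    ss_residual = ss_total - ss_regression
    ms_residual = ss_residual / (n - 2)
    r_square = ss_regression / ss_total
    return {
        "ss_regression": ss_regression,
        "ss_residual": ss_residual,
        "ss_total": ss_total,
        "r_square": r_square,
        "adjusted_r_square": 1 - (1 - r_square) * (n - 1) / (n - 2),
        "standard_error": math.sqrt(ms_residual),
        "f": ss_regression / ms_residual,  # regression has 1 degree of freedom
    }


summary = regression_summary(hours, marks)
for name, value in summary.items():
    print(f"{name}: {value:.6f}")
```

With a single predictor, F equals the square of the t statistic for the slope, which is why Significance F and the slope's P-value coincide in the output.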
CONCLUSION
o This chapter introduces regression, an important method for making inferences about the relationship between two variables and for describing that relationship with an equation that can be used to predict the value of one variable given the value of the other.
Thank You