06.08.2013 Views

Introduction to regression


Further Mathematics

THINK 1: Use a graphics calculator to find the equation of the least-squares regression line. The value of r may be found using VARS, 5:Statistics, EQ and 7:r, and then ENTER, or by choosing DiagnosticOn in the CATALOG.

WRITE/DISPLAY 1: y = 206.25x + 202.5

Note: r indicates a strong positive linear association, and from r² we see that 99% of the variation in the number of bacteria can be explained by the number of days over which the experiment ran. This means that this data set provides a very good predictor of the number of bacteria present.

THINK 2: Interpret the results. The rate at which the bacteria are growing is given by the gradient of the least-squares regression line. The number of bacteria at the start of the experiment is given by the y-intercept of the least-squares regression line. As can be observed from the graph, the number of bacteria when time = 0 (about 200) can be seen as the y-intercept of the graph, and the daily rate of increase (about 200) is the gradient of the straight line.

WRITE/DISPLAY 2: m is 206.25, hence the bacteria are growing at a rate of approximately 206 per day. The y-intercept is 202.5, hence the initial number of bacteria present was approximately 202.

Interpolation and extrapolation

As we have already observed, any linear regression method produces a linear equation in the form:

y = (gradient) × x + (y-intercept)

or y = mx + b

This line can be used to ‘predict’ data values for a given value of x. Of course, these are only approximations, since the regression line itself is only an estimate of the ‘true’ relationship between the bivariate data. However, they can still be used, in some cases, to provide additional information about the data set (that is, to make predictions).

There are two types of prediction: interpolation and extrapolation.
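As a minimal sketch of how the fitted line is used, the Python snippet below evaluates the worked example's equation y = 206.25x + 202.5 for a chosen number of days. The function name `predict_bacteria` and the query value of 2.5 days are assumptions made for illustration only. The original data set is not reproduced here, so whether 2.5 days counts as interpolation or extrapolation depends on the range of days actually observed.

```python
# Coefficients of the least-squares line from the worked example:
# y = 206.25x + 202.5, with x in days and y the number of bacteria.
GRADIENT = 206.25
Y_INTERCEPT = 202.5


def predict_bacteria(days: float) -> float:
    """Estimate the bacteria count from the regression line y = m*x + b.

    Hypothetical helper name, introduced only for this sketch.
    """
    return GRADIENT * days + Y_INTERCEPT


# At x = 0 the prediction is just the y-intercept (the initial count).
initial = predict_bacteria(0)

# Moving forward one day changes the prediction by exactly the gradient.
daily_increase = predict_bacteria(1) - predict_bacteria(0)

# Illustrative query only: 2.5 days is not a value taken from the source data.
estimate = predict_bacteria(2.5)

print(initial, daily_increase, estimate)  # 202.5 206.25 718.125
```

The first two results restate the interpretation of the y-intercept and the gradient. The third is a prediction, and like any value read from the regression line it is only an approximation.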

Interpolation

Interpolation is the use of the regression line to predict values ‘in between’ two values already in the data set. If the data are highly linear (r near +1 or −1) then we can be confident that our interpolated value is quite accurate. If the data are not highly linear (r near 0) then our confidence is duly reduced. For example, medical information
