25.11.2014 Views

Biostatistics


9.4 EVALUATING THE REGRESSION EQUATION

FIGURE 9.4.2 Population conditions relative to X and Y that may cause rejection of the null hypothesis that $\beta_1 = 0$. (a) The relationship between X and Y is linear and of sufficient strength to justify the use of a sample regression equation to predict and estimate Y for given values of X. (b) A linear model provides a good fit to the data, but some curvilinear model would provide an even better fit. [Two panels, (a) and (b), each plotting Y against X.]

Unexplained Deviation Finally, we measure the vertical distance of the observed point from the regression line to obtain $(y_i - \hat{y}_i)$, which is called the unexplained deviation, since it represents the portion of the total deviation not “explained” or accounted for by the introduction of the regression line. These three quantities are shown for a typical value of Y in Figure 9.4.4. The difference between the observed value of Y and the predicted value of Y, $(y_i - \hat{y}_i)$, is also referred to as a residual. The set of residuals can be used to test the underlying linearity and equal-variances assumptions of the regression model described in Section 9.2. This procedure is illustrated at the end of this section.
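As a minimal sketch of how residuals might be computed, the following fits a least-squares line to a small data set and returns $(y_i - \hat{y}_i)$ for each point. The data values and the helper names `fit_line` and `residuals` are hypothetical and chosen only for illustration; they do not come from the text.

```python
from statistics import mean

def fit_line(x, y):
    """Return the least-squares (intercept, slope) for a simple linear regression."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def residuals(x, y):
    """Return the residuals y_i - y_hat_i from the fitted least-squares line."""
    b0, b1 = fit_line(x, y)
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

res = residuals(x, y)
for xi, ri in zip(x, res):
    print(f"x = {xi:.1f}  residual = {ri:+.3f}")
```

Listing the residuals against x is only a crude stand-in for a residual plot. A systematic pattern, such as a curve or a spread that widens with x, would cast doubt on the linearity or equal-variances assumption. For a least-squares line with an intercept, the residuals sum to zero up to rounding error.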

It is seen, then, that the total deviation for a particular $y_i$ is equal to the sum of the explained and unexplained deviations. We may write this symbolically as

$$\underbrace{(y_i - \bar{y})}_{\text{total deviation}} = \underbrace{(\hat{y}_i - \bar{y})}_{\text{explained deviation}} + \underbrace{(y_i - \hat{y}_i)}_{\text{unexplained deviation}} \tag{9.4.1}$$
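The decomposition in Equation 9.4.1 can be checked numerically for a single observation. The sketch below uses hypothetical values for the observed value, the predicted value, and the sample mean; the function name `deviations` is likewise an illustrative assumption.

```python
import math

def deviations(y_i, y_hat_i, y_bar):
    """Split one observation's deviation from the mean into its two parts."""
    total = y_i - y_bar            # (y_i - y_bar): total deviation
    explained = y_hat_i - y_bar    # (y_hat_i - y_bar): explained deviation
    unexplained = y_i - y_hat_i    # (y_i - y_hat_i): unexplained deviation (residual)
    return total, explained, unexplained

# Hypothetical values, for illustration only
total, explained, unexplained = deviations(y_i=7.8, y_hat_i=8.01, y_bar=6.02)

# The identity holds for any three numbers, since y_hat_i cancels on the right
print(math.isclose(total, explained + unexplained))
```

The identity is purely algebraic: adding and subtracting $\hat{y}_i$ inside $(y_i - \bar{y})$ yields the two terms on the right, so it holds for every observation regardless of how well the line fits.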
