
X      Y       X      Y
5.9    50      12.1   31
6.1    32      14.1   29
7.0    41      15.0   23
8.2    42

9.8 SOME PRECAUTIONS

Regression and correlation analysis are powerful statistical tools when properly employed. Their inappropriate use, however, can lead only to meaningless results. To aid in the proper use of these techniques, we make the following suggestions:

1. The assumptions underlying regression and correlation analysis should be reviewed carefully before the data are collected. Although it is rare to find that assumptions are met to perfection, practitioners should have some idea about the magnitude of the gap that exists between the data to be analyzed and the assumptions of the proposed model, so that they may decide whether they should choose another model; proceed with the analysis, but use caution in the interpretation of the results; or use the chosen model with confidence.

2. In simple linear regression and correlation analysis, the two variables of interest are measured on the same entity, called the unit of association. If we are interested in the relationship between height and weight, for example, these two measurements are taken on the same individual. It usually does not make sense to speak of the correlation, say, between the heights of one group of individuals and the weights of another group.

3. No matter how strong the indication of a relationship between two variables, it should not be interpreted as one of cause and effect. If, for example, a significant sample correlation coefficient between two variables X and Y is observed, it can mean one of several things:

a. X causes Y.
b. Y causes X.
c. Some third factor, either directly or indirectly, causes both X and Y.
d. An unlikely event has occurred and a large sample correlation coefficient has been generated by chance from a population in which X and Y are, in fact, not correlated.
e. The correlation is purely nonsensical, a situation that may arise when measurements of X and Y are not taken on a common unit of association.
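
Possibility (d) can be made concrete with a small simulation. The following is only a sketch under stated assumptions, not part of the original text: it draws X and Y independently from standard normal distributions (so the population correlation is zero), uses samples of size n = 10, and counts how often the sample correlation coefficient reaches 0.632 in absolute value, which is approximately the two-sided 5 percent critical value of r for 8 degrees of freedom. The function names `pearson_r` and `fraction_large_r`, the sample size, and the number of trials are all illustrative choices.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fraction_large_r(n=10, trials=5000, cutoff=0.632, seed=1):
    """Fraction of samples whose |r| reaches the cutoff when X and Y
    are generated independently (assumed population correlation: 0)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        ys = [rng.gauss(0, 1) for _ in range(n)]  # drawn independently of xs
        if abs(pearson_r(xs, ys)) >= cutoff:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"Share of 'significant' r values by chance: {fraction_large_r():.3f}")
```

Under these assumptions, the printed share should be close to the nominal 0.05: roughly one sample in twenty yields a "significant" correlation coefficient even though X and Y are unrelated in the population.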

4. The sample regression equation should not be used to predict or estimate outside the range of values of the independent variable represented in the sample. As illustrated in Section 9.5, this practice, called extrapolation, is risky. The true relationship between two variables, although linear over an interval of the independent variable, sometimes may be described at best as a curve outside this interval. If
