02.01.2013 Views

Physics for Geologists, Second edition

Some dangers of mathematical statistics

Figure 12.9 contains the results of the first two series of experiments conducted by Darcy, and the series conducted independently by Ritter and reported by Darcy. It is fairly clear that the lines representing the best fit of the data are good, but not perfect, and that the relationship is indeed linear, as Darcy's Equations (12.10, 12.11) indicate. The best check on linearity is to do what Reynolds, of Reynolds number fame, did: plot the logarithms of the data (natural or base 10), or compute the linear regression of the logarithms, and the slope of the line will indicate the order of the association. (Zeros and negative numbers in the data can be eliminated by adding a constant to all the data.) For a relationship of the form y = bx^m, taking logarithms gives ln y = m ln x + ln b. For a linear relationship the slope m should be 1, so that ln y = ln x + ln b. If the slope is far from 1, there is little point in proceeding with the linear regression analysis, but the order of the association will be evident from the logarithmic data. The logarithms of Darcy's data are plotted in Figure 13.1, and it is clear that although the slopes are not all exactly 1 (marked by the dashed line), they are close to 1 and we can regard Darcy's equation as being linear over the range of values he measured. You must never extrapolate beyond the measured range without very good reason.
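
The logarithmic check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `fit_line` and `log_log_slope` are hypothetical, and the data are invented values of known order, not Darcy's or Ritter's measurements.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + m*x; returns (m, a)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x

def log_log_slope(xs, ys):
    """Fit ln y = m ln x + ln b and return (m, b) for y = b * x**m.

    All values must be positive, because the logarithm is undefined otherwise.
    """
    m, ln_b = fit_line([math.log(x) for x in xs], [math.log(y) for y in ys])
    return m, math.exp(ln_b)

# Invented data of known order (assumption: for illustration only).
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
linear = [2.5 * x for x in xs]          # y = 2.5 x, order 1
quadratic = [0.3 * x ** 2 for x in xs]  # y = 0.3 x^2, order 2

m_lin, b_lin = log_log_slope(xs, linear)
m_quad, b_quad = log_log_slope(xs, quadratic)
print(f"linear data:    slope m = {m_lin:.3f}, b = {b_lin:.3f}")
print(f"quadratic data: slope m = {m_quad:.3f}, b = {b_quad:.3f}")
```

A slope near 1 supports a linear relationship, while a slope near 2 points to a second-order one, in which case a straight-line regression of y on x would be misleading.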

The data in Table 13.1 were obtained experimentally. Using a pocket calculator, it is found that the linear regression equation is y = 3x - 3.66 and the correlation coefficient, r, is 0.97. This coefficient is extremely significant. Student's t, which is one method of assessing significance, is 14.0 for seven degrees of freedom (there are nine pairs of data, but two points will always lie on a straight line), whereas there is a probability of about 1 per cent, or 0.01, that t will be 3.5 or larger by chance, and perhaps a million to one that it will be as large as 14 or larger. You might therefore have great confidence that the relationship is linear, and you would be wrong.

There are a few clues. If you subtract the observed values of y from the values calculated from the regression equation, you will see that there is a systematic pattern to the errors. If you plot the data and the regression line, this pattern is evident (Figure 13.2).
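
The residual check can be sketched as follows. The values of Table 13.1 are not reproduced here, so the nine data pairs below are invented (a deliberately non-linear relationship, y = 0.3x^2), and the function name `linear_regression` is a hypothetical helper. The point is only that a high correlation coefficient can coexist with a clearly systematic pattern in the errors.

```python
import math

def linear_regression(xs, ys):
    """Least-squares line y = a + b*x; returns (a, b, r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r

# Invented data (assumption): nine pairs from y = 0.3 x^2, NOT Table 13.1.
xs = list(range(1, 10))
ys = [0.3 * x ** 2 for x in xs]

a, b, r = linear_regression(xs, ys)

# Errors as defined in the text: calculated value minus observed value.
errors = [(a + b * x) - y for x, y in zip(xs, ys)]
signs = "".join("+" if e > 0 else "-" for e in errors)

print(f"regression line: y = {b:.2f}x + ({a:.2f}), r = {r:.3f}")
print("signs of the errors, in order of x:", signs)
```

If the relationship were truly linear, the signs would be scattered irregularly. Here the errors are negative at both ends of the range and positive in the middle, which is the kind of systematic pattern that should raise suspicion even when r is close to 1.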

Copyright 2002 by Richard E. Chapman<br />

Figure 13.1 Darcy's data, plotted as logarithms. The slopes are close to 1 (marked by the dashed line), indicating a linear relationship between the difference of head, Δh, and the discharge, Q. (Axis label: logarithm of flow in litres/minute.)
