01.06.2013 Views

Statistical Methods in Medical Research 4ed


314 Modelling continuous data

Example 11.1

The analysis of variance of y, from the data of Example 7.1 (pp. 192 and 199), is as follows:

                     SSq         DF   MSq       VR
Due to regression     7 666.39    1   7666.39   24.2 (P < 0.001)
About regression      9 502.08   30    316.74    1.00
Total                17 168.47   31

The SSq have already been obtained on pp. 192 and 199. The value of t obtained previously was 4.92; note that $(4.92)^2 = 24.2$, the value of F.
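The arithmetic in the table can be retraced with a short Python sketch. It uses only the sums of squares, degrees of freedom and t value quoted above; the variable names are illustrative and not part of the original text.

```python
# Sums of squares and degrees of freedom as quoted in Example 11.1.
ssq_regression, df_regression = 7666.39, 1
ssq_residual, df_residual = 9502.08, 30

# The Total line is the sum of the two components.
ssq_total = ssq_regression + ssq_residual
df_total = df_regression + df_residual

# Each mean square is SSq / DF; the variance ratio (VR) is their quotient.
msq_regression = ssq_regression / df_regression
msq_residual = ssq_residual / df_residual
variance_ratio = msq_regression / msq_residual

# t statistic quoted from the earlier analysis of the same data.
t_value = 4.92

print(f"Total SSq = {ssq_total:.2f} on {df_total} DF")
print(f"MSq about regression = {msq_residual:.2f}")
print(f"VR = {variance_ratio:.1f}, t squared = {t_value ** 2:.1f}")
```

To one decimal place the variance ratio and the square of t agree, as expected when the regression sum of squares has a single degree of freedom.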

Test of linearity

It is often important to know not only whether the slope of an assumed linear regression is significant, but also whether there is any reason to doubt the basic assumption of the linearity of the regression.

If the data provide a number of replicate readings of y for certain values of x, a test of linearity is easily obtained. Suppose that, at the value $x_i$ of x, there are $n_i$ observations on y, with a mean $\bar{y}_i$. Each such group of replicates is called an array. Figure 11.2 illustrates three different situations. In (a), a linear regression seems to be consistent with the observed data in that the array means $\bar{y}_i$ are reasonably close to the regression line. In (b) and (c), however, the array means deviate from the line by more than can easily be explained by the within-arrays variation. In (b) the deviations seem to be systematic, suggesting that a curved regression line is required. In (c) the deviations seem to lack any pattern, suggesting perhaps an extra source of variation associated with each array; for example, if each array referred to observations on animals in a single cage, the positioning of the cage in the laboratory might affect the whole array.

In discussing Fig. 11.2 we have made a rough comparison between the magnitude of deviations of array means from the regression line and the within-arrays variation. The comparison is made formally as follows. For any value y in the array corresponding to $x_i$, the residual $y - Y_i$ may be divided into two parts:

$$y - Y_i = (y - \bar{y}_i) + (\bar{y}_i - Y_i). \qquad (11.5)$$

When both sides are squared and summed over all observations, the sum of products of the two terms on the right vanishes, and we have a partition of the Residual SSq:

$$\sum (y - Y_i)^2 = \sum (y - \bar{y}_i)^2 + \sum (\bar{y}_i - Y_i)^2, \qquad (11.6)$$
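The partition in (11.6) can be checked numerically. The cross-product term vanishes because, within each array, the deviations of y from the array mean sum to zero while the second factor is constant over the array. The sketch below is only an illustration: the replicate data are hypothetical (they are not the data of Example 7.1) and the variable names are invented for this example.

```python
# Hypothetical replicate data: each x value (an array) has several y readings.
arrays = {
    1.0: [2.1, 2.9, 2.5],
    2.0: [4.2, 3.6],
    3.0: [5.1, 6.3, 5.7, 5.5],
    4.0: [9.0, 8.2, 8.6],
}

# Flatten to one (x, y) pair per individual observation.
xs = [x for x, ys in arrays.items() for _ in ys]
ys_all = [y for ys in arrays.values() for y in ys]
n = len(ys_all)

# Least-squares line Y = a + b * x fitted to all individual observations.
x_mean = sum(xs) / n
y_mean = sum(ys_all) / n
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys_all))
sxx = sum((x - x_mean) ** 2 for x in xs)
b = sxy / sxx
a = y_mean - b * x_mean

residual_ssq = 0.0   # sum of (y - Y_i)^2, the Residual SSq
within_ssq = 0.0     # sum of (y - ybar_i)^2, within arrays
deviation_ssq = 0.0  # sum of (ybar_i - Y_i)^2, counted once per observation
for x, ys in arrays.items():
    fitted = a + b * x               # Y_i, the value on the fitted line at x_i
    array_mean = sum(ys) / len(ys)   # ybar_i, the mean of the array
    for y in ys:
        residual_ssq += (y - fitted) ** 2
        within_ssq += (y - array_mean) ** 2
        deviation_ssq += (array_mean - fitted) ** 2

print("Residual SSq:           ", residual_ssq)
print("Within + deviations SSq:", within_ssq + deviation_ssq)
```

The two printed values should agree up to floating-point rounding. Because every sum runs over all observations, the last term weights each squared deviation of an array mean by the number of readings in that array.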
