01.06.2013 Views

Statistical Methods in Medical Research 4ed


Regression and correlation

on substituting for $b$ from (7.3).

Finally, it can be shown that an unbiased estimator of $\sigma^2$ is

$$
s_0^2 = \frac{\sum (y - Y)^2}{n - 2}, \qquad (7.6)
$$

the residual sum of squares, $\sum (y - Y)^2$, being obtainable from (7.5). The divisor $n - 2$ is often referred to as the residual degrees of freedom, $s_0^2$ as the residual mean square, and $s_0$ as the standard deviation about regression.
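As a minimal sketch of (7.6), the snippet below fits a line to a small made-up data set and computes the residual sum of squares, the residual degrees of freedom, the residual mean square and the standard deviation about regression. The data are hypothetical, not taken from the text. The slope is computed as the sum of products divided by the sum of squares of x, which is assumed here to be the form of (7.3), since that equation is not reproduced on this page.

```python
import math

# Hypothetical illustrative data (not from Table 7.1).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope b and intercept a (assumed form of (7.3)).
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b = s_xy / s_xx
a = y_bar - b * x_bar

# Fitted values Y and the residual sum of squares, sum of (y - Y)^2.
fitted = [a + b * xi for xi in x]
residual_ss = sum((yi - Yi) ** 2 for yi, Yi in zip(y, fitted))

residual_df = n - 2                      # residual degrees of freedom
residual_ms = residual_ss / residual_df  # s0^2, the residual mean square
s0 = math.sqrt(residual_ms)              # standard deviation about regression

print(f"b = {b:.4f}, a = {a:.4f}")
print(f"residual SS = {residual_ss:.4f} on {residual_df} DF")
print(f"s0^2 = {residual_ms:.4f}, s0 = {s0:.4f}")
```

The divisor is n − 2 rather than n because two quantities, a and b, have been estimated from the data before the residuals are formed.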

The quantities $a$ and $b$ are called the regression coefficients; the term is often used particularly for $b$, the slope of the regression line.

The expression in the numerator of (7.3) is the sum of products of deviations of $x$ and $y$ about their means. A short-cut formula analogous to (2.3) is useful for computational work:

$$
\sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}. \qquad (7.7)
$$
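The equivalence of the two sides of (7.7) can be checked numerically. The sketch below uses a small made-up data set (hypothetical values, chosen so that y falls as x rises) and computes the sum of products both directly from the deviations and by the short-cut formula.

```python
# Hypothetical illustrative data: y decreases as x increases.
x = [2.0, 4.0, 5.0, 7.0]
y = [9.0, 6.0, 5.0, 1.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Left-hand side of (7.7): sum of products of deviations about the means.
direct = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Right-hand side of (7.7): short-cut form using raw totals only.
shortcut = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

print(direct, shortcut)  # both are -20.5 for these data
```

With these values the sum of products is negative, which illustrates that, unlike a sum of squares, it is not constrained to be positive or zero.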

Note that, whereas a sum of squares about the mean must be positive or zero, a sum of products of deviations about the mean may be negative, in which case, from (7.3), $b$ will also be negative.

The above theory is illustrated in the following example, which will also be used later in the chapter after further points have been considered. Although the calculations necessary in a simple linear regression are feasible using a scientific calculator, one would usually use either a statistical package on a computer or a calculator with keys for fitting a regression, and the actual calculations would not be a concern.

Example 7.1<br />

Table 7.1 gives the values for 32 babies of $x$, the birth weight, and $y$, the increase in weight between the 70th and 100th day of life expressed as a percentage of the birth weight. A scatter diagram is shown in Fig. 7.4 which suggests an association between the two variables in a negative direction. This seems quite plausible: when the birth weight is low the subsequent rate of growth, relative to the birth weight, would be expected to be high, and vice versa. The trend seems reasonably linear.

From Table 7.1 we proceed as follows:

$$
\begin{aligned}
n &= 32 \\
\sum x &= 3576 & \sum y &= 2281 \\
\bar{x} &= 3576/32 = 111.75 & \bar{y} &= 2281/32 = 71.28 \\
\sum x^2 &= 409\,880 & \sum xy &= 246\,032 & \sum y^2 &= 179\,761 \\
\left(\sum x\right)^2/n &= 399\,618.00 & \left(\sum x\right)\left(\sum y\right)/n &= 254\,901.75 & \left(\sum y\right)^2/n &= 162\,592.53 \\
\sum (x - \bar{x})^2 &= 10\,262.00 & \sum (x - \bar{x})(y - \bar{y}) &= -8869.75 & \sum (y - \bar{y})^2 &= 17\,168.47.
\end{aligned}
$$
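The arithmetic above can be reproduced from the five totals alone. The following sketch starts from n and the sums quoted from Table 7.1 and applies the short-cut formulae to obtain the corrected sums of squares and products. The slope line at the end assumes that (7.3) has the usual least-squares form (sum of products divided by the sum of squares of x), since (7.3) is not reproduced on this page; the variable names are illustrative only.

```python
# Totals quoted in the example (from Table 7.1).
n = 32
sum_x, sum_y = 3576, 2281
sum_x2, sum_xy, sum_y2 = 409880, 246032, 179761

# Means of x and y.
x_bar = sum_x / n
y_bar = sum_y / n

# Correction terms, then sums of squares and products about the means.
s_xx = sum_x2 - sum_x ** 2 / n      # sum of (x - x_bar)^2
s_xy = sum_xy - sum_x * sum_y / n   # sum of (x - x_bar)(y - y_bar), via (7.7)
s_yy = sum_y2 - sum_y ** 2 / n      # sum of (y - y_bar)^2

# Slope under the assumed form of (7.3); it takes the sign of s_xy.
b = s_xy / s_xx

print(f"x_bar = {x_bar:.2f}, y_bar = {y_bar:.2f}")
print(f"Sxx = {s_xx:.2f}, Sxy = {s_xy:.2f}, Syy = {s_yy:.2f}")
print(f"b = {b:.4f}")
```

The sum of products comes out negative, in line with the negative association seen in the scatter diagram, so the slope is negative as well.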
