01.06.2013 Views

Statistical Methods in Medical Research 4ed


Analysing means and proportions

$$\bar{x} \pm 2.58\,\sigma/\sqrt{n}.$$

In general, the $1 - \alpha$ confidence limits are

$$\bar{x} \pm z_{\alpha}\,\sigma/\sqrt{n},$$

where $z_{\alpha}$ is the standardized normal deviate exceeded (in either direction) with probability $\alpha$. (The notation here is not universally standard: in some usages the subscript $\alpha$ refers to either the one-tailed probability, which we write as $\tfrac{1}{2}\alpha$, or the distribution function, $1 - \tfrac{1}{2}\alpha$.)
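As a rough illustration of these limits, the following Python sketch computes $z_{\alpha}$ and the $1 - \alpha$ confidence limits when $\sigma$ is known. The function names `z_alpha` and `confidence_limits` are our own, chosen for illustration, and the numbers in the usage line are hypothetical. Note that $z_{\alpha}$ in the two-sided convention used here is the point of the standard normal distribution function at $1 - \tfrac{1}{2}\alpha$.

```python
import math
from statistics import NormalDist


def z_alpha(alpha):
    """Standardized normal deviate exceeded in either direction with probability alpha."""
    # Two-sided convention: a tail area of alpha/2 lies beyond +z and beyond -z.
    return NormalDist().inv_cdf(1 - alpha / 2)


def confidence_limits(xbar, sigma, n, alpha):
    """Return the (lower, upper) 1 - alpha confidence limits for the mean, sigma known."""
    half_width = z_alpha(alpha) * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical values: sample mean 10, known sigma 2, sample size 16, 95% limits.
lower, upper = confidence_limits(10.0, 2.0, 16, 0.05)
print(lower, upper)
```

Setting `alpha` to 0.01 gives the multiplier of about 2.58 used for 99% limits.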

Unknown $\sigma$: the $t$ distribution

Suppose now that we wish to test a null hypothesis which specifies the mean value of a normal distribution ($\mu = \mu_0$) but does not specify the variance $\sigma^2$, and that we have no evidence about $\sigma^2$ besides that contained in our sample. The procedure outlined above cannot be followed because the standard error of the mean, $\sigma/\sqrt{n}$, cannot be calculated. It seems reasonable to replace $\sigma$ by the estimated standard deviation in the sample, $s$, giving a standardized deviate

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \tag{4.6}$$

instead of the normal deviate $z$ given by (4.3). The statistic $t$ would be expected to follow a sampling distribution close to that of $z$ (i.e. close to a standard normal distribution with mean 0 and variance 1) when $n$ is large, because then $s$ will be a good approximation to $\sigma$. When $n$ is small, $s$ may differ considerably from $\sigma$, purely by chance, and this will cause $t$ to have substantially greater random variability than $z$.
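The calculation in (4.6) can be sketched in a few lines of Python. This is a minimal illustration only: the function name `t_statistic` and the sample values are hypothetical, and $s$ is taken to be the usual sample standard deviation with divisor $n - 1$.

```python
import math
import statistics


def t_statistic(sample, mu0):
    """Standardized deviate t = (xbar - mu0) / (s / sqrt(n)), as in (4.6)."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation, divisor n - 1
    return (xbar - mu0) / (s / math.sqrt(n))


# Hypothetical measurements, tested against a null-hypothesis mean of 5.0.
sample = [4.9, 5.6, 5.1, 6.0, 5.4, 5.8]
print(t_statistic(sample, 5.0))
```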

In fact, $t$ follows what is known as the $t$ distribution on $n - 1$ degrees of freedom (DF). The $t$ distributions form a family, distinguished by an index, the 'degrees of freedom', which in the present application is one less than the sample size. As the degrees of freedom increase, the $t$ distribution tends towards the standard normal distribution (Fig. 4.7). Appendix Table A3 shows the percentiles of $t$, i.e. the values exceeded with specified probabilities, for different values of the degrees of freedom, $\nu$. For $\nu = \infty$, the tabulated values agree with those of the standard normal distribution. The 5% point, which always exceeds the normal value of 1.960, is nevertheless close to 2.0 for all except quite small values of $\nu$. The $t$ distribution was derived by W.S. Gosset (1876–1937) and published under the pseudonym of 'Student' in 1908; the distribution is frequently referred to as Student's $t$ distribution.
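The way the two-sided 5% point of $t$ approaches the normal value can be checked numerically. The sketch below is not how the published table was produced; it assumes nothing beyond the standard density of the $t$ distribution, integrates it by Simpson's rule and inverts by bisection. The function names (`t_density`, `two_sided_tail`, `t_point`) and the step counts are our own choices. A statistical library such as SciPy (`scipy.stats.t.ppf`) would normally be used instead.

```python
import math


def t_density(t, df):
    """Density of the t distribution on df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def two_sided_tail(t, df, steps=2000):
    """P(|T| > t), using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central = 2 * total * h / 3  # P(-t < T < t), by symmetry about zero
    return 1 - central


def t_point(df, alpha=0.05):
    """Value of t exceeded in either direction with probability alpha."""
    lo, hi = 0.0, 1.0
    while two_sided_tail(hi, df) > alpha:  # expand until the point is bracketed
        lo, hi = hi, hi * 2
    for _ in range(50):  # bisection
        mid = (lo + hi) / 2
        if two_sided_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for df in (2, 5, 10, 30, 120):
    print(df, round(t_point(df), 3))
```

The printed values should decrease towards 1.960 as the degrees of freedom increase.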

The $t$ distribution is strictly valid only if the distribution of $x$ is normal. Nevertheless, it is reasonably robust in the sense that it is approximately valid for quite marked departures from normality.
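One informal way to explore this robustness is by simulation. The sketch below is a rough check under stated assumptions rather than a proof: it draws repeated samples of size 10 from a normal and from a uniform distribution (the uniform being our own choice of a non-normal example), applies the $t$ statistic with the null hypothesis true, and records how often $|t|$ exceeds 2.262, the tabulated two-sided 5% point on 9 DF. The function name `rejection_rate`, the number of repetitions and the seed are all hypothetical choices.

```python
import math
import random

T_CRIT_9DF = 2.262  # two-sided 5% point of t on 9 DF, from standard tables


def rejection_rate(draw, true_mean, n=10, reps=10000, seed=1):
    """Proportion of simulated samples with |t| beyond the 5% point when H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        xbar = math.fsum(sample) / n
        s2 = math.fsum((x - xbar) ** 2 for x in sample) / (n - 1)
        t = (xbar - true_mean) / math.sqrt(s2 / n)
        if abs(t) > T_CRIT_9DF:
            rejections += 1
    return rejections / reps


normal_rate = rejection_rate(lambda rng: rng.gauss(0.0, 1.0), 0.0)
uniform_rate = rejection_rate(lambda rng: rng.uniform(0.0, 1.0), 0.5)
print(normal_rate, uniform_rate)
```

Rates near 0.05 for both distributions would be consistent with the statement above. More strongly skewed distributions and very small samples can be expected to show larger discrepancies.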
