01.06.2013 Views

Statistical Methods in Medical Research 4ed


Apart from the increased precision in the estimation of the population mean, stratification may be adopted to provide reasonably precise estimates of the means for each of the strata. This may lead to departures from the optimal allocation described above, so that none of the sample sizes in the separate strata becomes too small.

Example 19.1

It is desired to estimate the prevalence of a certain condition (i.e. the proportion of affected individuals) in a population of 5000 people by taking a sample of size 100. Suppose that the prevalence is known to be associated with age and that the population can be divided into three strata defined by age, with the following numbers of individuals:

Stratum, i    Age (years)    N_i
1             0–14           1200
2             15–44          2200
3             45–            1600
Total                        5000

Suppose that the true prevalences, $\pi_i$, in the different strata are as follows:

Stratum, i    $\pi_i$
1             0.02
2             0.08
3             0.15

19.2 The planning of surveys

It is easily verified that the overall prevalence, $\pi$ ($= \sum N_i \pi_i / \sum N_i$), is 0.088. A simple random sample of size $n = 100$ would therefore give an estimate $p$ with variance $(0.088)(0.912)/100 = 0.000803$.
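The arithmetic in this step can be checked with a minimal Python sketch. The stratum sizes and true prevalences are those of Example 19.1; the function and variable names are illustrative assumptions, and the variance is the simple binomial form used in the text, with no finite population correction.

```python
# Stratum sizes and true prevalences from Example 19.1.
stratum_sizes = [1200, 2200, 1600]
true_prevalences = [0.02, 0.08, 0.15]


def overall_prevalence(sizes, prevalences):
    """Mean of the stratum prevalences, weighted by stratum size."""
    return sum(n * p for n, p in zip(sizes, prevalences)) / sum(sizes)


def srs_variance(prevalence, sample_size):
    """Binomial variance of a sample proportion: p(1 - p)/n.

    Assumption: no finite population correction, as in the text.
    """
    return prevalence * (1 - prevalence) / sample_size


p_overall = overall_prevalence(stratum_sizes, true_prevalences)
var_srs = srs_variance(p_overall, 100)

print(round(p_overall, 3))  # overall prevalence, three decimal places
print(round(var_srs, 6))    # variance under simple random sampling
```

The two printed values correspond to the overall prevalence and the simple random sampling variance quoted in the example.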

For stratified sampling with optimal allocation the sample sizes in the strata should be chosen in proportion to $N_i \sqrt{\pi_i (1 - \pi_i)}$. The values of this quantity in the three strata are 168, 597 and 571, which are in the proportions 0.126, 0.447 and 0.427, respectively. The optimal sample sizes are, therefore, $n_1 = 12$, $n_2 = 45$ and $n_3 = 43$, or numbers very close to these (there being a little doubt about the effect of rounding to the nearest integer). Let us call this allocation A. This depends on the unknown $\pi_i$, and therefore could hardly be used in practice. If we knew very little about the likely variation in the $\pi_i$, we might choose the $n_i \propto N_i$, ignoring the effect of the changing standard deviation. This would give, for allocation B, $n_1 = 24$, $n_2 = 44$ and $n_3 = 32$. Thirdly, we might have some idea that the prevalence (and therefore the standard deviation) increased with age, and therefore adjust the allocation rather arbitrarily to give, say, $n_1 = 20$, $n_2 = 40$ and $n_3 = 40$ (allocation C).
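The allocation weights for A and B can be reproduced with the following Python sketch. The names are illustrative assumptions. The shares for allocation A are computed from the weights rounded to whole numbers, on the assumption that the text's proportions were obtained that way; working from unrounded weights can shift the third decimal place.

```python
from math import sqrt

# Stratum sizes and true prevalences from Example 19.1.
stratum_sizes = [1200, 2200, 1600]
true_prevalences = [0.02, 0.08, 0.15]
total_sample = 100


def shares(weights):
    """Scale a list of weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


# Allocation A: weights proportional to N_i * sqrt(pi_i * (1 - pi_i)).
optimal_weights = [
    round(size * sqrt(p * (1 - p)))
    for size, p in zip(stratum_sizes, true_prevalences)
]
optimal_shares = [round(s, 3) for s in shares(optimal_weights)]

# Unrounded sample sizes implied by allocation A; they must still be
# adjusted by hand so that the integer sizes sum to the total sample.
optimal_sizes_raw = [total_sample * s for s in optimal_shares]

# Allocation B: sample sizes proportional to the stratum sizes N_i.
proportional_sizes = [
    round(total_sample * s) for s in shares(stratum_sizes)
]

print(optimal_weights)
print(optimal_shares)
print(proportional_sizes)
```

The unrounded sizes in `optimal_sizes_raw` are not whole numbers, which is the rounding doubt the example mentions: the integer allocation has to be chosen so that the three sizes total 100.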

The estimate $\hat{p}$ is, in each case, given by the formula equivalent to (19.1),
