01.06.2013 Views

Statistical Methods in Medical Research 4ed


sample of permutations. If the null hypothesis is true, then the number of permutations that will result in a statistic that is more extreme than the observed value will have a binomial distribution with parameters $N_R$ and $P_{perm}$. Consequently there is, approximately, a 95% probability that $P_R$ lies within $\pm 2\sqrt{\{P_{perm}(1 - P_{perm})\}/N_R}$ of $P_{perm}$.
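As an illustration of the estimate $P_R$ and its approximate 95% band, the following is a minimal sketch, not a definitive implementation. The function name `monte_carlo_perm_test`, the choice of statistic (absolute difference in group means) and the convention of counting ties as extreme are all assumptions made for the example. Because $P_{perm}$ is unknown in practice, the sketch plugs $P_R$ into the formula for the half-width.

```python
import math
import random


def monte_carlo_perm_test(x, y, n_r=2000, seed=0):
    """Estimate a permutation p-value from n_r random permutations.

    Returns (p_r, half_width), where half_width is the approximate
    2-standard-error band 2 * sqrt(p_r * (1 - p_r) / n_r).
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x = len(x)

    def stat(values):
        # Hypothetical test statistic: absolute difference in group means.
        a, b = values[:n_x], values[n_x:]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    g_obs = stat(pooled)  # computed before any shuffling

    count = 0
    for _ in range(n_r):
        rng.shuffle(pooled)  # random relabelling of the pooled data
        # Assumption: ties with the observed value are counted as extreme.
        if stat(pooled) >= g_obs:
            count += 1

    p_r = count / n_r
    # P_perm is unknown, so P_R is substituted into the binomial formula.
    half_width = 2 * math.sqrt(p_r * (1 - p_r) / n_r)
    return p_r, half_width
```

A call such as `monte_carlo_perm_test(treated, control, n_r=5000)` (with `treated` and `control` being hypothetical lists of measurements) returns the estimated significance level together with the half-width of the band around it.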

10.6 Permutation and Monte Carlo tests

As there is usually greater interest in smaller significance levels, it is sensible to try to ensure that smaller values of $P_{perm}$ are estimated with greater absolute precision. Sensible values of $N_R$ might be found by requiring the coefficient of variation of $P_R$, which is

$$\sqrt{\frac{1 - P_{perm}}{N_R P_{perm}}}, \qquad (10.14)$$

to be less than some prescribed value, such as 0.05 or 0.1. To obtain a coefficient of variation of 0.1 when $P_{perm} = 0.5$ requires $N_R = 100$, but this rises to $N_R = 1900$ when $P_{perm} = 0.05$. This is intuitively reasonable: many more simulations will be required to determine, with given relative precision, the probability of a rare event than the probability of a more common event.
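The arithmetic behind these two values of $N_R$ can be checked by solving (10.14) for $N_R$, which gives $N_R = (1 - P_{perm})/(P_{perm}\,c^2)$ for a target coefficient of variation $c$. The sketch below is only an illustration; the function names `cv_of_estimate` and `required_permutations` are hypothetical, and the rounding step is an assumption added to guard against floating-point noise.

```python
import math


def cv_of_estimate(p_perm, n_r):
    """Coefficient of variation of P_R, as in (10.14)."""
    return math.sqrt((1 - p_perm) / (n_r * p_perm))


def required_permutations(p_perm, cv):
    """Smallest N_R whose coefficient of variation is at most cv."""
    if not 0 < p_perm <= 1:
        raise ValueError("p_perm must lie in (0, 1]")
    if cv <= 0:
        raise ValueError("cv must be positive")
    n_r = (1 - p_perm) / (p_perm * cv ** 2)
    # Round first so that 99.99999999999998 is treated as 100, not 101.
    return math.ceil(round(n_r, 9))


print(required_permutations(0.5, 0.1))   # 100
print(required_permutations(0.05, 0.1))  # 1900
```

Halving the target coefficient of variation quadruples the number of permutations, since $N_R$ varies as $1/c^2$.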

Determining $N_R$ from (10.14) requires an estimate of $P_{perm}$: a conservative approach would be to assume a small value and accept that if this is untrue some unnecessary samples will be drawn. In many applications several thousand permutations can be sampled very rapidly, so this is often acceptable. An alternative, which can be very useful when evaluation of each statistic is time-consuming, is to adopt a sequential approach. The essential idea is that if the observed statistic is not extreme then it is likely that a high proportion of the first few random permutations will result in values of the statistic that are more extreme than $G_{obs}$. For example, if 40 of the first 100 random permutations gave more extreme statistics than the observed value, then it would be clear that the test will not reject the null hypothesis. There would be no point in continuing to sample permutations in order to obtain a more precise estimate of a value which holds little further interest. On the other hand, if the observed statistic is extreme, then it will be worthwhile continuing sampling until a sufficiently precise estimate of $P_{perm}$ is obtained. These informal ideas are put on a firm foundation by Besag and Clifford (1991), who suggest that sampling continues until $h$ values of the statistic more extreme than the observed value are obtained. If the number of samples needed to achieve this is $l$ then the estimate of $P_{perm}$ is $h/l$. These authors suggest values for $h$ of 10 or 20, although smaller values could be used if the sampling and evaluations were particularly onerous.
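The sequential stopping rule just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `sequential_perm_pvalue`, the test statistic, the treatment of ties as extreme, and the safety cap `max_samples` are all introduced here for the example. The text does not say what to report if the cap is reached, so the sketch simply returns the plain proportion of extreme values along with a flag.

```python
import random


def sequential_perm_pvalue(x, y, h=10, max_samples=10000, seed=0):
    """Sample permutations until h of them are at least as extreme as observed.

    Returns (estimate, l, reached_h). When reached_h is True the estimate
    is h / l, where l is the number of permutations drawn.
    """
    if h < 1 or max_samples < 1:
        raise ValueError("h and max_samples must be positive")

    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x = len(x)

    def stat(values):
        # Hypothetical test statistic: absolute difference in group means.
        a, b = values[:n_x], values[n_x:]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    g_obs = stat(pooled)  # computed before any shuffling

    exceed = 0
    for l in range(1, max_samples + 1):
        rng.shuffle(pooled)
        # Assumption: ties with the observed value are counted as extreme.
        if stat(pooled) >= g_obs:
            exceed += 1
            if exceed == h:
                return h / l, l, True  # stopping rule met after l samples

    # Assumption: the cap is not part of the description above; if it is
    # hit, report the ordinary Monte Carlo proportion instead of h / l.
    return exceed / max_samples, max_samples, False
```

When the observed statistic is unremarkable the loop stops after only a few dozen permutations, whereas an extreme observed statistic keeps the sampling going, which is the behaviour the sequential approach is designed to produce.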

Monte Carlo significance levels are necessarily discrete: non-sequential values being restricted to the set $0/N_R, 1/N_R, \ldots, (N_R - 1)/N_R, 1$. However, for practical purposes $N_R$ will generally be sufficiently large to ensure that this is of no practical importance. An advantage of the sequential approach is that significance levels obtained in this way are restricted to the set 1,
