01.06.2013 Views

Statistical Methods in Medical Research 4ed


respectively. For a $100(1 - 2\alpha)\%$ interval the limits would be the $\alpha B$th and $(1 - \alpha)B$th largest values. This is known as the percentile method.
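The percentile method can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `bootstrap_values` and `percentile_interval` are hypothetical, and it assumes that "largest values" refers to ranks in the bootstrap values sorted in ascending order, with $\alpha B$ rounded to the nearest whole rank.

```python
import random


def bootstrap_values(sample, statistic, B, seed=0):
    """Draw B bootstrap samples (with replacement, same size as `sample`)
    and return the statistic computed on each one."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(B)]


def percentile_interval(boot_values, alpha):
    """Return the (alpha*B)th and ((1-alpha)*B)th ordered bootstrap values.

    Ranks are 1-based positions in ascending order; rounding to the nearest
    rank is an assumption for the case where alpha*B is not an integer.
    """
    ordered = sorted(boot_values)
    B = len(ordered)
    lower_rank = max(1, round(alpha * B))
    upper_rank = min(B, round((1 - alpha) * B))
    return ordered[lower_rank - 1], ordered[upper_rank - 1]
```

With $B = 2000$ and $\alpha = 0.025$ the ranks are 50 and 1950, so `percentile_interval(values, 0.025)` returns the 50th and 1950th ordered values as the limits of a 95% interval.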

It turns out that this kind of interval does not perform all that well: for example, the coverage probability can stray from the nominal value. A refinement of the method, known as the bias-corrected and accelerated method, BCa for short, has better properties. The method still uses the ordered set of the $g_1, g_2, \ldots, g_B$ but chooses the $\alpha_1 B$th and $\alpha_2 B$th largest values for the limits of the interval. The values $\alpha_1$ and $\alpha_2$ are defined as

$$\alpha_1 = \Phi\left( w + \frac{w + z^{(\alpha)}}{1 - a\,(w + z^{(\alpha)})} \right) \tag{10.15}$$

$$\alpha_2 = \Phi\left( w + \frac{w + z^{(1-\alpha)}}{1 - a\,(w + z^{(1-\alpha)})} \right) \tag{10.16}$$

and here $z^{(j)}$ is determined by $j = \Phi(z^{(j)})$, where $\Phi$ is the standard normal distribution function. Note that this differs from the definition of a quantile of the standard normal variable, $z_j$, given in §4.2: the two are related by $z_{2j} = -z^{(j)}$, $j < \tfrac{1}{2}$. The quantity $w$ corrects for bias in the distribution of the statistic. The quantity $a$, known as the acceleration, is related to the behaviour of the variance of the statistic: further details can be found in Efron and Tibshirani (1993, Chapters 14 and 22) and Davison and Hinkley (1997, Chapter 5). If $w$ and $a$ are both zero, then $\alpha_1 = \Phi(z^{(\alpha)}) = \alpha$ and $\alpha_2 = 1 - \alpha$, and the BCa method reduces to the percentile method.
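The adjusted levels in (10.15) and (10.16) can be evaluated directly once $w$ and $a$ are available. The following is a small sketch under stated assumptions: the function name `bca_levels` is hypothetical, and Python's `statistics.NormalDist` stands in for $\Phi$ and its inverse.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def bca_levels(alpha, w, a):
    """Return (alpha_1, alpha_2) from equations (10.15) and (10.16).

    w is the bias correction and a the acceleration. The formula breaks
    down if a * (w + z) equals 1, which this sketch does not guard against.
    """
    def adjusted(j):
        z = _STD_NORMAL.inv_cdf(j)  # z^(j), defined by j = Phi(z^(j))
        return _STD_NORMAL.cdf(w + (w + z) / (1 - a * (w + z)))

    return adjusted(alpha), adjusted(1 - alpha)


# With w = a = 0 the levels are alpha and 1 - alpha (up to floating-point
# rounding), which is the percentile method.
alpha_1, alpha_2 = bca_levels(0.025, w=0.0, a=0.0)
```

A negative $w$ with $a = 0$ moves both levels downwards, so both limits are taken from lower in the ordered bootstrap values than under the percentile method.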

The estimate of $w$ is easily obtained: $\Phi(w)$ is the proportion of the bootstrap values of $g$ that are less than the observed value. The estimation of $a$ is potentially more complicated: for a single sample of size $n$, suppose that $g_{-i}$ is the value of $g$ obtained from the sample with the $i$th point omitted. If $\bar{g}$ is the mean of $g_{-1}, g_{-2}, \ldots, g_{-n}$ and

$$M_k = \sum_{i=1}^{n} (\bar{g} - g_{-i})^k,$$

then $a = M_3 / (6 M_2^{3/2})$. For more complicated data structures, Davison and Hinkley (1997, Chapter 5) should be consulted.
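For a single sample, both estimates can be sketched as follows. The function names `bias_correction` and `acceleration` are hypothetical, the sample is assumed to be a Python list, and the leave-one-out values are computed by simply recomputing the statistic $n$ times.

```python
from statistics import NormalDist, mean


def bias_correction(boot_values, observed):
    """Estimate w from Phi(w) = proportion of bootstrap values below the
    observed value. The proportion must lie strictly between 0 and 1."""
    proportion = sum(v < observed for v in boot_values) / len(boot_values)
    return NormalDist().inv_cdf(proportion)


def acceleration(sample, statistic):
    """Estimate a = M3 / (6 * M2**1.5) from the leave-one-out values.

    Raises ZeroDivisionError if all leave-one-out values are equal (M2 = 0).
    """
    n = len(sample)
    # Value of the statistic with the i-th point omitted, for each i.
    leave_one_out = [statistic(sample[:i] + sample[i + 1:]) for i in range(n)]
    centre = mean(leave_one_out)
    m2 = sum((centre - g) ** 2 for g in leave_one_out)
    m3 = sum((centre - g) ** 3 for g in leave_one_out)
    return m3 / (6 * m2 ** 1.5)
```

If the leave-one-out values are symmetric about their mean, $M_3 = 0$ and so $a = 0$; the acceleration is non-zero only when they are skewed.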

10.7 The bootstrap and the jackknife

Example 10.9, continued

A 95% confidence interval for $P(X < Y)$ can be found by generating 2000 bootstrap samples and for each one computing the value of $D$. The 50th and 1950th largest values are 0.546 and 0.808 and these define the 95% percentile bootstrap confidence interval for $P(X < Y)$.

The BCa interval can also be computed. The proportion of the 2000 values of $D$ that are less than the observed value, 0.678, is 0.48, giving $w = -0.0502$. Computation of $a$ is
