
CHAPTER 14  Bootstrap Methods and Permutation Tests

23 requests from customers of a CLEC during the same time period. The distributions are both far from normal. Here are some summary statistics:

Service provider      n      x̄       s
Verizon            1664    8.4    14.7
CLEC                 23   16.5    19.5
Difference                −8.1

The data suggest that repair times may be longer for CLEC customers. The mean repair time, for example, is almost twice as long for CLEC customers as for Verizon customers.

In the setting of Example 14.6 we want to estimate the difference of population means, µ₁ − µ₂. We are reluctant to use the two-sample t confidence interval because one of the samples is both small and very skewed. To compute the bootstrap standard error for the difference in sample means x̄₁ − x̄₂, resample separately from the two samples. Each of our 1000 resamples consists of two group resamples, one of size 1664 drawn with replacement from the Verizon data and one of size 23 drawn with replacement from the CLEC data. For each combined resample, compute the statistic x̄₁ − x̄₂. The 1000 differences form the bootstrap distribution. The bootstrap standard error is the standard deviation of the bootstrap distribution.
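The resampling scheme just described is easy to carry out directly. Here is a minimal sketch in Python with NumPy (the book's own software is S-PLUS); the arrays verizon and clec are assumed to hold the 1664 Verizon and 23 CLEC repair times, and the seed is arbitrary.

import numpy as np

rng = np.random.default_rng(12345)  # arbitrary seed for reproducibility

def bootstrap_mean_diff(verizon, clec, B=1000):
    # Bootstrap distribution of x̄1 - x̄2, resampling each group separately.
    diffs = np.empty(B)
    for b in range(B):
        # one group resample from each sample, drawn with replacement
        v_star = rng.choice(verizon, size=len(verizon), replace=True)
        c_star = rng.choice(clec, size=len(clec), replace=True)
        diffs[b] = v_star.mean() - c_star.mean()
    return diffs

# diffs = bootstrap_mean_diff(verizon, clec)
# se_boot = diffs.std(ddof=1)   # bootstrap standard error of x̄1 - x̄2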

S-PLUS automates the proper bootstrap procedure. Here is some of the S-PLUS output:

Number of Replications: 1000

Summary Statistics:
          Observed     Mean     Bias      SE
meanDiff    -8.098   -8.251  -0.1534   4.052
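For reference, these columns are simple summaries of the bootstrap replicates: Observed is the statistic computed from the original samples, Mean is the mean of the bootstrap distribution, Bias is the difference Mean − Observed, and SE is the standard deviation of the bootstrap distribution. Continuing the hypothetical sketch above:

observed = verizon.mean() - clec.mean()   # statistic on the original samples
boot_mean = diffs.mean()                  # mean of the bootstrap distribution
bias = boot_mean - observed               # bootstrap estimate of bias
se = diffs.std(ddof=1)                    # bootstrap standard error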

CAUTION<br />

Figure 14.9 shows that the bootstrap distribution is not close to normal. It has a short right tail and a long left tail, so that it is skewed to the left. Because the bootstrap distribution is nonnormal, we can't trust the bootstrap t confidence interval. When the sampling distribution is nonnormal, no method based on normality is safe. Fortunately, there are more general ways of using the bootstrap to get confidence intervals that can be safely applied when the bootstrap distribution is not normal. These methods, which we discuss in Section 14.4, are the next step in practical use of the bootstrap.

BEYOND THE BASICS

The bootstrap for a scatterplot smoother

The bootstrap idea can be applied to quite complicated statistical methods, such as the scatterplot smoother illustrated in Chapter 2 (page 110).
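As a rough illustration of the idea (a sketch, not the book's example), one can resample the (x, y) pairs and refit the smoother to each resample; overlaying the refitted curves shows how variable the smooth is. The snippet below uses the lowess smoother from the statsmodels Python package, with hypothetical arrays x and y.

import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(0)

def bootstrap_smooths(x, y, B=200, frac=0.5):
    # Refit a lowess curve to B case resamples of the (x, y) pairs.
    n = len(x)
    curves = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)                  # resample cases with replacement
        curves.append(lowess(y[idx], x[idx], frac=frac))  # sorted (x, smoothed y) pairs
    return curves

Plotting all B refitted curves on one scatterplot gives a visual sense of the smoother's sampling variability.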
