Chapter 14 - Bootstrap Methods and Permutation Tests - WH Freeman
CAUTION
because the details of producing the confidence intervals are quite technical. 10
The BCa method requires more than 1000 resamples for high accuracy. Use
5000 or more resamples if the accuracy of inference is very important. Tilting
is more efficient, so that 1000 resamples are generally enough. Don’t forget
that even BCa and tilting confidence intervals should be used cautiously when
sample sizes are small, because there are not enough data to accurately determine
the necessary corrections for bias and skewness.
EXAMPLE 14.10
The 2002 Seattle real estate sales data are strongly skewed (Figure 14.6).
Figure 14.17 shows the bootstrap distribution of the sample mean x̄. We see
that the skewness persists in the bootstrap distribution and therefore in the
sampling distribution. Inference based on a normal sampling distribution
is not appropriate.
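The claim that skewness persists in the bootstrap distribution can be checked numerically. The following is a minimal sketch, not the software used for Figure 14.17: it draws a hypothetical right-skewed sample from a lognormal distribution (the Seattle data are not reproduced here), resamples it 5000 times, and compares a moment-based skewness of the sample with that of the resample means. The function names `bootstrap_means` and `skewness` are illustrative assumptions.

```python
import random
import statistics

def bootstrap_means(data, n_resamples=5000, seed=0):
    """Return the means of n_resamples resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)]

def skewness(values):
    """Moment-based skewness: mean cubed deviation divided by sd cubed."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([(v - m) ** 3 for v in values]) / sd ** 3

# Hypothetical right-skewed sample (lognormal), NOT the Seattle sales data.
rng = random.Random(1)
sample = [rng.lognormvariate(5.5, 1.0) for _ in range(50)]

boot = bootstrap_means(sample)
print("skewness of sample:         ", round(skewness(sample), 2))
print("skewness of resample means: ", round(skewness(boot), 2))
```

Averaging reduces skewness, so the resample means are expected to be less skewed than the data, but for a strongly skewed sample of moderate size the skewness typically remains clearly positive.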
We generally prefer resistant measures of center such as the median or trimmed
mean for skewed data. Accordingly, in Example 14.5 (page 14-18) we bootstrapped
the 25% trimmed mean. However, the mean is easily understood by the public and is
needed for some purposes, such as projecting taxes based on total sales value.
The bootstrap t and percentile intervals aren’t reliable when the sampling distribution
of the statistic is skewed. Figure 14.18 shows software output that includes all four
of the confidence intervals we have mentioned, along with the traditional one-sample
t interval. The BCa interval is
(329.3 − 62.2, 329.3 + 127.0) = (267.1, 456.3)
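Note that the BCa interval reaches about twice as far above the sample mean (127.0) as below it (62.2). To illustrate how such an asymmetric interval can arise, here is a minimal sketch of the percentile and BCa calculations using only the Python standard library. It is a simplified textbook-style version, not the software behind Figure 14.18: the function names, the simple quantile rule, and the lognormal test data are all assumptions made for illustration, and the code assumes the data are not all identical.

```python
import random
from statistics import NormalDist, fmean

def percentile_point(sorted_vals, p):
    """Simple empirical quantile: the value at position p in a sorted list."""
    idx = min(len(sorted_vals) - 1, max(0, int(p * len(sorted_vals))))
    return sorted_vals[idx]

def bootstrap_intervals(data, stat=fmean, n_resamples=5000, level=0.95, seed=0):
    """Return (estimate, percentile interval, BCa interval) for stat(data)."""
    rng = random.Random(seed)
    n = len(data)
    est = stat(data)
    boot = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))

    lo_p, hi_p = (1 - level) / 2, (1 + level) / 2
    percentile = (percentile_point(boot, lo_p), percentile_point(boot, hi_p))

    norm = NormalDist()
    # Bias correction: based on the share of resample statistics below the estimate.
    z0 = norm.inv_cdf(sum(b < est for b in boot) / n_resamples)
    # Acceleration: estimated from leave-one-out (jackknife) values of the statistic.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jbar = fmean(jack)
    a = sum((jbar - j) ** 3 for j in jack) / (
        6 * sum((jbar - j) ** 2 for j in jack) ** 1.5
    )

    def adjusted(p):
        """Shift the nominal tail probability p using z0 and a."""
        z = norm.inv_cdf(p)
        return norm.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))

    bca = (percentile_point(boot, adjusted(lo_p)),
           percentile_point(boot, adjusted(hi_p)))
    return est, percentile, bca

# Hypothetical right-skewed sample (lognormal), NOT the Seattle sales data.
rng = random.Random(1)
sample = [rng.lognormvariate(5.5, 1.0) for _ in range(50)]

est, pct, bca = bootstrap_intervals(sample)
print("estimate:  ", round(est, 1))
print("percentile:", tuple(round(v, 1) for v in pct))
print("BCa:       ", tuple(round(v, 1) for v in bca))
print("BCa offsets below/above:", round(est - bca[0], 1), round(bca[1] - est, 1))
```

For right-skewed data both corrections are usually positive when the statistic is the mean, so the BCa endpoints tend to sit to the right of the percentile endpoints. The default of 5000 resamples follows the recommendation in the text for BCa intervals.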
FIGURE 14.17 The bootstrap distribution of the sample means of 5000 resamples from
the data in Table 14.1, for Example 14.10. The bootstrap distribution is right-skewed, so we
conclude that the sampling distribution of x̄ is right-skewed as well.
(Horizontal axis: resample means, $1000s, from 200 to 500; legend entries: Observed, Mean.)