01.04.2014 Views

Chapter 14 - Bootstrap Methods and Permutation Tests - WH Freeman


14-40 CHAPTER 14 Bootstrap Methods and Permutation Tests

because the details of producing the confidence intervals are quite technical (note 10). The BCa method requires more than 1000 resamples for high accuracy. Use 5000 or more resamples if the accuracy of inference is very important. Tilting is more efficient, so that 1000 resamples are generally enough.

CAUTION: Don’t forget that even BCa and tilting confidence intervals should be used cautiously when sample sizes are small, because there are not enough data to accurately determine the necessary corrections for bias and skewness.
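
To make the corrections for bias and skewness more concrete, here is a minimal Python sketch of one common way to compute a BCa interval. It is an illustration under assumptions, not the software used for the figures in this chapter: the function name `bca_interval`, its parameters, and the sample data are hypothetical. The bias correction is estimated from the share of resample statistics that fall below the observed statistic, and the acceleration from leave-one-out (jackknife) values.

```python
import random
from statistics import NormalDist, fmean


def bca_interval(data, stat=fmean, n_resamples=5000, level=0.95, seed=0):
    """Sketch of a BCa bootstrap confidence interval for stat(data)."""
    rng = random.Random(seed)
    data = list(data)
    n = len(data)
    theta_hat = stat(data)

    # Bootstrap distribution: the statistic on resamples drawn with replacement.
    boot = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))

    # Bias correction z0: normal quantile of the share of resample
    # statistics that fall below the observed statistic.
    std_normal = NormalDist()
    below = sum(b < theta_hat for b in boot)
    # Clamp the share so inv_cdf never receives exactly 0 or 1.
    p0 = min(max(below / n_resamples, 1 / (n_resamples + 1)),
             n_resamples / (n_resamples + 1))
    z0 = std_normal.inv_cdf(p0)

    # Acceleration a: estimated from leave-one-out (jackknife) statistics.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = fmean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    a = num / den if den else 0.0

    def adjusted_quantile(alpha):
        # Shift the nominal percentile using z0 and a, then read it
        # off the sorted bootstrap distribution.
        z = std_normal.inv_cdf(alpha)
        adj = std_normal.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))
        idx = min(max(int(adj * n_resamples), 0), n_resamples - 1)
        return boot[idx]

    alpha = (1 - level) / 2
    return adjusted_quantile(alpha), adjusted_quantile(1 - alpha)


# Hypothetical right-skewed sample (not the Seattle data).
demo_rng = random.Random(42)
sample = [demo_rng.expovariate(1 / 300) for _ in range(40)]
low, high = bca_interval(sample, n_resamples=5000)
print(f"mean = {fmean(sample):.1f}, 95% BCa interval = ({low:.1f}, {high:.1f})")
```

Rerunning with a different `seed` and a small `n_resamples` shows how much the endpoints move from run to run, which is one informal way to see why more resamples are advised when accuracy matters.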

EXAMPLE 14.10

The 2002 Seattle real estate sales data are strongly skewed (Figure 14.6). Figure 14.17 shows the bootstrap distribution of the sample mean x̄. We see that the skewness persists in the bootstrap distribution and therefore in the sampling distribution. Inference based on a normal sampling distribution is not appropriate.
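
The check described here can be sketched in a few lines of Python. This is only an illustration under assumptions: the `prices` list is made-up data in $1000s (the Seattle data are not reproduced here), and the helper names `bootstrap_means` and `skewness` are hypothetical. The sketch resamples with replacement, records each resample mean, and measures the skewness of the resulting bootstrap distribution.

```python
import random
from statistics import fmean, pstdev


def bootstrap_means(data, n_resamples=5000, seed=1):
    """Means of resamples drawn with replacement, each of the original size."""
    rng = random.Random(seed)
    n = len(data)
    return [fmean(rng.choices(data, k=n)) for _ in range(n_resamples)]


def skewness(values):
    """Moment-based skewness: average cubed standardized deviation."""
    m = fmean(values)
    s = pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])


# Hypothetical right-skewed sale prices in $1000s (not the Seattle data).
prices = [142, 155, 168, 175, 180, 189, 195, 205, 210, 222,
          230, 245, 259, 270, 299, 335, 380, 450, 710, 1850]

means = bootstrap_means(prices)
print(f"sample mean = {fmean(prices):.1f}")
print(f"skewness of bootstrap distribution = {skewness(means):.2f}")
```

A clearly positive skewness for the resample means is the numerical counterpart of the right-skewed histogram: it signals that a normal approximation to the sampling distribution is questionable.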

We generally prefer resistant measures of center such as the median or trimmed mean for skewed data. Accordingly, in Example 14.5 (page 14-18) we bootstrapped the 25% trimmed mean. However, the mean is easily understood by the public and is needed for some purposes, such as projecting taxes based on total sales value.
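
The trade-off between the two measures can be illustrated with a short Python sketch. It assumes that the 25% trimmed mean is the mean of the middle half of the sorted observations; the `trimmed_mean` helper and the `prices` values (in $1000s) are hypothetical and are not the Seattle data.

```python
from statistics import fmean


def trimmed_mean(data, trim=0.25):
    """Mean after dropping the lowest and highest `trim` share of the data."""
    ordered = sorted(data)
    k = int(len(ordered) * trim)
    return fmean(ordered[k:len(ordered) - k])


# Hypothetical sale prices in $1000s, with and without one very large sale.
prices = [150, 180, 200, 210, 230, 260, 300, 420]
with_outlier = prices + [1850]

for label, values in [("without outlier", prices), ("with outlier", with_outlier)]:
    print(f"{label}: mean = {fmean(values):.1f}, "
          f"25% trimmed mean = {trimmed_mean(values):.1f}")

# The mean times the number of sales recovers the total sales value,
# which is why the mean is needed for projecting totals.
total = fmean(with_outlier) * len(with_outlier)
print(f"total sales value = {total:.0f}")
```

The single large sale moves the mean far more than the trimmed mean, yet only the mean scales back up to the total.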

The bootstrap t and percentile intervals aren’t reliable when the sampling distribution of the statistic is skewed. Figure 14.18 shows software output that includes all four of the confidence intervals we have mentioned, along with the traditional one-sample t interval. The BCa interval is

(329.3 − 62.2, 329.3 + 127.0) = (267.1, 456.3)
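
As a quick check of this arithmetic, the short Python sketch below recomputes the endpoints from the numbers quoted in the text and compares the two distances from the sample mean. The variable names are illustrative only.

```python
# Values quoted in the text, in $1000s.
sample_mean = 329.3
distance_below = 62.2   # distance from the mean down to the lower endpoint
distance_above = 127.0  # distance from the mean up to the upper endpoint

lower = round(sample_mean - distance_below, 1)
upper = round(sample_mean + distance_above, 1)
print((lower, upper))  # (267.1, 456.3)

# The interval reaches about twice as far above the mean as below it,
# reflecting the right skew of the bootstrap distribution.
print(round(distance_above / distance_below, 2))  # 2.04
```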

[Figure: histogram of resample means, in $1000s, on a horizontal axis from 200 to 500, with the observed mean marked.]

FIGURE 14.17 The bootstrap distribution of the sample means of 5000 resamples from the data in Table 14.1, for Example 14.10. The bootstrap distribution is right-skewed, so we conclude that the sampling distribution of x̄ is right-skewed as well.
