Chapter 14 - Bootstrap Methods and Permutation Tests - WH Freeman
TRIMMED MEAN
A trimmed mean is the mean of only the center observations in a data
set. In particular, the 25% trimmed mean $\bar{x}_{25\%}$ ignores the smallest
25% and the largest 25% of the observations. It is the mean of the
middle 50% of the observations.
Recall that the median is the mean of the 1 or 2 middle observations. The
trimmed mean often does a better job of representing the average of typical
observations than does the median. Our parameter is the 25% trimmed mean
of the population of all real estate sales prices in Seattle in 2002. By the plug-in
principle, the statistic that estimates this parameter is the 25% trimmed mean
of the sample prices in Table 14.1. Because 25% of 50 is 12.5, we drop the 12
lowest and 12 highest prices in Table 14.1 and find the mean of the remaining
26 prices. The statistic is (in thousands of dollars)
$\bar{x}_{25\%} = 244.0019$
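The trimming arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the S-PLUS routine used in the text: the function name `trimmed_mean` is hypothetical, the rule of rounding 12.5 down to 12 follows the worked example, and the integers 1 to 50 are stand-in data because the prices of Table 14.1 are not reproduced here.

```python
import math


def trimmed_mean(values, proportion=0.25):
    """Mean of the observations left after dropping floor(proportion * n)
    of the smallest and the same number of the largest values."""
    ordered = sorted(values)
    n = len(ordered)
    k = math.floor(proportion * n)  # for n = 50: floor(12.5) = 12
    kept = ordered[k:n - k]         # for n = 50: the middle 26 values
    if not kept:
        raise ValueError("proportion is too large: no observations remain")
    return sum(kept) / len(kept)


# Stand-in data (an assumption): the integers 1..50, NOT the Table 14.1 prices.
sample = list(range(1, 51))
n_kept = len(sample) - 2 * math.floor(0.25 * len(sample))
print(n_kept)                # number of observations that are averaged
print(trimmed_mean(sample))  # mean of 13, 14, ..., 38
```

With 50 observations the function averages 26 values, matching the count in the text.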
We can say little about the sampling distribution of the trimmed mean
when we have only 50 observations from a strongly skewed distribution. Fortunately,
we don’t need any distribution facts to use the bootstrap. We bootstrap
the 25% trimmed mean just as we bootstrapped the sample mean: draw
1000 resamples of size 50 from the 50 selling prices in Table 14.1, calculate
the 25% trimmed mean for each resample, and form the bootstrap distribution
from these 1000 values.
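The resampling procedure just described might look like the following in Python. It is a sketch under stated assumptions: the helper names `trimmed_mean` and `bootstrap_distribution` are hypothetical, the seed is arbitrary, and the 50 "prices" are made-up right-skewed numbers rather than the data of Table 14.1, so the output will not match the S-PLUS results in the text.

```python
import random


def trimmed_mean(values, proportion=0.25):
    """Mean after dropping int(proportion * n) values from each end."""
    ordered = sorted(values)
    k = int(proportion * len(ordered))
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def bootstrap_distribution(sample, statistic, n_resamples=1000, seed=0):
    """Draw n_resamples resamples with replacement, each the same size as
    the original sample, and return the statistic computed on each one."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(n_resamples)]


# Made-up right-skewed data (an assumption), NOT the Table 14.1 prices.
data_rng = random.Random(14)
prices = [data_rng.lognormvariate(5.5, 0.6) for _ in range(50)]

boot = bootstrap_distribution(prices, trimmed_mean)
print(len(boot))  # one trimmed mean per resample
```

Passing the statistic as a function keeps the resampling loop unchanged whether we bootstrap the mean, the trimmed mean, or another statistic.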
Figure 14.7 shows the bootstrap distribution of the 25% trimmed mean.
Here is the summary output from S-PLUS:

Number of Replications: 1000
Summary Statistics:
           Observed   Mean    Bias     SE
TrimMean        244  244.7  0.7171  16.83
What do we see? Shape: The bootstrap distribution is roughly normal. This
suggests that the sampling distribution of the trimmed mean is also roughly
normal. Center: The bootstrap estimate of bias is 0.7171, small relative to the
value 244 of the statistic. So the statistic (the trimmed mean of the sample)
has small bias as an estimate of the parameter (the trimmed mean of the population).
Spread: The bootstrap standard error of the statistic is
$\mathrm{SE}_{\mathrm{boot}} = 16.83$
This is an estimate of the standard deviation of the sampling distribution of
the trimmed mean.
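The Mean, Bias, and SE columns of such a summary can be computed from the bootstrap values as sketched below. The function name `bootstrap_summary` is hypothetical, and the sketch assumes that bias is the mean of the bootstrap distribution minus the observed statistic and that the standard error is the sample standard deviation (divisor B - 1) of the bootstrap values. The input numbers are a tiny made-up example, not the 1000 values behind Figure 14.7.

```python
import statistics


def bootstrap_summary(observed, boot_values):
    """Return (mean, bias, standard error) of a bootstrap distribution.

    bias = mean of the bootstrap values - observed statistic
    SE   = sample standard deviation of the bootstrap values
    """
    boot_mean = statistics.mean(boot_values)
    bias = boot_mean - observed
    se_boot = statistics.stdev(boot_values)
    return boot_mean, bias, se_boot


# Tiny made-up illustration (an assumption), far fewer than 1000 resamples.
boot_mean, bias, se_boot = bootstrap_summary(10.0, [9.0, 10.0, 11.0, 12.0])
print(boot_mean, bias, se_boot)
```

The same relation is visible in the S-PLUS output: the bootstrap mean 244.7 minus the observed value 244 is about 0.7, in line with the reported bias of 0.7171, which is computed from unrounded values.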
Recall the familiar one-sample t confidence interval (page 452) for the
mean of a normal population:
$\bar{x} \pm t^{*}\,\mathrm{SE} = \bar{x} \pm t^{*}\,\dfrac{s}{\sqrt{n}}$
This interval is based on the normal sampling distribution of the sample mean
$\bar{x}$ and the formula $\mathrm{SE} = s/\sqrt{n}$ for the standard error of $\bar{x}$. When a bootstrap