01.04.2014 Views

Chapter 14 - Bootstrap Methods and Permutation Tests - WH Freeman



14-16 CHAPTER 14 Bootstrap Methods and Permutation Tests

TRIMMED MEAN

A trimmed mean is the mean of only the center observations in a data set. In particular, the 25% trimmed mean $\bar{x}_{25\%}$ ignores the smallest 25% and the largest 25% of the observations. It is the mean of the middle 50% of the observations.

Recall that the median is the mean of the 1 or 2 middle observations. The trimmed mean often does a better job of representing the average of typical observations than does the median. Our parameter is the 25% trimmed mean of the population of all real estate sales prices in Seattle in 2002. By the plug-in principle, the statistic that estimates this parameter is the 25% trimmed mean of the sample prices in Table 14.1. Because 25% of 50 is 12.5, we drop the 12 lowest and 12 highest prices in Table 14.1 and find the mean of the remaining 26 prices. The statistic is (in thousands of dollars)

$$\bar{x}_{25\%} = 244.0019$$
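
The trimming arithmetic can be sketched in Python. This is only an illustration, not the S-PLUS code behind the value above: the function name `trimmed_mean` and the short `prices` list are hypothetical stand-ins, since the Table 14.1 data are not reproduced here. The sketch assumes the convention used in the text, dropping the whole-number part of 25% of n from each end (12 of 50), and it assumes a trimming proportion below 0.5 so that some observations remain.

```python
import math

def trimmed_mean(values, proportion=0.25):
    """Mean of the observations left after dropping floor(proportion * n)
    values from each end of the sorted data (assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    k = math.floor(proportion * n)   # e.g. floor(0.25 * 50) = 12
    kept = ordered[k:n - k]          # the middle observations
    return sum(kept) / len(kept)

# Hypothetical prices in thousands of dollars; NOT the Table 14.1 data.
prices = [95, 120, 150, 180, 210, 240, 260, 310, 450, 1850]

# With n = 10, two values are dropped from each end and the middle six averaged.
print(trimmed_mean(prices))  # 225.0
```

The single extreme price of 1850 has no effect on the result, which is the sense in which the trimmed mean reflects typical observations.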

We can say little about the sampling distribution of the trimmed mean when we have only 50 observations from a strongly skewed distribution. Fortunately, we don’t need any distribution facts to use the bootstrap. We bootstrap the 25% trimmed mean just as we bootstrapped the sample mean: draw 1000 resamples of size 50 from the 50 selling prices in Table 14.1, calculate the 25% trimmed mean for each resample, and form the bootstrap distribution from these 1000 values.
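
The resampling procedure just described might look like the following in Python. It is a sketch under stated assumptions rather than the S-PLUS routine used for the figure: `bootstrap_distribution` and `trimmed_mean` are hypothetical helper names, the seed values are arbitrary, and the 50 "prices" are simulated from a skewed distribution because the Table 14.1 data are not available here.

```python
import math
import random

def trimmed_mean(values, proportion=0.25):
    """Drop floor(proportion * n) values from each end, average the rest."""
    ordered = sorted(values)
    k = math.floor(proportion * len(ordered))
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

def bootstrap_distribution(sample, statistic, n_resamples=1000, seed=0):
    """Draw resamples with replacement, each the same size as the sample,
    and return the statistic computed on every resample."""
    rng = random.Random(seed)        # fixed seed only for reproducibility
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(n_resamples)]

# Simulated, strongly skewed sample of 50 values; NOT the Table 14.1 data.
data_rng = random.Random(1)
sample = [round(100 * data_rng.lognormvariate(1.0, 0.6), 1) for _ in range(50)]

boot = bootstrap_distribution(sample, trimmed_mean)
print(len(boot))   # 1000 bootstrap values of the 25% trimmed mean
```

Passing the statistic as a function keeps the resampling loop identical for the sample mean, the trimmed mean, or any other statistic, which mirrors the point that the bootstrap needs no distribution facts.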

Figure 14.7 shows the bootstrap distribution of the 25% trimmed mean. Here is the summary output from S-PLUS:

    Number of Replications: 1000

    Summary Statistics:
              Observed   Mean    Bias     SE
    TrimMean       244  244.7  0.7171  16.83

What do we see? Shape: The bootstrap distribution is roughly normal. This suggests that the sampling distribution of the trimmed mean is also roughly normal. Center: The bootstrap estimate of bias is 0.7171, small relative to the value 244 of the statistic. So the statistic (the trimmed mean of the sample) has small bias as an estimate of the parameter (the trimmed mean of the population). Spread: The bootstrap standard error of the statistic is

$$SE_{\text{boot}} = 16.83$$

This is an estimate of the standard deviation of the sampling distribution of the trimmed mean.
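
The center and spread figures follow from simple arithmetic on the bootstrap values: the bias estimate is the mean of the bootstrap distribution minus the observed statistic (in the output above, 244.7 minus 244 is about 0.7), and the standard error is the standard deviation of the bootstrap values. The sketch below shows that arithmetic. The function name `bootstrap_summary` and the five replicate values are hypothetical, and using the n − 1 divisor for the standard deviation is an assumption about the convention.

```python
import statistics

def bootstrap_summary(observed, replicates):
    """Summarize a bootstrap distribution: mean, bias estimate, and SE."""
    boot_mean = statistics.mean(replicates)
    return {
        "observed": observed,
        "mean": boot_mean,
        "bias": boot_mean - observed,          # bootstrap estimate of bias
        "se": statistics.stdev(replicates),    # n - 1 divisor (assumed)
    }

# Hypothetical replicate values for illustration; a real run would use
# all 1000 bootstrap trimmed means.
replicates = [240.0, 250.0, 235.0, 255.0, 245.0]
summary = bootstrap_summary(244.0, replicates)
print(summary["mean"], summary["bias"])  # 245.0 1.0
```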

Recall the familiar one-sample t confidence interval (page 452) for the mean of a normal population:

$$\bar{x} \pm t^{*}\,SE = \bar{x} \pm t^{*}\,\frac{s}{\sqrt{n}}$$

This interval is based on the normal sampling distribution of the sample mean $\bar{x}$ and the formula $SE = s/\sqrt{n}$ for the standard error of $\bar{x}$. When a bootstrap
