Chapter 14 - Bootstrap Methods and Permutation Tests - WH Freeman
where \(SE_{\text{boot}}\) is the bootstrap standard error for this statistic and \(t^*\) is
the critical value of the \(t(n - 1)\) distribution with area \(C\) between \(-t^*\)
and \(t^*\).
EXAMPLE 14.5
We want to estimate the 25% trimmed mean of the population of all
2002 Seattle real estate selling prices. Table 14.1 gives an SRS of size
n = 50. The software output above shows that the trimmed mean of this sample is
\(\bar{x}_{25\%} = 244\) and that the bootstrap standard error of this statistic is
\(SE_{\text{boot}} = 16.83\). A 95% confidence interval for the population trimmed mean is therefore

\[
\bar{x}_{25\%} \pm t^* \, SE_{\text{boot}} = 244 \pm (2.009)(16.83) = 244 \pm 33.81 = (210.19,\ 277.81)
\]

Because Table D does not have entries for n − 1 = 49 degrees of freedom, we used
\(t^* = 2.009\), the entry for 50 degrees of freedom.
We are 95% confident that the 25% trimmed mean (the mean of the middle 50%)
for the population of real estate sales in Seattle in 2002 is between $210,190 and
$277,810.
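The calculation in Example 14.5 can be sketched in a few lines of Python. This is only an illustration, not the software used in the text: the function names (`trimmed_mean`, `bootstrap_se`, `bootstrap_t_interval`), the default of 1000 resamples, and the rule of dropping `int(n * trim)` observations from each end are assumptions made here. The critical value t* is passed in by hand, as it would be when read from Table D.

```python
import random
import statistics

def trimmed_mean(values, trim=0.25):
    """Mean of what remains after dropping the lowest and highest `trim` fraction.

    Assumption: int(n * trim) observations are dropped from each end;
    other software may round differently.
    """
    ordered = sorted(values)
    k = int(len(ordered) * trim)  # number of observations dropped from each end
    return statistics.mean(ordered[k:len(ordered) - k])

def bootstrap_se(sample, statistic, n_resamples=1000, seed=0):
    """Standard deviation of the statistic over resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    boot_stats = [statistic(rng.choices(sample, k=n)) for _ in range(n_resamples)]
    return statistics.stdev(boot_stats)

def bootstrap_t_interval(sample, statistic, t_star, n_resamples=1000, seed=0):
    """Interval: statistic +/- t_star * SE_boot, with t_star supplied by the caller."""
    estimate = statistic(sample)
    margin = t_star * bootstrap_se(sample, statistic, n_resamples, seed)
    return estimate - margin, estimate + margin

# Arithmetic of Example 14.5, using only the values reported in the text.
estimate, se_boot, t_star = 244, 16.83, 2.009
margin = t_star * se_boot
print(round(margin, 2), round(estimate - margin, 2), round(estimate + margin, 2))
# prints: 33.81 210.19 277.81
```

Given the 50 selling prices of Table 14.1 in a list, a call such as `bootstrap_t_interval(prices, trimmed_mean, t_star=2.009)` would produce an interval of the same form. Its endpoints would differ slightly from those in the example, because the bootstrap standard error depends on the random resamples drawn.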
Bootstrapping to compare two groups
Two-sample problems (Section 7.2) are among the most common statistical
settings. In a two-sample problem, we wish to compare two populations, such
as male and female college students, based on separate samples from each
population. When both populations are roughly normal, the two-sample t procedures
compare the two population means. The bootstrap can also compare
two populations, without the normality condition and without the restriction
to comparison of means. The most important new idea is that bootstrap resampling
must mimic the “separate samples” design that produced the original
data.
BOOTSTRAP FOR COMPARING TWO POPULATIONS
Given independent SRSs of sizes n and m from two populations:
1. Draw a resample of size n with replacement from the first sample
and a separate resample of size m from the second sample. Compute a
statistic that compares the two groups, such as the difference between
the two sample means.
2. Repeat this resampling process hundreds of times.
3. Construct the bootstrap distribution of the statistic. Inspect its
shape, bias, and bootstrap standard error in the usual way.
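The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `two_sample_bootstrap`, the dictionary it returns, and the default of 1000 resamples are choices made here, and the difference in sample means is used only because the box names it as one possible statistic. The key point is that each group is resampled separately, at its own size.

```python
import random
import statistics

def two_sample_bootstrap(first, second, n_resamples=1000, seed=0):
    """Bootstrap the difference in means of two independent samples."""
    rng = random.Random(seed)
    n, m = len(first), len(second)
    diffs = []
    for _ in range(n_resamples):
        # Step 1: resample each group separately, with replacement,
        # keeping the original sizes n and m.
        resample_1 = rng.choices(first, k=n)
        resample_2 = rng.choices(second, k=m)
        diffs.append(statistics.mean(resample_1) - statistics.mean(resample_2))
    # Step 3: summarize the bootstrap distribution collected in step 2.
    observed = statistics.mean(first) - statistics.mean(second)
    return {
        "observed": observed,                        # statistic on the original data
        "bias": statistics.mean(diffs) - observed,   # bootstrap estimate of bias
        "se_boot": statistics.stdev(diffs),          # bootstrap standard error
        "distribution": diffs,                       # values to plot for shape
    }

# Made-up numbers, purely to show the call; they are not data from the text.
group_a = [12.1, 9.8, 11.4, 10.7, 13.0, 9.5]
group_b = [8.9, 10.2, 7.6, 9.1]
result = two_sample_bootstrap(group_a, group_b)
print(result["observed"], result["bias"], result["se_boot"])
```

Pooling the two samples before resampling would not follow the "separate samples" design, which is why the sketch draws from `first` and `second` independently. The shape of the bootstrap distribution would then be inspected from `result["distribution"]`, for example with a histogram.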