01.04.2014 Views

Chapter 14 - Bootstrap Methods and Permutation Tests - WH Freeman



where $\mathrm{SE}_{\mathrm{boot}}$ is the bootstrap standard error for this statistic and $t^*$ is the critical value of the $t(n - 1)$ distribution with area $C$ between $-t^*$ and $t^*$.

EXAMPLE 14.5

We want to estimate the 25% trimmed mean of the population of all 2002 Seattle real estate selling prices. Table 14.1 gives an SRS of size n = 50. The software output above shows that the trimmed mean of this sample is $\bar{x}_{25\%} = 244$ and that the bootstrap standard error of this statistic is $\mathrm{SE}_{\mathrm{boot}} = 16.83$. A 95% confidence interval for the population trimmed mean is therefore

$$
\begin{aligned}
\bar{x}_{25\%} \pm t^* \, \mathrm{SE}_{\mathrm{boot}} &= 244 \pm (2.009)(16.83) \\
&= 244 \pm 33.81 \\
&= (210.19,\ 277.81)
\end{aligned}
$$

Because Table D does not have entries for n − 1 = 49 degrees of freedom, we used $t^* = 2.009$, the entry for 50 degrees of freedom.

We are 95% confident that the 25% trimmed mean (the mean of the middle 50%) for the population of real estate sales in Seattle in 2002 is between $210,190 and $277,810.
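The interval arithmetic in Example 14.5 can be sketched in Python. This is a minimal illustration, not the software the text used: the function names `trimmed_mean`, `bootstrap_se`, and `t_interval` are hypothetical, and the critical value is passed in by hand (2.009, the Table D entry quoted above) rather than computed. The data of Table 14.1 are not reproduced here, so only the final arithmetic is run on the reported numbers.

```python
import random
import statistics


def trimmed_mean(values, proportion=0.25):
    """Mean of the observations left after dropping `proportion` from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of observations cut from each tail
    return statistics.mean(ordered[k:len(ordered) - k])


def bootstrap_se(sample, statistic, n_resamples=1000, seed=0):
    """Standard deviation of the statistic over resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = [statistic(rng.choices(sample, k=n)) for _ in range(n_resamples)]
    return statistics.stdev(replicates)


def t_interval(estimate, se_boot, t_star):
    """Bootstrap t interval: estimate plus or minus t* times SE_boot."""
    margin = t_star * se_boot
    return estimate - margin, estimate + margin


# Reproduce the arithmetic of Example 14.5 from the reported values
# (units are thousands of dollars).
low, high = t_interval(244, 16.83, 2.009)
print(round(low, 2), round(high, 2))  # 210.19 277.81
```

With the actual sample in hand, `bootstrap_se(sample, trimmed_mean)` would supply the standard error in place of the reported 16.83; its value will vary slightly with the random seed and the number of resamples.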

Bootstrapping to compare two groups

Two-sample problems (Section 7.2) are among the most common statistical settings. In a two-sample problem, we wish to compare two populations, such as male and female college students, based on separate samples from each population. When both populations are roughly normal, the two-sample t procedures compare the two population means. The bootstrap can also compare two populations, without the normality condition and without the restriction to comparison of means. The most important new idea is that bootstrap resampling must mimic the “separate samples” design that produced the original data.

BOOTSTRAP FOR COMPARING TWO POPULATIONS

Given independent SRSs of sizes n and m from two populations:

1. Draw a resample of size n with replacement from the first sample and a separate resample of size m from the second sample. Compute a statistic that compares the two groups, such as the difference between the two sample means.

2. Repeat this resampling process hundreds of times.

3. Construct the bootstrap distribution of the statistic. Inspect its shape, bias, and bootstrap standard error in the usual way.
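The three steps above can be sketched in Python using only the standard library. This is an illustrative sketch under stated assumptions: the function names `two_sample_bootstrap` and `mean_difference` are hypothetical, and the two small data sets are made up purely to show the mechanics. The key point is that each group is resampled separately, at its own size, so the resampling mimics the separate-samples design.

```python
import random
import statistics


def two_sample_bootstrap(first, second, statistic, n_resamples=1000, seed=0):
    """Bootstrap distribution of a statistic that compares two independent samples."""
    rng = random.Random(seed)
    n, m = len(first), len(second)
    replicates = []
    for _ in range(n_resamples):
        resample_1 = rng.choices(first, k=n)   # size n, with replacement, first sample only
        resample_2 = rng.choices(second, k=m)  # size m, with replacement, second sample only
        replicates.append(statistic(resample_1, resample_2))
    return replicates


def mean_difference(a, b):
    """Difference between the two sample means."""
    return statistics.mean(a) - statistics.mean(b)


# Hypothetical data, invented for illustration only.
group_a = [12.1, 9.8, 11.4, 10.7, 13.2, 9.5, 10.9, 12.6]
group_b = [8.9, 10.2, 9.1, 7.8, 9.9, 8.4]

boot = two_sample_bootstrap(group_a, group_b, mean_difference)
observed = mean_difference(group_a, group_b)
bias = statistics.mean(boot) - observed  # bootstrap estimate of bias
se_boot = statistics.stdev(boot)         # bootstrap standard error
print(len(boot), se_boot > 0)
```

Any function of two samples can be passed as `statistic`, for example a difference of trimmed means or a ratio of standard deviations, since the bootstrap is not restricted to comparing means. Inspecting the shape of the distribution (step 3) would additionally call for a histogram or normal quantile plot of `boot`.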
