01.04.2014 Views

Chapter 14 - Bootstrap Methods and Permutation Tests - WH Freeman



14-10 CHAPTER 14 Bootstrap Methods and Permutation Tests

BOOTSTRAP STANDARD ERROR

The bootstrap standard error SE_boot of a statistic is the standard deviation of the bootstrap distribution of that statistic.
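Written as a formula, the definition above might look as follows. This is a sketch under assumed notation: the symbols B (the number of resamples) and θ̂*_b (the statistic computed on the b-th resample) do not appear in the text, and the divisor B − 1 is the usual sample standard deviation convention, which some software may replace with B.

```latex
\mathrm{SE}_{\mathrm{boot}}
  = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}\left(\hat{\theta}^{*}_{b}-\bar{\theta}^{*}\right)^{2}},
\qquad
\bar{\theta}^{*} = \frac{1}{B}\sum_{b=1}^{B}\hat{\theta}^{*}_{b}
```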

Another thing that is new is that we don’t appeal to the central limit theorem or other theory to tell us that a sampling distribution is roughly normal. We look at the bootstrap distribution to see if it is roughly normal (or not). In most cases, the bootstrap distribution has approximately the same shape and spread as the sampling distribution, but it is centered at the original statistic value rather than the parameter value. The bootstrap allows us to calculate standard errors for statistics for which we don’t have formulas and to check normality for statistics that theory doesn’t easily handle.

To apply the bootstrap idea, we must start with a statistic that estimates the parameter we are interested in. We come up with a suitable statistic by appealing to another principle that we have often applied without thinking about it.

THE PLUG-IN PRINCIPLE

To estimate a parameter, a quantity that describes the population, use the statistic that is the corresponding quantity for the sample.

The plug-in principle tells us to estimate a population mean µ by the sample mean x̄ and a population standard deviation σ by the sample standard deviation s. Estimate a population median by the sample median and a population regression line by the least-squares line calculated from a sample. The bootstrap idea itself is a form of the plug-in principle: substitute the data for the population, then draw samples (resamples) to mimic the process of building a sampling distribution.
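The plug-in estimates named above can be illustrated with a short Python sketch. The function names `plug_in_estimates` and `least_squares_line` and the tiny data sets are hypothetical, chosen only for illustration; the text does not prescribe any particular software.

```python
import statistics


def plug_in_estimates(sample):
    """Estimate population quantities by the matching sample quantities."""
    return {
        "mean": statistics.mean(sample),      # estimates the population mean
        "sd": statistics.stdev(sample),       # sample standard deviation s
        "median": statistics.median(sample),  # estimates the population median
    }


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line fitted to the sample."""
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical samples, made up for illustration only.
estimates = plug_in_estimates([1, 2, 3, 4, 5])
slope, intercept = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
print(estimates)
print(slope, intercept)
```

Each function simply computes on the sample the same quantity that defines the parameter in the population, which is all the plug-in principle asks for.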

Using software<br />

Software is essential for bootstrapping in practice. Here is an outline of the program you would write if your software can choose random samples from a set of data but does not have bootstrap functions:

Repeat 1000 times {
    Draw a resample with replacement from the data.
    Calculate the resample mean.
    Save the resample mean into a variable.
}
Make a histogram and normal quantile plot of the 1000 means.
Calculate the standard deviation of the 1000 means.
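One way the outline above could be written is sketched below in Python, using only the standard library. The function name `bootstrap_means`, the `seed` argument, and the data values are assumptions made for illustration and are not from the text. The plotting step is left out so that the sketch has no external dependencies.

```python
import random
import statistics


def bootstrap_means(data, n_resamples=1000, seed=None):
    """Return the means of n_resamples resamples drawn with replacement."""
    rng = random.Random(seed)  # seed is optional, used only for reproducibility
    n = len(data)
    means = []
    for _ in range(n_resamples):
        # Draw a resample of the same size as the data, with replacement.
        resample = rng.choices(data, k=n)
        # Calculate the resample mean and save it.
        means.append(statistics.mean(resample))
    return means


# Hypothetical data, made up for illustration only.
data = [3.1, 0.5, 1.2, 4.8, 2.2, 0.9, 7.4, 1.6, 2.9, 0.3]

boot_means = bootstrap_means(data, n_resamples=1000, seed=14)

# The bootstrap standard error is the standard deviation of the 1000 means.
se_boot = statistics.stdev(boot_means)

print("sample mean:", statistics.mean(data))
print("bootstrap standard error of the mean:", se_boot)
```

The list `boot_means` is what the outline asks you to plot. With a plotting library available, a histogram and a normal quantile plot of these values would show whether the bootstrap distribution is roughly normal.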
