Chapter 14 - Bootstrap Methods and Permutation Tests - WH Freeman
14.33 The distribution of the 72 guinea pig lifetimes in Table 1.8 (page 38) is strongly
skewed. In Exercise 14.9 (page 14-23) you found a bootstrap t confidence interval
for the population mean µ, even though some skewness remains in the
bootstrap distribution. Bootstrap the mean lifetime and give all four bootstrap
95% confidence intervals: t, percentile, BCa, and tilting. Make a graphical
comparison by drawing a vertical line at the original sample mean x̄ and
displaying the four intervals horizontally, one above the other. Discuss what
you see: Do bootstrap t and percentile agree? Do the more accurate intervals
agree with the two simpler methods?
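As a rough illustration of the two simpler intervals and the stacked display, here is a minimal Python sketch. It assumes the chapter's definitions (bootstrap t as statistic ± t* × bootstrap standard error, percentile as the middle 95% of the bootstrap distribution). The `sample` values are made-up placeholders, not the Table 1.8 data, and the function names are hypothetical. BCa and tilting are not covered here.

```python
import random
import statistics

# Made-up, right-skewed placeholder values; NOT the guinea pig data from Table 1.8.
sample = [52, 61, 70, 74, 85, 91, 98, 104, 110, 123,
          137, 150, 168, 190, 215, 247, 301, 380, 455, 598]

def bootstrap_means(data, n_resamples=2000, seed=0):
    """Means of n_resamples resamples, each drawn with replacement at the original size."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(data, k=len(data)))
            for _ in range(n_resamples)]

def percentile_interval(boot_stats, level=0.95):
    """Cut off (1 - level)/2 of the sorted bootstrap statistics in each tail."""
    ordered = sorted(boot_stats)
    cut = round((1 - level) / 2 * len(ordered))
    return ordered[cut], ordered[len(ordered) - 1 - cut]

def bootstrap_t_interval(data, boot_stats, t_star):
    """Sample mean plus or minus t* times the bootstrap standard error."""
    center = statistics.fmean(data)
    se_boot = statistics.stdev(boot_stats)
    return center - t_star * se_boot, center + t_star * se_boot

def interval_rows(intervals, center, width=60):
    """One text row per interval, with '|' marking the original statistic."""
    left = min(lo for lo, _ in intervals.values())
    right = max(hi for _, hi in intervals.values())

    def column(value):
        return round((value - left) / (right - left) * (width - 1))

    rows = []
    for name, (lo, hi) in intervals.items():
        cells = [" "] * width
        for i in range(column(lo), column(hi) + 1):
            cells[i] = "-"
        cells[column(center)] = "|"
        rows.append(f"{name:>10} {''.join(cells)}")
    return rows

boot = bootstrap_means(sample)
x_bar = statistics.fmean(sample)
intervals = {
    # 2.093 is the upper 0.025 critical value of t with 19 degrees of freedom (n = 20).
    "t": bootstrap_t_interval(sample, boot, t_star=2.093),
    "percentile": percentile_interval(boot),
}
for row in interval_rows(intervals, x_bar):
    print(row)
```

The t interval is symmetric about x̄ by construction, so any skewness in the bootstrap distribution shows up only in the percentile row. For the full sample of 72 lifetimes, the critical value would have 71 degrees of freedom instead.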
14.34 We would like a 95% confidence interval for the standard deviation σ of
Seattle real estate prices. Your work in Exercise 14.11 probably suggests
that it is risky to bootstrap the sample standard deviation s from the sample
in Table 14.1 and use the bootstrap t interval. Now we have more accurate
methods. Bootstrap s and report all four bootstrap 95% confidence intervals:
t, percentile, BCa, and tilting. Make a graphical comparison by drawing a vertical
line at the original s and displaying the four intervals horizontally, one
above the other. Discuss what you see: Do bootstrap t and percentile agree?
Do the more accurate intervals agree with the two simpler methods? What
interval would you use in a report on real estate prices?
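For readers without software that reports BCa directly, the following Python sketch shows one common way to compute a BCa interval for s: a bias correction z0 from the share of bootstrap statistics below the original s, and an acceleration estimated from the jackknife. The `prices` values are made-up placeholders, not the Table 14.1 data, and `bca_interval` is a hypothetical helper, not the routine used in the text.

```python
import random
import statistics
from statistics import NormalDist

# Made-up, right-skewed placeholder prices (in $1000s); NOT the Table 14.1 data.
prices = [142, 175, 198, 210, 225, 232, 240, 255, 260, 269,
          285, 300, 315, 340, 372, 410, 465, 540, 710, 1370]

def bca_interval(data, stat, n_resamples=2000, level=0.95, seed=0):
    """Return (lower, upper, z0, acceleration) for a BCa interval of stat(data)."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = stat(data)
    boot = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))

    # Bias correction: normal quantile of the share of bootstrap values below theta_hat.
    below = sum(b < theta_hat for b in boot)
    if below == 0 or below == n_resamples:
        raise ValueError("bootstrap distribution lies entirely on one side of the statistic")
    z0 = NormalDist().inv_cdf(below / n_resamples)

    # Acceleration: skewness-type ratio of the leave-one-out (jackknife) statistics.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = statistics.fmean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den

    def adjusted_index(alpha):
        """Position in the sorted bootstrap values after the BCa adjustment."""
        z = NormalDist().inv_cdf(alpha)
        shifted = NormalDist().cdf(z0 + (z0 + z) / (1 - accel * (z0 + z)))
        return min(max(int(shifted * n_resamples), 0), n_resamples - 1)

    lower = boot[adjusted_index((1 - level) / 2)]
    upper = boot[adjusted_index((1 + level) / 2)]
    return lower, upper, z0, accel

s_hat = statistics.stdev(prices)
lower, upper, z0, accel = bca_interval(prices, statistics.stdev)
print(f"s = {s_hat:.1f}, BCa 95% interval = ({lower:.1f}, {upper:.1f})")
print(f"bias correction z0 = {z0:.3f}, acceleration = {accel:.3f}")
```

When z0 and the acceleration are both zero, the adjusted positions reduce to the ordinary 2.5% and 97.5% points, so the result matches the percentile interval. The further they are from zero, the more the BCa endpoints shift.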
CHALLENGE
14.35 Exercise 14.7 (page 14-13) gives an SRS of 20 of the 72 guinea pig survival
times in Table 1.8. The bootstrap distribution of x̄ from this sample is clearly
right-skewed. Give a 95% confidence interval for the population mean µ based
on these data and a method of your choice. Describe carefully how your result
differs from the intervals in Exercise 14.33, which use the full sample of 72
lifetimes.
14.36 The CLEC data for Example 14.6 are strongly skewed to the right. The 23
CLEC repair times appear in Exercise 14.22 (page 14-26).
(a) Bootstrap the mean of the data. Based on the bootstrap distribution,
which bootstrap confidence intervals would you consider for use? Explain
your answer.
(b) Find all four bootstrap confidence intervals. How do the intervals compare?
Briefly explain the reasons for any differences. In particular, what
kind of errors would you make in estimating the mean repair time for all
CLEC customers by using a t interval or percentile interval instead of a
tilting or BCa interval?
14.37 Example 14.6 (page 14-19) considers the mean difference between repair
times for Verizon (ILEC) customers and customers of competing carriers
(CLECs). The bootstrap distribution is nonnormal with strong left skewness,
so that any t confidence interval is inappropriate. Give the BCa 95% confidence
interval for the mean difference in service times for all customers. In
practical terms, what kind of error would you make by using a t interval or
percentile interval instead of a BCa interval?
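To see where the left skewness comes from, it can help to build the two-sample bootstrap distribution by hand. The Python sketch below resamples each group independently, takes the difference in means, and measures the skewness of the result. The `ilec` and `clec` values are made-up placeholders (a larger, mildly skewed group and a small group with one long repair time), not the Example 14.6 data, and the function names are hypothetical.

```python
import random
import statistics

# Made-up placeholder repair times; NOT the Verizon data from Example 14.6.
ilec = [2, 3, 3, 4, 4, 5, 5, 5, 6, 6, 7, 7, 8, 8, 9,
        9, 10, 11, 12, 14, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
clec = [4, 5, 6, 7, 8, 9, 11, 14, 20, 46]

def bootstrap_mean_differences(group_a, group_b, n_resamples=2000, seed=0):
    """Resample each group separately, with replacement, and record mean(a) - mean(b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.fmean(resample_a) - statistics.fmean(resample_b))
    return diffs

def skewness(values):
    """Third central moment divided by the 1.5 power of the second central moment."""
    center = statistics.fmean(values)
    m2 = statistics.fmean((v - center) ** 2 for v in values)
    m3 = statistics.fmean((v - center) ** 3 for v in values)
    return m3 / m2 ** 1.5

diffs = bootstrap_mean_differences(ilec, clec)
observed = statistics.fmean(ilec) - statistics.fmean(clec)
print(f"observed difference = {observed:.2f}")
print(f"bootstrap SE = {statistics.stdev(diffs):.2f}")
print(f"skewness of bootstrap distribution = {skewness(diffs):.2f}")
```

Because the small, right-skewed group is subtracted, its long right tail becomes a long left tail in the differences. A symmetric t interval cannot reflect that asymmetry, and this is the kind of error the exercise asks you to describe.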
14.38 Figure 2.3 (page 108) is a scatterplot of field versus laboratory measurements
of the depths of 100 defects in the Trans-Alaska Oil Pipeline. The correlation
is r = 0.944. Bootstrap the correlation for these data. (The data are in the file
ex14_038.dat.)
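Bootstrapping a correlation means resampling whole (field, laboratory) pairs, so that each resampled observation keeps its two measurements together. A minimal Python sketch follows. The `field` and `lab` values are made-up placeholders rather than the pipeline data, and the helper names are hypothetical.

```python
import math
import random
import statistics

# Made-up placeholder measurements; NOT the pipeline data from the exercise's data file.
field = [18, 22, 25, 30, 34, 38, 41, 45, 50, 54, 58, 63, 67, 72, 80]
lab = [20, 21, 28, 29, 37, 36, 44, 43, 52, 57, 55, 66, 64, 75, 78]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def bootstrap_correlations(xs, ys, n_resamples=2000, seed=0):
    """Resample row indices with replacement so each (x, y) pair stays intact."""
    rng = random.Random(seed)
    n = len(xs)
    results = []
    for _ in range(n_resamples):
        rows = [rng.randrange(n) for _ in range(n)]
        results.append(pearson_r([xs[i] for i in rows], [ys[i] for i in rows]))
    return results

r_obs = pearson_r(field, lab)
boot_r = bootstrap_correlations(field, lab)
print(f"r = {r_obs:.3f}")
print(f"bootstrap SE of r = {statistics.stdev(boot_r):.4f}")
print(f"bootstrap mean of r = {statistics.fmean(boot_r):.3f}")
```

Resampling the two columns separately would destroy the pairing and push the correlation toward zero, which is why the indices are drawn once per resample and applied to both variables.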