01.04.2014 Views

Chapter 14 - Bootstrap Methods and Permutation Tests - WH Freeman


14-44 CHAPTER 14 Bootstrap Methods and Permutation Tests

14.33 The distribution of the 72 guinea pig lifetimes in Table 1.8 (page 38) is strongly skewed. In Exercise 14.9 (page 14-23) you found a bootstrap t confidence interval for the population mean µ, even though some skewness remains in the bootstrap distribution. Bootstrap the mean lifetime and give all four bootstrap 95% confidence intervals: t, percentile, BCa, and tilting. Make a graphical comparison by drawing a vertical line at the original sample mean x̄ and displaying the four intervals horizontally, one above the other. Discuss what you see: Do bootstrap t and percentile agree? Do the more accurate intervals agree with the two simpler methods?
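
As a rough illustration of the two simpler intervals named in Exercise 14.33, the following Python sketch bootstraps a sample mean and forms the bootstrap t and percentile intervals. It is a sketch under stated assumptions, not the software the text has in mind: the function names are hypothetical, the data are synthetic placeholders (not the guinea pig lifetimes of Table 1.8), and the critical value t* is supplied by the caller from a t table. The BCa and tilting intervals are not computed here.

```python
import random
import statistics


def bootstrap_means(data, n_resamples=2000, seed=0):
    """Resample with replacement and return the mean of each resample."""
    rng = random.Random(seed)
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)]


def bootstrap_t_interval(data, boot_stats, t_star):
    """Original sample mean plus or minus t_star times the bootstrap standard error."""
    center = statistics.fmean(data)
    se_boot = statistics.stdev(boot_stats)  # spread of the bootstrap means
    return center - t_star * se_boot, center + t_star * se_boot


def percentile_interval(boot_stats, level=0.95):
    """Cut off equal tail fractions of the sorted bootstrap statistics."""
    ordered = sorted(boot_stats)
    k = round((1 - level) / 2 * len(ordered))
    return ordered[k], ordered[len(ordered) - 1 - k]


# Synthetic right-skewed placeholder data (an assumption, not the textbook data).
rng = random.Random(1)
sample = [rng.expovariate(1 / 100) for _ in range(20)]

boot = bootstrap_means(sample)
# t* = 2.093 is the 95% critical value for 19 degrees of freedom (n = 20).
print("bootstrap t:", bootstrap_t_interval(sample, boot, t_star=2.093))
print("percentile: ", percentile_interval(boot))
```

The bootstrap t interval is symmetric about x̄ by construction, while the percentile interval follows the shape of the bootstrap distribution, so the two tend to disagree when that distribution is skewed.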

14.34 We would like a 95% confidence interval for the standard deviation σ of Seattle real estate prices. Your work in Exercise 14.11 probably suggests that it is risky to bootstrap the sample standard deviation s from the sample in Table 14.1 and use the bootstrap t interval. Now we have more accurate methods. Bootstrap s and report all four bootstrap 95% confidence intervals: t, percentile, BCa, and tilting. Make a graphical comparison by drawing a vertical line at the original s and displaying the four intervals horizontally, one above the other. Discuss what you see: Do bootstrap t and percentile agree? Do the more accurate intervals agree with the two simpler methods? What interval would you use in a report on real estate prices?
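
The graphical comparison described in Exercises 14.33 and 14.34 can be approximated in plain text. The sketch below stacks the intervals one above the other and marks the original statistic with a vertical bar in the same column of every row. The function name `render_intervals` is hypothetical, and the endpoints in the usage lines are made-up placeholders chosen only to show the layout; they are not results for the Seattle data.

```python
def render_intervals(intervals, center, width=61):
    """Return one text row per interval; '|' marks the original statistic."""
    lo_all = min(min(lo for lo, _ in intervals.values()), center)
    hi_all = max(max(hi for _, hi in intervals.values()), center)
    span = (hi_all - lo_all) or 1.0  # avoid dividing by zero if everything coincides

    def col(x):
        # Map a value on the data scale to a character column.
        return round((x - lo_all) / span * (width - 1))

    label_width = max(len(name) for name in intervals)
    rows = []
    for name, (lo, hi) in intervals.items():
        cells = [" "] * width
        for c in range(col(lo), col(hi) + 1):
            cells[c] = "-"
        cells[col(center)] = "|"  # same column in every row
        rows.append(f"{name:<{label_width}} {''.join(cells)}")
    return rows


# Made-up endpoints, for layout only (not computed from any data set).
example = {
    "t": (10.0, 30.0),
    "percentile": (12.0, 33.0),
    "BCa": (13.0, 37.0),
    "tilting": (13.5, 36.0),
}
for row in render_intervals(example, center=20.0):
    print(row)
```

Because every row shares one scale, it is easy to see whether an interval is symmetric about the original statistic or reaches farther to one side.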

CHALLENGE

CHALLENGE

14.35 Exercise 14.7 (page 14-13) gives an SRS of 20 of the 72 guinea pig survival times in Table 1.8. The bootstrap distribution of x̄ from this sample is clearly right-skewed. Give a 95% confidence interval for the population mean µ based on these data and a method of your choice. Describe carefully how your result differs from the intervals in Exercise 14.33, which use the full sample of 72 lifetimes.

14.36 The CLEC data for Example 14.6 are strongly skewed to the right. The 23 CLEC repair times appear in Exercise 14.22 (page 14-26).

(a) Bootstrap the mean of the data. Based on the bootstrap distribution, which bootstrap confidence intervals would you consider for use? Explain your answer.

(b) Find all four bootstrap confidence intervals. How do the intervals compare? Briefly explain the reasons for any differences. In particular, what kind of errors would you make in estimating the mean repair time for all CLEC customers by using a t interval or percentile interval instead of a tilting or BCa interval?
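
For readers without software that reports BCa intervals directly, the following Python sketch shows one common way to compute a BCa interval for a one-sample statistic such as the mean in Exercise 14.36. It assumes the usual construction: a bias correction taken from the fraction of bootstrap statistics below the original estimate, and an acceleration estimated from leave-one-out (jackknife) statistics. The function name `bca_interval` and its parameters are hypothetical, and the example data are synthetic placeholders rather than the CLEC repair times.

```python
import random
from statistics import NormalDist, fmean


def bca_interval(data, stat=fmean, level=0.95, n_resamples=4000, seed=0):
    """Bias-corrected and accelerated (BCa) bootstrap interval, one-sample case."""
    data = list(data)
    rng = random.Random(seed)
    n = len(data)
    theta_hat = stat(data)
    boot = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))

    # Bias correction: how far the original estimate sits from the bootstrap median.
    below = sum(b < theta_hat for b in boot)
    # Keep the fraction strictly inside (0, 1) so the normal quantile is defined.
    frac = min(max(below / n_resamples, 1 / n_resamples), 1 - 1 / n_resamples)
    std_normal = NormalDist()
    z0 = std_normal.inv_cdf(frac)

    # Acceleration from the jackknife (leave-one-out) statistics.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = fmean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den else 0.0

    def adjusted_index(alpha):
        # Shift the nominal tail probability, then convert it to a sorted-list index.
        z = std_normal.inv_cdf(alpha)
        adj = std_normal.cdf(z0 + (z0 + z) / (1 - accel * (z0 + z)))
        return min(max(round(adj * n_resamples), 0), n_resamples - 1)

    alpha = (1 - level) / 2
    return boot[adjusted_index(alpha)], boot[adjusted_index(1 - alpha)]


# Synthetic right-skewed placeholder data (an assumption, not the CLEC data).
rng = random.Random(2)
sample = [rng.expovariate(1 / 15) for _ in range(23)]
print("BCa 95% interval for the mean:", bca_interval(sample))
```

With right-skewed data the acceleration is positive, which moves both adjusted percentiles upward relative to the plain 2.5% and 97.5% cutoffs of the percentile interval.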

14.37 Example 14.6 (page 14-19) considers the mean difference between repair times for Verizon (ILEC) customers and customers of competing carriers (CLECs). The bootstrap distribution is nonnormal with strong left skewness, so that any t confidence interval is inappropriate. Give the BCa 95% confidence interval for the mean difference in service times for all customers. In practical terms, what kind of error would you make by using a t interval or percentile interval instead of a BCa interval?
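
The two-sample setting of Exercise 14.37 changes the resampling step: each group is resampled separately, with its own sample size, before the difference in means is taken. The sketch below illustrates only that step, plus a moment-based skewness measure for judging the shape of the bootstrap distribution; it does not compute the BCa interval the exercise asks for. The function names are hypothetical and the two groups are made-up placeholders, not the Verizon repair times.

```python
import random
from statistics import fmean


def bootstrap_mean_differences(group_a, group_b, n_resamples=2000, seed=0):
    """Resample each group independently; return mean(a*) - mean(b*) per resample."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        a_star = rng.choices(group_a, k=len(group_a))
        b_star = rng.choices(group_b, k=len(group_b))
        diffs.append(fmean(a_star) - fmean(b_star))
    return diffs


def skewness(values):
    """Moment-based skewness: negative for a long left tail, positive for a long right tail."""
    center = fmean(values)
    m2 = fmean((v - center) ** 2 for v in values)
    m3 = fmean((v - center) ** 3 for v in values)
    return m3 / m2 ** 1.5


# Made-up placeholder groups: a large first group and a small, right-skewed second group.
rng = random.Random(3)
large_group = [rng.expovariate(1 / 8) for _ in range(200)]
small_group = [rng.expovariate(1 / 16) for _ in range(23)]

diffs = bootstrap_mean_differences(large_group, small_group)
print("skewness of bootstrap differences:", round(skewness(diffs), 3))
```

When the second group is small and right-skewed, subtracting its mean tends to give the differences a long left tail, which is the situation the exercise describes.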

14.38 Figure 2.3 (page 108) is a scatterplot of field versus laboratory measurements of the depths of 100 defects in the Trans-Alaska Oil Pipeline. The correlation is r = 0.944. Bootstrap the correlation for these data. (The data are in the file ex14_038.dat.)
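
Bootstrapping a correlation, as in Exercise 14.38, requires resampling the (x, y) pairs together so that each resampled point keeps its own two measurements. The Python sketch below shows that idea. The function names are hypothetical, the data are synthetic placeholders rather than the pipeline measurements, and reading the data file is left out because its layout is not given here.

```python
import math
import random
import statistics


def pearson_r(xs, ys):
    """Sample correlation coefficient of paired observations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


def bootstrap_correlations(xs, ys, n_resamples=2000, seed=0):
    """Resample whole (x, y) pairs with replacement and recompute r each time."""
    rng = random.Random(seed)
    pairs = list(zip(xs, ys))
    results = []
    for _ in range(n_resamples):
        resample = rng.choices(pairs, k=len(pairs))
        rx, ry = zip(*resample)
        results.append(pearson_r(rx, ry))
    return results


# Synthetic placeholder data: 100 noisy paired measurements (not the pipeline data).
rng = random.Random(4)
field = [rng.uniform(10, 80) for _ in range(100)]
lab = [x + rng.gauss(0, 7) for x in field]

boot_r = bootstrap_correlations(field, lab)
print("original r:", round(pearson_r(field, lab), 3))
print("bootstrap standard error of r:", round(statistics.stdev(boot_r), 4))
```

Resampling x and y separately would break the pairing and push the bootstrap correlations toward zero, so the pairs must stay intact.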
