13.11.2012 Views

Introduction to Categorical Data Analysis


4.2 INFERENCE FOR LOGISTIC REGRESSION

Exponentiating the endpoints yields an interval for $e^{\beta}$, the multiplicative effect on the odds of a 1-unit increase in $x$.

When $n$ is small or fitted probabilities are mainly near 0 or 1, it is preferable to construct a confidence interval based on the likelihood-ratio test. This interval contains all the $\beta_0$ values for which the likelihood-ratio test of $H_0$: $\beta = \beta_0$ has $P$-value $> \alpha$. Some software can report this (such as PROC GENMOD in SAS with its LRCI option).
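The definition above can be written compactly as a set. This is only a sketch under stated assumptions: the notation $L(\beta_0)$ for the maximized log-likelihood with $\beta$ fixed at $\beta_0$, $L(\hat{\beta})$ for the unrestricted maximum, and $\chi^2_{1}(\alpha)$ for the upper-$\alpha$ quantile of the chi-squared distribution with df = 1 (3.84 when $\alpha = 0.05$) is introduced here for illustration and is not taken from the text.

```latex
\left\{ \beta_0 \;:\; -2\left[ L(\beta_0) - L(\hat{\beta}) \right] < \chi^2_{1}(\alpha) \right\}
```

A value $\beta_0$ belongs to the interval exactly when the likelihood-ratio statistic for testing $\beta = \beta_0$ falls below the chi-squared cutoff, which is the same as the test having $P$-value $> \alpha$.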

For the logistic regression analysis of the horseshoe crab data, the estimated effect of width on the probability of a satellite is $\hat{\beta} = 0.497$, with $SE = 0.102$. A 95% Wald confidence interval for $\beta$ is $0.497 \pm 1.96(0.102)$, or $(0.298, 0.697)$. The likelihood-ratio-based confidence interval is $(0.308, 0.709)$. The likelihood-ratio interval for the effect on the odds per cm increase in width equals $(e^{0.308}, e^{0.709}) = (1.36, 2.03)$. We infer that a 1 cm increase in width corresponds to at least a 36 percent increase and at most a doubling of the odds that a female crab has a satellite.
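The arithmetic in this paragraph can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' code: the helper names `wald_interval` and `odds_interval` are hypothetical, and the inputs are the rounded values quoted in the text. With those rounded inputs the lower Wald endpoint comes out near 0.297 rather than the 0.298 reported, presumably because the text works from unrounded estimates.

```python
import math

def wald_interval(estimate, se, z=1.96):
    """Wald interval: estimate +/- z * SE (z = 1.96 for 95% confidence)."""
    half_width = z * se
    return estimate - half_width, estimate + half_width

def odds_interval(lower, upper):
    """Exponentiate log-odds endpoints to get an interval for exp(beta)."""
    return math.exp(lower), math.exp(upper)

# Rounded values quoted in the text for the horseshoe crab fit.
beta_hat, se = 0.497, 0.102

wald = wald_interval(beta_hat, se)        # interval for beta
wald_odds = odds_interval(*wald)          # interval for exp(beta), Wald-based
lr_odds = odds_interval(0.308, 0.709)     # likelihood-ratio interval from the text

print("Wald interval for beta:      (%.3f, %.3f)" % wald)
print("Wald interval for odds:      (%.2f, %.2f)" % wald_odds)
print("LR-based interval for odds:  (%.2f, %.2f)" % lr_odds)
```

Exponentiating preserves the ordering of the endpoints because the exponential function is increasing, so no re-sorting is needed.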

From Section 4.1.1, a simpler interpretation uses a straight-line approximation to the logistic regression curve. The term $\beta\pi(x)[1 - \pi(x)]$ approximates the change in the probability per 1-unit increase in $x$. For instance, at $\pi(x) = 0.50$, the estimated rate of change is $0.25\hat{\beta} = 0.124$. A 95% confidence interval for $0.25\beta$ equals 0.25 times the endpoints of the interval for $\beta$. For the likelihood-ratio interval, this is $[0.25(0.308), 0.25(0.709)] = (0.077, 0.177)$. So, if the logistic regression model holds, then for values of $x$ near the width value at which $\pi(x) = 0.50$, we infer that the rate of increase in the probability of a satellite per centimeter increase in width falls between about 0.08 and 0.18.
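As a quick check of the straight-line approximation, the following sketch evaluates the slope term at a given probability and scales the likelihood-ratio endpoints quoted in the text. The function name `rate_of_change` is a hypothetical helper introduced for illustration.

```python
def rate_of_change(beta, prob):
    """Approximate change in probability per 1-unit increase in x:
    beta * pi(x) * (1 - pi(x))."""
    return beta * prob * (1.0 - prob)

# Estimated slope where pi(x) = 0.50, using the estimate quoted in the text.
slope_at_half = rate_of_change(0.497, 0.5)

# Scale the likelihood-ratio interval for beta by pi(1 - pi) = 0.25.
lr_lower, lr_upper = 0.308, 0.709
slope_interval = (rate_of_change(lr_lower, 0.5),
                  rate_of_change(lr_upper, 0.5))

print("slope at pi = 0.50: %.4f" % slope_at_half)
print("interval for slope: (%.4f, %.4f)" % slope_interval)
```

The factor $\pi(x)[1 - \pi(x)]$ is largest at $\pi(x) = 0.50$, so the interval above describes the steepest part of the curve; at other probabilities the same function gives a smaller slope.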

4.2.3 Significance Testing

For the logistic regression model, $H_0$: $\beta = 0$ states that the probability of success is independent of $X$. Wald test statistics (Section 1.4.1) are simple. For large samples,

$$z = \hat{\beta}/SE$$

has a standard normal distribution when $\beta = 0$. Refer $z$ to the standard normal table to get a one-sided or two-sided $P$-value. Equivalently, for the two-sided $H_a$: $\beta \neq 0$, $z^2 = (\hat{\beta}/SE)^2$ has a large-sample chi-squared null distribution with df = 1.
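The Wald test can be sketched with the Python standard library alone. The function name `wald_test` is a hypothetical helper, and the normal tail probability is obtained from `math.erfc` rather than a printed table; the inputs are the rounded horseshoe crab values ($\hat{\beta} = 0.497$, $SE = 0.102$) quoted in this section.

```python
import math

def wald_test(estimate, se):
    """Return (z, z squared, two-sided P-value) for H0: beta = 0.

    For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    The same number is the upper tail of chi-squared (df = 1) at z**2.
    """
    z = estimate / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, z * z, p_two_sided

z, z_squared, p_value = wald_test(0.497, 0.102)
print("z = %.2f, z^2 = %.2f, two-sided P = %.2g" % (z, z_squared, p_value))
```

A one-sided $P$-value for a positive alternative would be half the two-sided value when $z > 0$.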

Although the Wald test is adequate for large samples, the likelihood-ratio test is more powerful and more reliable for sample sizes often used in practice. The test statistic compares the maximum $L_0$ of the log-likelihood function when $\beta = 0$ to the maximum $L_1$ of the log-likelihood function for unrestricted $\beta$. The test statistic, $-2(L_0 - L_1)$, also has a large-sample chi-squared null distribution with df = 1.
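Given the two maximized log-likelihoods, the statistic and its $P$-value follow directly. The sketch below assumes $L_0$ and $L_1$ have already been obtained from fitted models; the function name `likelihood_ratio_test` is hypothetical, and the numbers in the usage lines are made up purely to exercise the function. They are not the horseshoe crab values.

```python
import math

def likelihood_ratio_test(loglik_null, loglik_full):
    """Return (statistic, P-value) for the likelihood-ratio test with df = 1.

    statistic = -2 * (L0 - L1). For df = 1 the chi-squared upper tail is
    P(chi2_1 > s) = erfc(sqrt(s / 2)).
    """
    stat = -2.0 * (loglik_null - loglik_full)
    # Guard against tiny negative values caused by numerical error.
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value

# Hypothetical log-likelihoods, for illustration only.
stat, p_value = likelihood_ratio_test(loglik_null=-100.0, loglik_full=-98.0)
print("LR statistic = %.2f, P = %.4f" % (stat, p_value))
```

Because the unrestricted model can never fit worse than the restricted one, $L_1 \ge L_0$ and the statistic is nonnegative apart from rounding error.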

For the horseshoe crab data, the Wald statistic is $z = \hat{\beta}/SE = 0.497/0.102 = 4.9$. This shows strong evidence of a positive effect of width on the presence of satellites (P
