13.11.2012 Views

Introduction to Categorical Data Analysis


4.4 MULTIPLE LOGISTIC REGRESSION

This suggests that another potential color scoring for model (4.12) is {1, 1, 1, 0}; that is, c = 0 for dark-colored crabs, and c = 1 otherwise. The likelihood-ratio statistic comparing model (4.12) with these binary scores to model (4.11) with color treated as nominal scale equals 0.5, based on df = 2. So, this simpler model is also adequate (P = 0.78). This model has a color estimate of 1.300 (SE = 0.525). At a given width, the estimated odds that a lighter-colored crab has a satellite are exp(1.300) = 3.7 times the estimated odds for a dark crab.
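The two numbers quoted above can be checked by hand. The following is a minimal sketch that uses only the values stated in the text; the variable names are illustrative, and the closed form exp(−x/2) for the chi-squared tail probability applies only because df = 2.

```python
import math

# Values quoted in the text
beta_color = 1.300   # estimated coefficient of c (1 = lighter, 0 = dark)
lr_stat = 0.5        # likelihood-ratio statistic, binary scores vs. nominal color
df = 2               # difference in number of parameters

# Odds ratio for lighter versus dark crabs at a fixed width
odds_ratio = math.exp(beta_color)

# For df = 2 the chi-squared right-tail probability is exactly exp(-x / 2)
p_value = math.exp(-lr_stat / 2)

print(f"odds ratio = {odds_ratio:.1f}")   # 3.7
print(f"P-value    = {p_value:.2f}")      # 0.78
```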

In summary, the nominal-scale model, the quantitative model with color scores {1, 2, 3, 4}, and the model with binary color scores {1, 1, 1, 0} all suggest that dark crabs are least likely to have satellites. When the sample size is not very large, it is not unusual that several models fit adequately.

It is advantageous to treat ordinal predictors in a quantitative manner, when such models fit well. The model is simpler and easier to interpret, and tests of the effect of the ordinal predictor are generally more powerful when it has a single parameter rather than several parameters.

4.4.4 Allowing Interaction<br />

The models we have considered so far assume a lack of interaction between width and color. Let us check now whether this is sensible. We can allow interaction by adding cross products of terms for width and color. Each color then has a different-shaped curve relating width to the probability of a satellite, so a comparison of two colors varies according to the value of width.

For example, consider the model just discussed that has a dummy variable c = 0 for dark-colored crabs and c = 1 otherwise. The model with an interaction term has the prediction equation

$$\operatorname{logit}[\hat{P}(Y = 1)] = -5.854 - 6.958c + 0.200x + 0.322(c \times x).$$

Let us see what this implies about the prediction equations for each color. For dark crabs, c = 0 and

$$\operatorname{logit}[\hat{P}(Y = 1)] = -5.854 + 0.200x.$$

For lighter crabs, c = 1 and

$$\operatorname{logit}[\hat{P}(Y = 1)] = -12.812 + 0.522x.$$
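To see how the single interaction equation yields a separate line for each color, the sketch below evaluates it at c = 0 and c = 1 and recovers each intercept and slope. The function names and the example width of 26.0 cm are illustrative assumptions, not taken from the text; only the four coefficients come from the prediction equation.

```python
import math

def logit_hat(x, c):
    """Estimated logit from the interaction model quoted in the text."""
    return -5.854 - 6.958 * c + 0.200 * x + 0.322 * (c * x)

def prob_hat(x, c):
    """Estimated probability of a satellite (inverse logit)."""
    return 1.0 / (1.0 + math.exp(-logit_hat(x, c)))

for c, label in [(0, "dark"), (1, "lighter")]:
    intercept = logit_hat(0.0, c)          # value of the line at x = 0
    slope = logit_hat(1.0, c) - intercept  # change in logit per 1 cm of width
    print(f"{label:8s} logit = {intercept:.3f} + {slope:.3f} x")

# Hypothetical width chosen only for illustration
x_example = 26.0
print(prob_hat(x_example, 0), prob_hat(x_example, 1))
```

Setting c = 1 adds −6.958 to the intercept and 0.322 to the slope, which is why the lighter-crab line is −12.812 + 0.522x.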

The curve for lighter crabs has a faster rate of increase. The curves cross at x such that −5.854 + 0.200x = −12.812 + 0.522x, that is, at x = 21.6 cm. The sample widths range between 21.0 and 33.5 cm, so the lighter-colored crabs have a higher estimated probability of a satellite over essentially the entire range.
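The crossing point follows from equating the two lines: the difference between them is −6.958 + 0.322x, which is zero at x = 6.958/0.322. A short sketch of that arithmetic, with a hypothetical helper name, is:

```python
def crossing_width(b_c, b_cx):
    """Width at which the c = 0 and c = 1 logit lines meet.

    The lines differ by b_c + b_cx * x, so they cross where that is zero.
    """
    return -b_c / b_cx

# Coefficients of c and of (c * x) from the prediction equation in the text
x_cross = crossing_width(-6.958, 0.322)

# Sample width range quoted in the text
width_min, width_max = 21.0, 33.5

print(f"curves cross at x = {x_cross:.1f} cm")   # 21.6
print(width_min <= x_cross <= width_max)         # True: just inside the lower end
```

Because the crossing lies only about 0.6 cm above the smallest observed width, the lighter-crab curve is higher for nearly all of the sample.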

We can compare this to the simpler model without interaction to analyze whether the fit is significantly better. The likelihood-ratio statistic comparing the models equals
