
5.1 STRATEGIES IN MODEL SELECTION

Table 5.2. Results of Fitting Several Logistic Regression Models to Horseshoe Crab Data

Model  Predictors         Deviance  df   AIC    Models Compared  Deviance Difference
1      C*S + C*W + S*W    173.7     155  209.7  –                –
2      C + S + W          186.6     166  200.6  (2)–(1)          12.9 (df = 11)
3a     C + S              208.8     167  220.8  (3a)–(2)         22.2 (df = 1)
3b     S + W              194.4     169  202.4  (3b)–(2)         7.8 (df = 3)
3c     C + W              187.5     168  197.5  (3c)–(2)         0.9 (df = 2)
4a     C                  212.1     169  220.1  (4a)–(3c)        24.6 (df = 1)
4b     W                  194.5     171  198.5  (4b)–(3c)        7.0 (df = 3)
5      C = dark + W       188.0     170  194.0  (5)–(3c)         0.5 (df = 2)
6      None               225.8     172  227.8  (6)–(5)          37.8 (df = 2)

Note: C = color, S = spine condition, W = width.

(category 4) and the others. The simpler model that has a single dummy variable for color, equaling 0 for dark crabs and 1 otherwise, fits essentially as well [the deviance difference between models (5) and (3c) equals 0.5, with df = 2]. Further simplification results in large increases in the deviance and is unjustified.
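A deviance difference such as 0.5 between models (5) and (3c) is referred to a chi-squared distribution with df equal to the difference in residual df. A minimal sketch of that check, using only the standard library and the closed-form chi-squared upper-tail probability exp(−x/2), which holds only for df = 2 (the comparisons and values below are taken from Table 5.2; the function name is ours):

```python
import math

def chi2_sf_df2(x):
    """Upper-tail probability P(X > x) for a chi-squared variable with df = 2.
    With df = 2 the chi-squared distribution is exponential with mean 2,
    so the tail probability has the closed form exp(-x/2)."""
    return math.exp(-x / 2)

# Deviance differences with df = 2 from Table 5.2
comparisons = {
    "(3c)-(2)": 0.9,   # dropping spine condition S from C + S + W
    "(5)-(3c)": 0.5,   # collapsing color to a dark/other indicator
    "(6)-(5)":  37.8,  # dropping all predictors
}

for label, diff in comparisons.items():
    print(f"{label}: deviance difference = {diff}, p = {chi2_sf_df2(diff):.3f}")
```

The large p-values for the first two comparisons and the essentially zero p-value for the last agree with the text's conclusion: dropping S and collapsing color are justified, while further simplification is not.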

5.1.5 AIC, Model Selection, and the “Correct” Model

In selecting a model, you should not think that you have found the “correct” one. Any model is a simplification of reality. For example, you should not expect width to have an exactly linear effect on the logit probability of satellites. However, a simple model that fits adequately has the advantages of model parsimony. If a model has relatively little bias, describing reality well, it provides good estimates of outcome probabilities and of odds ratios that describe effects of the predictors.

Other criteria besides significance tests can help select a good model. The best known is the Akaike information criterion (AIC). It judges a model by how close its fitted values tend to be to the true expected values, as summarized by a certain expected distance between the two. The optimal model is the one that tends to have its fitted values closest to the true outcome probabilities. This is the model that minimizes

AIC = −2(log likelihood − number of parameters in model)

We illustrate this criterion using the models that Table 5.2 lists. For the model C + W, having main effects of color and width, software (PROC LOGISTIC in SAS) reports a −2 log likelihood value of 187.5. The model has five parameters: an intercept, a width effect, and three coefficients of dummy variables for color. Thus, AIC = 187.5 + 2(5) = 197.5.
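The arithmetic in this illustration is simple enough to script. A minimal sketch (the helper name `aic` is ours; the −2 log likelihood of 187.5 and the five parameters are the values quoted above for model C + W):

```python
def aic(neg2_loglik, n_params):
    """AIC = -2(log likelihood - number of parameters)
           = (-2 log likelihood) + 2 * (number of parameters)."""
    return neg2_loglik + 2 * n_params

# Model C + W: -2 log likelihood = 187.5, five parameters
# (an intercept, a width effect, three color dummy coefficients)
print(aic(187.5, 5))  # → 197.5
```

Applying the same formula to model (5), with a deviance of 188.0 and three parameters (intercept, width effect, one dark/other indicator), reproduces its AIC of 194.0 from Table 5.2.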

Of models in Table 5.2 using some or all of the three basic predictors, AIC is smallest (AIC = 197.5) for C + W. The simpler model replacing C by an indicator
