13.11.2012 Views

Introduction to Categorical Data Analysis


7.2 INFERENCE FOR LOGLINEAR MODELS

and indicate poorer fits. The models that lack any association term fit poorly, having P-values below 0.001. The model (AC, AM, CM) that permits all pairwise associations but assumes homogeneous association fits well (P = 0.54). Table 7.6 shows the way PROC GENMOD in SAS reports the goodness-of-fit statistics for this model.

7.2.2 Loglinear Cell Residuals

Cell residuals show the quality of fit cell-by-cell. They show where a model fits poorly. Sometimes they indicate that certain cells display lack of fit in an otherwise good-fitting model. When a table has many cells, some residuals may be large purely by chance.

Section 2.4.5 introduced standardized residuals for the independence model, and Section 3.4.5 discussed them generally for GLMs. They divide differences between observed and fitted counts by their standard errors. When the model holds, standardized residuals have approximately a standard normal distribution. Lack of fit is indicated by absolute values larger than about 2 when there are few cells or about 3 when there are many cells.

Table 7.8 shows standardized residuals for the model (AM, CM) of AC conditional independence with Table 7.3. This model has df = 2 for testing fit. The two nonredundant residuals refer to checking AC independence at each level of M. The large residuals reflect the overall poor fit. [In fact, X² relates to the two nonredundant residuals by X² = (3.70)² + (12.80)² = 177.6.] Extremely large residuals occur for students who have not smoked marijuana. For them, the positive residuals occur when A and C are both "yes" or both "no." More of these students have used both or neither of alcohol and cigarettes than one would expect if their usage were conditionally independent. The same pattern persists for students who have smoked marijuana, but the differences between observed and fitted counts are then not as striking.
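The numbers quoted above for model (AM, CM) can be checked with a short sketch. It assumes two things that this page does not spell out: that the fitted counts take the usual closed form for conditional independence (within each level of M, the A total times the C total, divided by the level total), and that the standardized residual within each level of M is the independence-model residual of Section 2.4.5. The function name `fit_am_cm` and the dictionary layout are illustrative choices only.

```python
# Observed counts from Table 7.8, keyed by (A, C, M) =
# (alcohol, cigarettes, marijuana).
observed = {
    ("yes", "yes", "yes"): 911, ("yes", "yes", "no"): 538,
    ("yes", "no", "yes"): 44,   ("yes", "no", "no"): 456,
    ("no", "yes", "yes"): 3,    ("no", "yes", "no"): 43,
    ("no", "no", "yes"): 2,     ("no", "no", "no"): 279,
}
LEVELS = ("yes", "no")


def fit_am_cm(counts):
    """Fitted counts and standardized residuals for model (AM, CM).

    Assumption: A and C are fitted as independent within each level of M,
    so each partial table is handled like a two-way independence model.
    """
    fitted, resid = {}, {}
    for m in LEVELS:
        n_m = sum(counts[a, c, m] for a in LEVELS for c in LEVELS)
        for a in LEVELS:
            n_am = sum(counts[a, c, m] for c in LEVELS)
            for c in LEVELS:
                n_cm = sum(counts[x, c, m] for x in LEVELS)
                mu = n_am * n_cm / n_m
                # Standard error of (observed - fitted) under independence
                # within this level of M.
                se = (mu * (1 - n_am / n_m) * (1 - n_cm / n_m)) ** 0.5
                fitted[a, c, m] = mu
                resid[a, c, m] = (counts[a, c, m] - mu) / se
    return fitted, resid


fitted, resid = fit_am_cm(observed)
pearson_x2 = sum((observed[k] - fitted[k]) ** 2 / fitted[k] for k in observed)

for cell in observed:
    print(cell, observed[cell], round(fitted[cell], 1), round(resid[cell], 2))
print("Pearson X^2:", round(pearson_x2, 1))
```

Under these assumptions the printed fitted counts and residuals should agree with the Model (AM, CM) columns of Table 7.8 up to rounding, and the Pearson statistic should equal the sum of the squared residuals taken one from each level of M.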

Table 7.8 also shows standardized residuals for model (AC, AM, CM). Since df = 1 for this model, only one residual is nonredundant. Both G² and X² are small, so

Table 7.8. Standardized Residuals for Two Loglinear Models

   Drug Use                     Model (AM, CM)            Model (AC, AM, CM)
                  Observed   Fitted   Standardized     Fitted   Standardized
A     C     M     Count      Count    Residual         Count    Residual
Yes   Yes   Yes   911        909.2      3.70           910.4      0.63
Yes   Yes   No    538        438.8     12.80           538.6     −0.63
Yes   No    Yes    44         45.8     −3.70            44.6     −0.63
Yes   No    No    456        555.2    −12.80           455.4      0.63
No    Yes   Yes     3          4.8     −3.70             3.6     −0.63
No    Yes   No     43        142.2    −12.80            42.4      0.63
No    No    Yes     2          0.2      3.70             1.4      0.63
No    No    No    279        179.8     12.80           279.6     −0.63
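The Model (AC, AM, CM) columns of Table 7.8 have no closed-form fitted counts, but they can be approximated numerically. The sketch below uses iterative proportional fitting, which is a stand-in chosen for illustration and not necessarily how the table was produced (the text mentions PROC GENMOD in SAS). The function name `ipf_homogeneous_association` and the number of sweeps are assumptions. Because the model has df = 1 and only one nonredundant residual, the sketch also takes the square root of the Pearson statistic as the common absolute value of the standardized residuals, by analogy with the sum-of-squares relation noted for model (AM, CM).

```python
from itertools import product

# Observed counts from Table 7.8, keyed by (A, C, M).
observed = {
    ("yes", "yes", "yes"): 911, ("yes", "yes", "no"): 538,
    ("yes", "no", "yes"): 44,   ("yes", "no", "no"): 456,
    ("no", "yes", "yes"): 3,    ("no", "yes", "no"): 43,
    ("no", "no", "yes"): 2,     ("no", "no", "no"): 279,
}
LEVELS = ("yes", "no")
CELLS = list(product(LEVELS, repeat=3))


def ipf_homogeneous_association(counts, sweeps=500):
    """Approximate the (AC, AM, CM) fitted counts by iterative
    proportional fitting: repeatedly rescale the fitted table so that
    each two-way margin matches the observed one."""
    fitted = {cell: 1.0 for cell in CELLS}
    # Index pairs into the (A, C, M) key: the AC, AM, and CM margins.
    margins = [(0, 1), (0, 2), (1, 2)]
    for _ in range(sweeps):
        for i, j in margins:
            for u, v in product(LEVELS, repeat=2):
                block = [c for c in CELLS if c[i] == u and c[j] == v]
                target = sum(counts[c] for c in block)
                current = sum(fitted[c] for c in block)
                for c in block:
                    fitted[c] *= target / current
    return fitted


fitted = ipf_homogeneous_association(observed)
pearson_x2 = sum((observed[c] - fitted[c]) ** 2 / fitted[c] for c in CELLS)
abs_residual = pearson_x2 ** 0.5  # assumed common |standardized residual|

for cell in CELLS:
    print(cell, observed[cell], round(fitted[cell], 1))
print("Pearson X^2:", round(pearson_x2, 2))
print("|standardized residual|:", round(abs_residual, 2))
```

If the iteration has converged, the fitted counts should match the table to the one decimal place shown, and the residual magnitude should be close to the 0.63 reported there.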
