13.11.2012 Views

Introduction to Categorical Data Analysis

Introduction to Categorical Data Analysis

Introduction to Categorical Data Analysis

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

192 MULTICATEGORY LOGIT MODELS<br />

Table 6.11. Outcomes for Pregnant Mice in Developmental<br />

Toxicity Study a<br />

Concentration<br />

Response<br />

(mg/kg per day) Non-live Malformation Normal<br />

0 (controls) 15 1 281<br />

62.5 17 0 225<br />

125 22 7 283<br />

250 38 59 202<br />

500 144 132 9<br />

a Based on results in C. J. Price et al., Fund. Appl. Toxicol., 8: 115–126,<br />

1987. I thank Dr. Louise Ryan for showing me these data.<br />

For instance, given that a fetus was live, for every 100-unit increase in concentration<br />

level, the estimated odds that it was malformed rather than normal changes by a<br />

multiplicative fac<strong>to</strong>r of exp(100 × 0.0174) = 5.7.<br />

When models for different continuation-ratio logits have separate parameters, as<br />

in this example, separate fitting of ordinary binary logistic regression models for<br />

different logits gives the same results as simultaneous fitting. The sum of the separate<br />

deviance statistics is an overall goodness-of-fit statistic pertaining <strong>to</strong> the simultaneous<br />

fitting. For Table 6.11, the deviance G 2 values are 5.8 for the first logit and 6.1 for<br />

the second, each based on df = 3. We summarize the fit by their sum, G 2 = 11.8,<br />

based on df = 6 (P = 0.07).<br />

6.3.5 Overdispersion in Clustered <strong>Data</strong><br />

The above analysis treats pregnancy outcomes for different fetuses as independent<br />

observations. In fact, each pregnant mouse had a litter of fetuses, and statistical<br />

dependence may exist among different fetuses from the same litter. The model<br />

also treats fetuses from different litters at a given concentration level as having<br />

the same response probabilities. Heterogeneity of various types among the litters<br />

(for instance, due <strong>to</strong> different physical conditions of different pregnant mice) would<br />

usually cause these probabilities <strong>to</strong> vary somewhat among litters. Either statistical<br />

dependence or heterogeneous probabilities violates the binomial assumption. They<br />

typically cause overdispersion – greater variation than the binomial model implies<br />

(recall Section 3.3.3).<br />

For example, consider mice having litters of size 10 at a fixed, low concentration<br />

level. Suppose the average probability of fetus death is low, but some mice have<br />

genetic defects that cause fetuses in their litter <strong>to</strong> have a high probability of death.<br />

Then, the number of fetuses that die in a litter may vary among pregnant mice <strong>to</strong><br />

a greater degree than if the counts were based on identical probabilities. We might<br />

observe death counts (out of 10 fetuses in a litter) such as 1, 0, 1, 10, 0, 2, 10, 1, 0,<br />

10; this is more variability than we expect with binomial variates.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!