
Introduction to Categorical Data Analysis


BUILDING AND APPLYING LOGISTIC REGRESSION MODELS

With categorical predictors, we can use residuals to compare observed and fitted counts. This should be done with the grouped form of the data. Let yi denote the number of “successes” for ni trials at setting i of the explanatory variables. Let π̂i denote the estimated probability of success for the model fit. Then, the estimated binomial mean ni π̂i is the fitted number of successes.

For a GLM with binomial random component, the Pearson residual (3.9) comparing yi to its fit is

    Pearson residual = ei = (yi − ni π̂i) / √[ni π̂i(1 − π̂i)]

Each Pearson residual divides the difference between an observed count and its fitted value by the estimated binomial standard deviation of the observed count. When ni is large, ei has an approximate normal distribution. When the model holds, {ei} has an approximate expected value of zero but a smaller variance than a standard normal variate.
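As a minimal sketch of this formula, the Pearson residuals for grouped binomial data can be computed directly; the counts and fitted probabilities below are illustrative, not taken from the text:

```python
import numpy as np

# Grouped data: at setting i we observe y_i successes out of n_i trials;
# pi_hat_i is the model's fitted success probability (illustrative values).
y = np.array([28, 53, 93, 126])
n = np.array([50, 80, 120, 140])
pi_hat = np.array([0.50, 0.68, 0.79, 0.88])

# Pearson residual: (observed count - fitted count) / estimated binomial SD
fitted = n * pi_hat
e = (y - fitted) / np.sqrt(n * pi_hat * (1 - pi_hat))
print(np.round(e, 3))
```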

The standardized residual divides (yi − ni π̂i) by its SE,

    standardized residual = (yi − ni π̂i) / SE = (yi − ni π̂i) / √[ni π̂i(1 − π̂i)(1 − hi)]

The term hi in this formula is the observation’s leverage, its element from the diagonal of the so-called hat matrix. (Roughly speaking, the hat matrix is a matrix that, when applied to the sample logits, yields the predicted logit values for the model.) The greater an observation’s leverage, the greater its potential influence on the model fit.

The standardized residual equals ei/√(1 − hi), so it is larger in absolute value than the Pearson residual ei. It is approximately standard normal when the model holds. We prefer it. An absolute value larger than roughly 2 or 3 provides evidence of lack of fit. This serves the same purpose as the standardized residual (2.9) defined in Section 2.4.5 for detecting patterns of dependence in two-way contingency tables. It is a special case of the standardized residual presented in Section 3.4.5 for describing lack of fit in GLMs.
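The leverages hi come from the hat matrix of the weighted fit. The sketch below assumes a logit link, for which the working weights are wi = ni π̂i(1 − π̂i), and uses an illustrative model matrix X and illustrative data:

```python
import numpy as np

# Illustrative model matrix (intercept + one predictor) and grouped data.
X = np.column_stack([np.ones(4), np.array([0.0, 1.0, 2.0, 3.0])])
y = np.array([28, 53, 93, 126])
n = np.array([50, 80, 120, 140])
pi_hat = np.array([0.50, 0.68, 0.79, 0.88])

w = n * pi_hat * (1 - pi_hat)          # working weights for the logit link
e = (y - n * pi_hat) / np.sqrt(w)      # Pearson residuals

# Hat matrix of the weighted fit: H = W^{1/2} X (X'WX)^{-1} X' W^{1/2};
# its diagonal gives the leverages h_i.
Xw = X * np.sqrt(w)[:, None]
H = Xw @ np.linalg.solve(Xw.T @ Xw, Xw.T)
h = np.diag(H)

# Standardized residual: e_i / sqrt(1 - h_i)
std_resid = e / np.sqrt(1 - h)
print(np.round(std_resid, 3))
```

Because H is a projection matrix, the leverages sum to the number of model parameters, a useful sanity check on the computation.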

When fitted values are very small, we have noted that X² and G² do not have approximate null chi-squared distributions. Similarly, residuals have limited meaning in that case. For ungrouped binary data and often when explanatory variables are continuous, each ni = 1. Then, yi can equal only 0 or 1, and a residual can assume only two values and is usually uninformative. Plots of residuals also then have limited use, consisting merely of two parallel lines of dots. The deviance itself is then completely uninformative about model fit. When data can be grouped into sets of observations having common predictor values, it is better to compute residuals for the grouped data than for individual subjects.
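One way to carry out this grouping is to aggregate the subject-level 0/1 outcomes into (successes, trials) counts at each distinct predictor setting; a sketch with hypothetical subject-level data:

```python
import pandas as pd

# Ungrouped binary data: one row per subject (hypothetical values).
df = pd.DataFrame({
    "dose": [0, 0, 0, 1, 1, 1, 1, 2, 2, 2],
    "y":    [0, 1, 0, 1, 1, 0, 1, 1, 1, 1],
})

# Collapse to grouped form: successes y_i and trials n_i at each setting,
# so residuals compare binomial counts rather than individual 0/1 outcomes.
grouped = (df.groupby("dose")["y"]
             .agg(successes="sum", trials="count")
             .reset_index())
print(grouped)
```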
