Introduction to Categorical Data Analysis


When the true odds ratio exceeds 1.0 in partial table k, we expect (n11k − μ11k) > 0. The test statistic combines these differences across all K tables, and we then expect the sum of such differences to be a relatively large positive number. When the odds ratio is less than 1.0 in each table, the sum of such differences tends to be a relatively large negative number. The CMH statistic takes larger values when (n11k − μ11k) is consistently positive or consistently negative for all tables, rather than positive for some and negative for others. The test works best when the XY association is similar in each partial table.
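The statistic itself is referenced below as formula (4.9) but is not displayed in this excerpt; for reference, a standard statement of the CMH statistic in the chapter's notation is:

```latex
% Cochran-Mantel-Haenszel statistic (4.9); df = 1 under H0
\mathrm{CMH} = \frac{\left[\sum_k \left(n_{11k} - \mu_{11k}\right)\right]^2}
                    {\sum_k \operatorname{var}(n_{11k})},
\quad \text{where} \quad
\mu_{11k} = \frac{n_{1+k}\, n_{+1k}}{n_{++k}},
\qquad
\operatorname{var}(n_{11k}) = \frac{n_{1+k}\, n_{2+k}\, n_{+1k}\, n_{+2k}}{n_{++k}^{2}\,(n_{++k} - 1)}
```

Here n1+k and n+1k are the first-row and first-column margins of partial table k, and n++k is its total.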

This test was proposed in 1959, well before logistic regression became popular. The formula for the CMH test statistic seems to have nothing to do with modeling. In fact, though, the CMH test is the score test (Section 1.4.1) of XY conditional independence for model (4.8). Recall that model assumes a common odds ratio for the partial tables (i.e., homogeneous association). The likelihood-ratio, Wald, and CMH (score) tests usually give similar results when the sample size is large.

For Table 4.4 from the AZT and AIDS study, consider H0: conditional independence between immediate AZT use and AIDS symptom development. Section 4.3.2 noted that the likelihood-ratio test statistic is −2(L0 − L1) = 6.9 and the Wald test statistic is (β̂1/SE)² = 6.6, each with df = 1. The CMH statistic (4.9) equals 6.8, also with df = 1, giving similar results (P = 0.01).
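As a concrete check, the CMH statistic can be computed directly from its definition. The sketch below uses cell counts taken from the AZT study as reported in Table 4.4, which is not reproduced in this excerpt, so treat the counts as illustrative:

```python
def cmh_statistic(tables):
    """Cochran-Mantel-Haenszel statistic for a list of 2x2 partial tables.

    Each table is ((n11, n12), (n21, n22)); the statistic has df = 1 under H0.
    """
    diff_sum = 0.0  # sum over k of (n11k - mu11k)
    var_sum = 0.0   # sum over k of var(n11k) under conditional independence
    for (n11, n12), (n21, n22) in tables:
        n = n11 + n12 + n21 + n22
        row1, row2 = n11 + n12, n21 + n22
        col1, col2 = n11 + n21, n12 + n22
        mu11 = row1 * col1 / n                       # expected count under H0
        var11 = row1 * row2 * col1 * col2 / (n * n * (n - 1))
        diff_sum += n11 - mu11
        var_sum += var11
    return diff_sum ** 2 / var_sum

# Rows = immediate AZT use (yes/no), columns = AIDS symptoms (yes/no),
# one partial table per race (counts as reported for Table 4.4):
azt_tables = [((14, 93), (32, 81)),   # white
              ((11, 52), (12, 43))]   # black
cmh = cmh_statistic(azt_tables)
print(round(cmh, 1))  # ≈ 6.8, matching the value quoted in the text
```

The same stratified analysis is available in statistics packages (e.g., statsmodels' StratifiedTable or R's mantelhaen.test); the hand computation above just makes the formula's ingredients visible.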

4.3.5 Testing the Homogeneity of Odds Ratios∗

Model (4.8) and its special case (4.5) when Z is also binary have the homogeneous association property of a common XY odds ratio at each level of Z. Sometimes it is of interest to test the hypothesis of homogeneous association (although it is not necessary to do so to justify using the CMH test). A test of homogeneity of the odds ratios is, equivalently, a test of the goodness of fit of model (4.8). Section 5.2.2 will show how to do this.

Some software reports a test, called the Breslow–Day test, that is a chi-squared test specifically designed to test homogeneity of odds ratios. It has the form of a Pearson chi-squared statistic, comparing the observed cell counts to estimated expected frequencies that have a common odds ratio. This test is an alternative to the goodness-of-fit tests of Section 5.2.2.

4.4 MULTIPLE LOGISTIC REGRESSION

Next we consider the general logistic regression model with multiple explanatory variables. Denote the k predictors for a binary response Y by x1, x2, ..., xk. The model for the log odds is

logit[P(Y = 1)] = α + β1x1 + β2x2 + ··· + βkxk        (4.10)

The parameter βi refers to the effect of xi on the log odds that Y = 1, controlling for the other xs. For example, exp(βi) is the multiplicative effect on the odds of a 1-unit increase in xi, at fixed levels of the other xs.
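A small numeric sketch illustrates why exp(βi) is the multiplicative effect on the odds; the coefficient values here are hypothetical, not taken from the text:

```python
import math

def logit_prob(alpha, betas, xs):
    """P(Y = 1) under model (4.10): logit[P(Y=1)] = alpha + sum_i beta_i * x_i."""
    eta = alpha + sum(b * x for b, x in zip(betas, xs))
    return 1.0 / (1.0 + math.exp(-eta))

def odds(p):
    return p / (1.0 - p)

# Hypothetical coefficients for two predictors
alpha, betas = -2.0, [0.5, -0.3]

p1 = logit_prob(alpha, betas, [1.0, 2.0])
p2 = logit_prob(alpha, betas, [2.0, 2.0])  # x1 increased by 1, x2 held fixed

ratio = odds(p2) / odds(p1)
# The odds ratio equals exp(beta_1) = exp(0.5), whatever value x2 is fixed at,
# because the odds are exp(eta) and eta changes by exactly beta_1.
print(ratio, math.exp(0.5))
```

Repeating the comparison at a different fixed x2 gives the same ratio, which is the "at fixed levels of the other xs" statement in the text.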
