13.11.2012 Views

Introduction to Categorical Data Analysis


LOGISTIC REGRESSION

4.3.4 The Cochran–Mantel–Haenszel Test for 2 × 2 × K Contingency Tables*

In many examples with two categorical predictors, X identifies two groups to compare and Z is a control variable. For example, in a clinical trial X might refer to two treatments and Z might refer to several centers that recruited patients for the study. Problem 4.20 shows such an example. The data then can be presented in several 2 × 2 tables.

With K categories for Z, model (4.7) refers to a 2 × 2 × K contingency table. That model can then be expressed as

$$\operatorname{logit}[P(Y = 1)] = \alpha + \beta x + \beta_k^Z \tag{4.8}$$

where x is an indicator variable for the two categories of X. Then, exp(β) is the common XY odds ratio for each of the K partial tables for categories of Z. This is the homogeneous association structure for multiple 2 × 2 tables, introduced in Section 2.7.6.
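A short worked step may make the claim about exp(β) concrete. The sketch below assumes the coding x = 1 for the first category of X and x = 0 for the second, and writes $\theta_{XY(k)}$ (a symbol introduced here for illustration) for the XY odds ratio in partial table k:

```latex
\begin{align*}
\log \theta_{XY(k)}
  &= \operatorname{logit}[P(Y = 1 \mid x = 1, Z = k)]
   - \operatorname{logit}[P(Y = 1 \mid x = 0, Z = k)] \\
  &= (\alpha + \beta + \beta_k^Z) - (\alpha + \beta_k^Z) \\
  &= \beta .
\end{align*}
```

The stratum term $\beta_k^Z$ cancels, so $\theta_{XY(k)} = \exp(\beta)$ takes the same value in every partial table.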

In this model, conditional independence between X and Y, controlling for Z, corresponds to β = 0. When β = 0, the XY odds ratio equals 1 for each partial table. Given that model (4.8) holds, one can test conditional independence by the Wald test or the likelihood-ratio test of $H_0\colon \beta = 0$.

The Cochran–Mantel–Haenszel test is an alternative test of XY conditional independence in 2 × 2 × K contingency tables. This test conditions on the row totals and the column totals in each partial table. Then, as in Fisher’s exact test, the count in the first row and first column in a partial table determines all the other counts in that table. Under the usual sampling schemes (e.g., binomial for each row in each partial table), the conditioning results in a hypergeometric distribution (Section 2.6.1) for the count $n_{11k}$ in the cell in row 1 and column 1 of partial table k. The test statistic utilizes this cell in each partial table.

In partial table k, the row totals are $\{n_{1+k}, n_{2+k}\}$, and the column totals are $\{n_{+1k}, n_{+2k}\}$. Given these totals, under $H_0$,

$$\mu_{11k} = E(n_{11k}) = \frac{n_{1+k}\, n_{+1k}}{n_{++k}}$$

$$\operatorname{Var}(n_{11k}) = \frac{n_{1+k}\, n_{2+k}\, n_{+1k}\, n_{+2k}}{n_{++k}^{2}\,(n_{++k} - 1)}$$
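The null mean and variance can be checked numerically against the hypergeometric distribution they come from. The following is a minimal sketch, not part of the original text: the function names and the margins (10, 15, 8, 17) are hypothetical, chosen only for illustration. It enumerates the hypergeometric probabilities for one partial table with exact fractions and compares the enumerated moments with the closed-form expressions.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(n11, row1, row2, col1):
    """P(n11) for one 2 x 2 table with fixed margins, under H0."""
    n = row1 + row2
    return Fraction(comb(row1, n11) * comb(row2, col1 - n11), comb(n, col1))


def null_moments(row1, row2, col1, col2):
    """Closed-form null mean and variance of n11 given the margins."""
    n = row1 + row2
    mean = Fraction(row1 * col1, n)
    var = Fraction(row1 * row2 * col1 * col2, n**2 * (n - 1))
    return mean, var


# Hypothetical margins for a single partial table (illustrative only).
row1, row2, col1, col2 = 10, 15, 8, 17

# Values of n11 that are compatible with the fixed margins.
lo = max(0, col1 - row2)
hi = min(row1, col1)
pmf = {t: hypergeom_pmf(t, row1, row2, col1) for t in range(lo, hi + 1)}

# Moments obtained by direct enumeration of the distribution.
enum_mean = sum(t * p for t, p in pmf.items())
enum_var = sum((t - enum_mean) ** 2 * p for t, p in pmf.items())

mean, var = null_moments(row1, row2, col1, col2)
print("enumerated:", enum_mean, enum_var)
print("closed form:", mean, var)
```

Exact fractions are used so that the two pairs of values can be compared for equality rather than to within rounding error.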

The Cochran–Mantel–Haenszel (CMH) test statistic summarizes the information from the K partial tables using

$$\mathrm{CMH} = \frac{\left[\sum_k (n_{11k} - \mu_{11k})\right]^{2}}{\sum_k \operatorname{Var}(n_{11k})} \tag{4.9}$$

This statistic has a large-sample chi-squared null distribution with df = 1. The approximation improves as the total sample size increases, regardless of whether the number of strata K is small or large.
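As an illustration of how the CMH statistic and its chi-squared P-value might be computed, here is a minimal standard-library sketch. It is not taken from the text: the function names and the two partial tables are hypothetical, and skipping strata with fewer than two observations is an assumption made here because the variance formula is undefined for them. No continuity correction is applied, matching formula (4.9); some software applies one by default, so results may differ slightly.

```python
from math import erfc, sqrt


def cmh_statistic(tables):
    """CMH statistic for a list of 2 x 2 tables ((n11, n12), (n21, n22))."""
    diff_sum = 0.0  # sum over k of (n11k - mu11k)
    var_sum = 0.0   # sum over k of Var(n11k)
    for (n11, n12), (n21, n22) in tables:
        row1, row2 = n11 + n12, n21 + n22
        col1, col2 = n11 + n21, n12 + n22
        n = row1 + row2
        if n < 2:
            # Assumption: such a stratum carries no information; skip it.
            continue
        mu = row1 * col1 / n
        var = row1 * row2 * col1 * col2 / (n**2 * (n - 1))
        diff_sum += n11 - mu
        var_sum += var
    if var_sum == 0:
        raise ValueError("all strata have zero null variance")
    return diff_sum**2 / var_sum


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-squared variable with df = 1."""
    return erfc(sqrt(x / 2.0))


# Hypothetical data: K = 2 partial tables (illustrative counts only).
tables = [
    ((10, 10), (5, 15)),
    ((8, 12), (4, 16)),
]

stat = cmh_statistic(tables)
p_value = chi2_sf_df1(stat)
print(f"CMH = {stat:.4f}, P-value = {p_value:.4f}")
```

The P-value uses the identity that a chi-squared variable with df = 1 is the square of a standard normal variable, so its upper tail equals erfc(sqrt(x/2)).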
