13.11.2012 Views

Introduction to Categorical Data Analysis


8.3 COMPARING MARGINS OF SQUARE CONTINGENCY TABLES

Let $\pi_{ij} = P(Y_1 = i, Y_2 = j)$. Marginal homogeneity is

$$P(Y_1 = i) = P(Y_2 = i) \quad \text{for } i = 1, \ldots, I,$$

that is, each row marginal probability equals the corresponding column marginal probability.

8.3.1 Marginal Homogeneity and Nominal Classifications

One way to test $H_0$: marginal homogeneity compares ML fitted values $\{\hat{\mu}_{ij}\}$ that satisfy marginal homogeneity to $\{n_{ij}\}$ using $G^2$ or $X^2$ statistics. The $df = I - 1$. The ML fit of marginal homogeneity is obtained iteratively.
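As a rough illustration of the comparison described above, the sketch below computes the standard likelihood-ratio and Pearson statistics from a table of observed counts and a table of fitted values. It assumes the fitted values have already been obtained by some iterative routine (not shown here); the function name `goodness_of_fit_statistics` is hypothetical and not taken from the text.

```python
import math


def goodness_of_fit_statistics(observed, fitted):
    """Return (G2, X2) comparing observed counts to fitted values.

    G2 = 2 * sum n_ij * log(n_ij / mu_ij)   (likelihood-ratio statistic)
    X2 = sum (n_ij - mu_ij)^2 / mu_ij       (Pearson statistic)
    Both arguments are lists of rows with matching shapes.
    """
    g2 = 0.0
    x2 = 0.0
    for obs_row, fit_row in zip(observed, fitted):
        for n_ij, mu_ij in zip(obs_row, fit_row):
            if mu_ij <= 0:
                raise ValueError("fitted values must be positive")
            # A cell with zero observed count contributes nothing to G2.
            if n_ij > 0:
                g2 += 2.0 * n_ij * math.log(n_ij / mu_ij)
            x2 += (n_ij - mu_ij) ** 2 / mu_ij
    return g2, x2
```

Under the null hypothesis, either statistic would be referred to a chi-squared distribution with I − 1 degrees of freedom, as stated in the text.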

Another way generalizes the McNemar test. It tests $H_0$: marginal homogeneity by exploiting the large-sample normality of marginal proportions. Let $d_i = p_{i+} - p_{+i}$ compare the marginal proportions in column $i$ and row $i$. Let $d$ be a vector of the first $I - 1$ differences. It is redundant to include $d_I$, since $\sum_i d_i = 0$. Under $H_0$, $E(d) = 0$ and the estimated covariance matrix of $d$ is $\hat{V}_0/n$, where $\hat{V}_0$ has elements

$$\hat{v}_{ij0} = -(p_{ij} + p_{ji}) \quad \text{for } i \neq j$$

$$\hat{v}_{ii0} = p_{i+} + p_{+i} - 2p_{ii}$$

Now, $d$ has a large-sample multivariate normal distribution. The quadratic form

$$W_0 = n\, d' \hat{V}_0^{-1} d \tag{8.6}$$

is a score test statistic. It is asymptotically chi-squared with $df = I - 1$. For $I = 2$, $W_0$ simplifies to the McNemar statistic, the square of equation (8.1).
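The score statistic W0 can be computed directly from a square table of counts. The following is a minimal sketch in plain Python, assuming the table is supplied as a list of rows; the function names and the small Gaussian-elimination helper are illustrative choices, not the book's software, and the 3×3 counts in the usage lines are made up for demonstration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    # Work on an augmented copy so the inputs are left unchanged.
    m = [list(a[r]) + [b[r]] for r in range(size)]
    for c in range(size):
        pivot = max(range(c, size), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < 1e-12:
            raise ValueError("estimated covariance matrix is singular")
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, size):
            factor = m[r][c] / m[c][c]
            for k in range(c, size + 1):
                m[r][k] -= factor * m[c][k]
    # Back substitution.
    x = [0.0] * size
    for r in range(size - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, size))
        x[r] = (m[r][size] - s) / m[r][r]
    return x


def marginal_homogeneity_score_test(counts):
    """Return (W0, df) for a square table of counts n_ij."""
    size = len(counts)
    if any(len(row) != size for row in counts):
        raise ValueError("table must be square")
    n = float(sum(sum(row) for row in counts))
    p = [[counts[i][j] / n for j in range(size)] for i in range(size)]
    row_marg = [sum(p[i]) for i in range(size)]                              # p_{i+}
    col_marg = [sum(p[i][j] for i in range(size)) for j in range(size)]      # p_{+j}
    k = size - 1  # the last difference is redundant, so it is dropped
    d = [row_marg[i] - col_marg[i] for i in range(k)]
    v0 = [
        [
            row_marg[i] + col_marg[i] - 2.0 * p[i][i] if i == j
            else -(p[i][j] + p[j][i])
            for j in range(k)
        ]
        for i in range(k)
    ]
    v0_inv_d = solve_linear_system(v0, d)
    w0 = n * sum(d[i] * v0_inv_d[i] for i in range(k))
    return w0, k


# Hypothetical 3x3 table, for illustration only (not data from the text).
example_counts = [[20, 5, 3], [8, 30, 4], [2, 6, 22]]
w0, df = marginal_homogeneity_score_test(example_counts)
print(f"W0 = {w0:.3f}, df = {df}")
```

For a 2×2 table, d has a single element p12 − p21 and the single variance term is p12 + p21, so the function returns (n12 − n21)² / (n12 + n21), which is the McNemar statistic mentioned in the text.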

8.3.2 Example: Coffee Brand Market Share

A survey recorded the brand choice for a sample of buyers of instant coffee. At a later coffee purchase by these subjects, the brand choice was again recorded. Table 8.5 shows results for five brands of decaffeinated coffee. The cell counts on the “main diagonal” (the cells for which the row variable outcome is the same as the column variable outcome) are relatively large. This indicates that most buyers did not change their brand choice.

The table also shows the ML fitted values that satisfy marginal homogeneity. Comparing these to the observed cell counts gives $G^2 = 12.6$ and $X^2 = 12.4$ ($df = 4$). The P-values are less than 0.015 for testing $H_0$: marginal homogeneity. (Table A.11 in the Appendix shows how software can obtain the ML fit and test statistics.) The statistic (8.6) using differences in sample marginal proportions gives similar results, equaling 12.3 with $df = 4$.
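The reported P-values can be checked without statistical software, because the chi-squared upper-tail probability has a closed form when df is even: exp(−x/2) times the sum of (x/2)^k / k! for k = 0, ..., df/2 − 1. The sketch below applies that formula; the function name is hypothetical, and the routine deliberately handles only even df.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability P(chi-squared_df > x), valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    if x < 0:
        raise ValueError("x must be non-negative")
    half = x / 2.0
    term = 1.0   # (x/2)^k / k! for k = 0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Statistics reported in the text, each with df = 4.
for label, value in [("G2", 12.6), ("X2", 12.4), ("W0", 12.3)]:
    print(label, value, chi2_upper_tail_even_df(value, 4))
```

With df = 4 the sum has only two terms, so the tail probability is exp(−x/2)(1 + x/2); for 12.6 and 12.4 this falls below 0.015, in line with the P-values stated in the text.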
