13.11.2012 Views

Introduction to Categorical Data Analysis

Introduction to Categorical Data Analysis

Introduction to Categorical Data Analysis

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

6.4 TESTS OF CONDITIONAL INDEPENDENCE 195<br />

model-based statistics without interaction terms, these statistics perform well when<br />

the conditional association is similar in each partial table. There are three versions,<br />

according <strong>to</strong> whether both, one, or neither of Y and the predic<strong>to</strong>r are treated as<br />

ordinal.<br />

When both variables are ordinal, the test statistic generalizes the correlation statistic<br />

(2.10) for two-way tables. It is designed <strong>to</strong> detect a linear trend in the association that<br />

has the same direction in each partial table. The generalized correlation statistic<br />

has approximately a chi-squared distribution with df = 1. Its formula is complex<br />

and we omit computational details. It is available in standard software (e.g., PROC<br />

FREQ in SAS).<br />

For Table 6.12 with the scores {3, 10, 20, 35} for income and {1, 3, 4, 5} for satisfaction,<br />

the sample correlation between income and job satisfaction equals 0.16 for<br />

females and 0.37 for males. The generalized correlation statistic equals 6.2 with<br />

df = 1(P = 0.01). This gives the same conclusion as the ordinal-model-based<br />

likelihood-ratio test of the previous subsection.<br />

When in doubt about scoring, perform a sensitivity analysis by using a few different<br />

choices that seem sensible. Unless the categories exhibit severe imbalance in their<br />

<strong>to</strong>tals, the choice of scores usually has little impact on the results. With the row<br />

and column numbers as the scores, the sample correlation equals 0.17 for females<br />

and 0.38 for males, and the generalized correlation statistic equals 6.6 with df = 1<br />

(P = 0.01), giving the same conclusion.<br />

6.4.3 Detecting Nominal–Ordinal Conditional Association ∗<br />

When the predic<strong>to</strong>r is nominal and Y is ordinal, scores are relevant only for levels of<br />

Y . We summarize the responses of subjects within a given row by the mean of their<br />

scores on Y , and then average this row-wise mean information across the K strata.<br />

The test of conditional independence compares the I rows using a statistic based on<br />

the variation among the I averaged row mean responses. This statistic is designed <strong>to</strong><br />

detect differences among their true mean values. It has a large-sample chi-squared<br />

distribution with df = (I − 1).<br />

For Table 6.12, this test treats job satisfaction as ordinal and income as nominal.<br />

The test searches for differences among the four income levels in their mean job<br />

satisfaction. Using scores {1, 2, 3, 4}, the mean job satisfaction at the four levels<br />

of income equal (2.82, 2.84, 3.29, 3.00) for females and (2.60, 2.78, 3.30, 3.31) for<br />

males. For instance, the mean for the 17 females with income

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!