Introduction to Categorical Data Analysis


LOGLINEAR MODELS FOR CONTINGENCY TABLES

Model (7.2) has a single constant parameter ($\lambda$), $(I - 1)$ nonredundant $\lambda_i^X$ parameters, $(J - 1)$ nonredundant $\lambda_j^Y$ parameters, and $(I - 1)(J - 1)$ nonredundant $\lambda_{ij}^{XY}$ parameters. The total number of parameters equals $1 + (I - 1) + (J - 1) + (I - 1)(J - 1) = IJ$. The model has as many parameters as observed cell counts. It is the saturated loglinear model, having the maximum possible number of parameters. Because of this, it is the most general model for two-way tables. It describes perfectly any set of expected frequencies. It gives a perfect fit to the data. The estimated odds ratios just reported are the same as the sample odds ratios. In practice, unsaturated models are preferred, because their fit smooths the sample data and has simpler interpretations.
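
The parameter count and the perfect fit can be checked numerically. The sketch below is only an illustration: the 2 x 2 counts are hypothetical, the function names are invented here, and the baseline coding (every parameter at the first level set to zero) is an assumption, since the text does not fix a coding scheme. It assumes all counts are strictly positive.

```python
import math

def count_nonredundant(n_rows, n_cols):
    """Number of nonredundant parameters in the saturated two-way model."""
    return 1 + (n_rows - 1) + (n_cols - 1) + (n_rows - 1) * (n_cols - 1)

def saturated_parameters(counts):
    """Saturated-model parameters under baseline coding (first level = 0).

    Assumes every cell count is strictly positive.
    """
    n_rows, n_cols = len(counts), len(counts[0])
    log_n = [[math.log(c) for c in row] for row in counts]
    lam = log_n[0][0]
    lam_x = [log_n[i][0] - lam for i in range(n_rows)]
    lam_y = [log_n[0][j] - lam for j in range(n_cols)]
    lam_xy = [[log_n[i][j] - lam - lam_x[i] - lam_y[j]
               for j in range(n_cols)] for i in range(n_rows)]
    return lam, lam_x, lam_y, lam_xy

counts = [[10, 20], [30, 40]]  # hypothetical cell counts
lam, lam_x, lam_y, lam_xy = saturated_parameters(counts)

# Fitted values rebuilt from the parameters reproduce the counts exactly.
fitted = [[math.exp(lam + lam_x[i] + lam_y[j] + lam_xy[i][j])
           for j in range(2)] for i in range(2)]

# Under this coding the single interaction parameter is the log odds ratio.
sample_log_odds_ratio = math.log(10 * 40 / (20 * 30))
print(count_nonredundant(2, 2), fitted, lam_xy[1][1], sample_log_odds_ratio)
```

With $I = J = 2$ the count is 4, one parameter per cell, which is why the fitted values match the data and the model odds ratio matches the sample odds ratio.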

When a model has two-factor terms, be cautious in interpreting the single-factor terms. By analogy with two-way ANOVA, when there is two-factor interaction, it can be misleading to report main effects. The estimates of the main effect terms depend on the coding scheme used for the higher-order effects, and the interpretation also depends on that scheme. Normally, we restrict our attention to the highest-order terms for a variable.
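
To see the dependence on coding concretely, the sketch below computes the saturated two-way parameters for one hypothetical 2 x 2 table under two common constraint sets: baseline coding (first-level parameters set to zero) and sum-to-zero coding. Both coding choices, the counts, and the function names are assumptions made for illustration. The single-factor estimate changes with the coding, while the log odds ratio built from the two-factor terms does not.

```python
import math

def baseline_coding(counts):
    """Return (lam_x, lam_xy) with first-level parameters fixed at zero."""
    log_n = [[math.log(c) for c in row] for row in counts]
    lam = log_n[0][0]
    lam_x = [log_n[i][0] - lam for i in range(2)]
    lam_y = [log_n[0][j] - lam for j in range(2)]
    lam_xy = [[log_n[i][j] - lam - lam_x[i] - lam_y[j]
               for j in range(2)] for i in range(2)]
    return lam_x, lam_xy

def sum_to_zero_coding(counts):
    """Return (lam_x, lam_xy) with parameters summing to zero over each index."""
    log_n = [[math.log(c) for c in row] for row in counts]
    grand = sum(sum(row) for row in log_n) / 4
    row_mean = [sum(log_n[i]) / 2 for i in range(2)]
    col_mean = [(log_n[0][j] + log_n[1][j]) / 2 for j in range(2)]
    lam_x = [row_mean[i] - grand for i in range(2)]
    lam_xy = [[log_n[i][j] - row_mean[i] - col_mean[j] + grand
               for j in range(2)] for i in range(2)]
    return lam_x, lam_xy

def log_odds_ratio(lam_xy):
    """Log odds ratio of a 2 x 2 table from its two-factor parameters."""
    return lam_xy[0][0] + lam_xy[1][1] - lam_xy[0][1] - lam_xy[1][0]

counts = [[10, 20], [30, 40]]  # hypothetical, strictly positive counts
base_x, base_xy = baseline_coding(counts)
zero_x, zero_xy = sum_to_zero_coding(counts)

print("single-factor term, level 2:", base_x[1], "vs", zero_x[1])
print("log odds ratio:", log_odds_ratio(base_xy), "vs", log_odds_ratio(zero_xy))
```

The two single-factor estimates differ, yet both codings give the same log odds ratio, which is the quantity attached to the highest-order term.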

7.1.4 Loglinear Models for Three-Way Tables

With three-way contingency tables, loglinear models can represent various independence and association patterns. Two-factor association terms describe the conditional odds ratios between variables.

For cell expected frequencies $\{\mu_{ijk}\}$, consider the loglinear model

$$\log \mu_{ijk} = \lambda + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ} \qquad (7.4)$$

Since it contains an XZ term ($\lambda_{ik}^{XZ}$), it permits association between X and Z, controlling for Y. This model also permits a YZ association, controlling for X. It does not contain an XY association term. This loglinear model specifies conditional independence between X and Y, controlling for Z.

We symbolize this model by (XZ, YZ). The symbol lists the highest-order terms in the model for each variable. This model is an important one. It holds, for instance, if an association between two variables (X and Y) disappears when we control for a third variable (Z).
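
As an illustration of what model (XZ, YZ) implies, the sketch below computes fitted values for a hypothetical 2 x 2 x 2 table. It relies on the standard closed form for the maximum likelihood fit of this model, $\hat{\mu}_{ijk} = n_{i+k}\, n_{+jk} / n_{++k}$, which is not stated in the passage above and is brought in here as an assumption. The counts and function names are invented for the example.

```python
def fit_conditional_independence(counts):
    """Fitted values for model (XZ, YZ); counts is indexed as counts[i][j][k]."""
    I, J, K = len(counts), len(counts[0]), len(counts[0][0])
    # Two-way margins that the model keeps: XZ and YZ.
    n_xz = [[sum(counts[i][j][k] for j in range(J)) for k in range(K)]
            for i in range(I)]
    n_yz = [[sum(counts[i][j][k] for i in range(I)) for k in range(K)]
            for j in range(J)]
    n_z = [sum(n_xz[i][k] for i in range(I)) for k in range(K)]
    return [[[n_xz[i][k] * n_yz[j][k] / n_z[k] for k in range(K)]
             for j in range(J)] for i in range(I)]

def conditional_xy_odds_ratio(table, k):
    """XY odds ratio within level k of Z, for a 2 x 2 x K table."""
    return (table[0][0][k] * table[1][1][k]) / (table[0][1][k] * table[1][0][k])

# Hypothetical counts, indexed [x][y][z].
counts = [[[10, 5], [20, 15]],
          [[30, 25], [40, 35]]]
fitted = fit_conditional_independence(counts)

for k in range(2):
    print("Z level", k,
          "observed XY odds ratio:", conditional_xy_odds_ratio(counts, k),
          "fitted XY odds ratio:", conditional_xy_odds_ratio(fitted, k))
```

The fitted XY odds ratio equals 1 at every level of Z, which is the conditional independence the model specifies, while the XZ and YZ margins of the fitted table still match the data.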

Models that delete additional association terms are too simple to fit most data sets well. For instance, the model that contains only single-factor terms, denoted by (X, Y, Z), is called the mutual independence model. It treats each pair of variables as independent, both conditionally and marginally. When variables are chosen wisely for a study, this model is rarely appropriate.
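
A small sketch can make the mutual independence model concrete. It uses the standard closed form for its fitted values, $\hat{\mu}_{ijk} = n_{i++}\, n_{+j+}\, n_{++k} / n^2$, which is assumed here rather than taken from the passage. The 2 x 2 x 2 counts and the function names are hypothetical.

```python
def fit_mutual_independence(counts):
    """Fitted values for model (X, Y, Z); counts is indexed as counts[i][j][k]."""
    I, J, K = len(counts), len(counts[0]), len(counts[0][0])
    n_x = [sum(counts[i][j][k] for j in range(J) for k in range(K))
           for i in range(I)]
    n_y = [sum(counts[i][j][k] for i in range(I) for k in range(K))
           for j in range(J)]
    n_z = [sum(counts[i][j][k] for i in range(I) for j in range(J))
           for k in range(K)]
    n = sum(n_x)
    return [[[n_x[i] * n_y[j] * n_z[k] / n ** 2 for k in range(K)]
             for j in range(J)] for i in range(I)]

def conditional_xy_odds_ratio(table, k):
    """XY odds ratio within level k of Z, for a 2 x 2 x K table."""
    return (table[0][0][k] * table[1][1][k]) / (table[0][1][k] * table[1][0][k])

def marginal_xy_odds_ratio(table):
    """XY odds ratio after summing a 2 x 2 x K table over Z."""
    m = [[sum(table[i][j]) for j in range(2)] for i in range(2)]
    return (m[0][0] * m[1][1]) / (m[0][1] * m[1][0])

# Hypothetical counts, indexed [x][y][z].
counts = [[[10, 5], [20, 15]],
          [[30, 25], [40, 35]]]
fitted = fit_mutual_independence(counts)

print("conditional:", [conditional_xy_odds_ratio(fitted, k) for k in range(2)])
print("marginal:", marginal_xy_odds_ratio(fitted))
```

Both the conditional and the marginal XY odds ratios of the fitted table equal 1, and the same holds for the other two pairs of variables by symmetry of the formula.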

A model that permits all three pairs of variables to have conditional associations is

$$\log \mu_{ijk} = \lambda + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \lambda_{ij}^{XY} + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ} \qquad (7.5)$$
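
Model (7.5) has no closed-form fitted values, so it needs an iterative fit. The passage does not say which algorithm is used; the sketch below assumes iterative proportional fitting, a standard method for loglinear models, with a fixed number of cycles instead of a convergence test. The counts and function names are hypothetical.

```python
def fit_all_two_factor_terms(counts, n_cycles=200):
    """Iterative proportional fitting for the model with XY, XZ, and YZ terms.

    counts is indexed as counts[i][j][k]; all two-way margins must be positive.
    """
    I, J, K = len(counts), len(counts[0]), len(counts[0][0])
    fitted = [[[1.0] * K for _ in range(J)] for _ in range(I)]
    for _ in range(n_cycles):
        # Rescale so the fitted XY margin matches the observed one.
        for i in range(I):
            for j in range(J):
                scale = sum(counts[i][j]) / sum(fitted[i][j])
                for k in range(K):
                    fitted[i][j][k] *= scale
        # Rescale to match the XZ margin.
        for i in range(I):
            for k in range(K):
                scale = (sum(counts[i][j][k] for j in range(J))
                         / sum(fitted[i][j][k] for j in range(J)))
                for j in range(J):
                    fitted[i][j][k] *= scale
        # Rescale to match the YZ margin.
        for j in range(J):
            for k in range(K):
                scale = (sum(counts[i][j][k] for i in range(I))
                         / sum(fitted[i][j][k] for i in range(I)))
                for i in range(I):
                    fitted[i][j][k] *= scale
    return fitted

def conditional_xy_odds_ratio(table, k):
    """XY odds ratio within level k of Z, for a 2 x 2 x K table."""
    return (table[0][0][k] * table[1][1][k]) / (table[0][1][k] * table[1][0][k])

# Hypothetical counts, indexed [x][y][z].
counts = [[[10, 5], [20, 15]],
          [[30, 25], [40, 35]]]
fitted = fit_all_two_factor_terms(counts)

print("observed:", [conditional_xy_odds_ratio(counts, k) for k in range(2)])
print("fitted:", [conditional_xy_odds_ratio(fitted, k) for k in range(2)])
```

Because the model has no three-factor term, the fitted XY odds ratio is the same at each level of Z, although it need not equal 1. The observed conditional odds ratios in the hypothetical table differ between the two levels.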
