13.11.2012 Views

Introduction to Categorical Data Analysis


LOGLINEAR MODELS FOR CONTINGENCY TABLES

Suppose Y is binary. We treat it as a response and X and Z as explanatory. When X is at level i and Z is at level k,

\[
\begin{aligned}
\text{logit}[P(Y = 1)] &= \log\left[\frac{P(Y = 1)}{1 - P(Y = 1)}\right] = \log\left[\frac{P(Y = 1 \mid X = i, Z = k)}{P(Y = 2 \mid X = i, Z = k)}\right] \\
&= \log\left(\frac{\mu_{i1k}}{\mu_{i2k}}\right) = \log(\mu_{i1k}) - \log(\mu_{i2k}) \\
&= (\lambda + \lambda^X_i + \lambda^Y_1 + \lambda^Z_k + \lambda^{XY}_{i1} + \lambda^{XZ}_{ik} + \lambda^{YZ}_{1k}) \\
&\quad - (\lambda + \lambda^X_i + \lambda^Y_2 + \lambda^Z_k + \lambda^{XY}_{i2} + \lambda^{XZ}_{ik} + \lambda^{YZ}_{2k}) \\
&= (\lambda^Y_1 - \lambda^Y_2) + (\lambda^{XY}_{i1} - \lambda^{XY}_{i2}) + (\lambda^{YZ}_{1k} - \lambda^{YZ}_{2k})
\end{aligned}
\]

The first parenthetical term does not depend on i or k. The second parenthetical term depends on the level i of X. The third parenthetical term depends on the level k of Z. The logit has the additive form

\[
\text{logit}[P(Y = 1)] = \alpha + \beta^X_i + \beta^Z_k \tag{7.7}
\]

Section 4.3.3 discussed this model, in which the logit depends on X and Z in an additive manner. Additivity on the logit scale is the standard definition of “no interaction” for categorical variables. When Y is binary, the loglinear model of homogeneous association is equivalent to this logistic regression model. When X is also binary, model (7.7) and loglinear model (XY, XZ, YZ) are characterized by equal odds ratios between X and Y at each of the K levels of Z.
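The equivalence can be checked numerically. The following is a minimal sketch, not taken from the text: it fills a homogeneous association model (XY, XZ, YZ) with arbitrary made-up λ values (no identifiability constraints are imposed, since the identity holds without them), builds the expected counts μ, and compares the logit computed from μ with the additive form (7.7). The function names, table dimensions, and 0-based level indices are assumptions chosen for illustration.

```python
import math
import random


def loglinear_mu(n_x, n_z, seed=0):
    """Expected counts mu[i][j][k] under loglinear model (XY, XZ, YZ).

    All lambda values are arbitrary random numbers (illustration only).
    Y has two levels: index 0 stands for Y = 1, index 1 for Y = 2.
    """
    rng = random.Random(seed)

    def draw():
        return rng.uniform(-1.0, 1.0)

    lam = draw()
    lam_x = [draw() for _ in range(n_x)]
    lam_y = [draw() for _ in range(2)]
    lam_z = [draw() for _ in range(n_z)]
    lam_xy = [[draw() for _ in range(2)] for _ in range(n_x)]
    lam_xz = [[draw() for _ in range(n_z)] for _ in range(n_x)]
    lam_yz = [[draw() for _ in range(n_z)] for _ in range(2)]

    mu = [[[math.exp(lam + lam_x[i] + lam_y[j] + lam_z[k]
                     + lam_xy[i][j] + lam_xz[i][k] + lam_yz[j][k])
            for k in range(n_z)]
           for j in range(2)]
          for i in range(n_x)]
    return mu, {"Y": lam_y, "XY": lam_xy, "YZ": lam_yz}


def logit_from_mu(mu, i, k):
    """logit[P(Y = 1)] at X = i, Z = k, computed directly from the means."""
    return math.log(mu[i][0][k]) - math.log(mu[i][1][k])


def logit_additive(params, i, k):
    """alpha + beta^X_i + beta^Z_k, with each term a lambda difference."""
    alpha = params["Y"][0] - params["Y"][1]
    beta_x = params["XY"][i][0] - params["XY"][i][1]
    beta_z = params["YZ"][0][k] - params["YZ"][1][k]
    return alpha + beta_x + beta_z


n_x, n_z = 3, 4
mu, params = loglinear_mu(n_x, n_z)

# Largest discrepancy between the two ways of computing the logit.
max_gap = max(abs(logit_from_mu(mu, i, k) - logit_additive(params, i, k))
              for i in range(n_x) for k in range(n_z))

# Log XY odds ratio (first two X levels) at each level k of Z.
log_or = [logit_from_mu(mu, 0, k) - logit_from_mu(mu, 1, k)
          for k in range(n_z)]

print("max |logit from mu - additive logit|:", max_gap)
print("log XY odds ratio at each level of Z:", log_or)
```

Up to floating-point rounding, the discrepancy should be zero and the log odds ratios should agree across the levels of Z, which is the homogeneous association property described above.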

7.3.2 Example: Auto Accident Data Revisited

For the data on Maine auto accidents (Table 7.9), Section 7.2.6 showed that loglinear model (GLS, GI, LI, IS) fits well. That model is

\[
\begin{aligned}
\log \mu_{gi\ell s} &= \lambda + \lambda^G_g + \lambda^I_i + \lambda^L_\ell + \lambda^S_s + \lambda^{GI}_{gi} + \lambda^{GL}_{g\ell} + \lambda^{GS}_{gs} \\
&\quad + \lambda^{IL}_{i\ell} + \lambda^{IS}_{is} + \lambda^{LS}_{\ell s} + \lambda^{GLS}_{g\ell s}
\end{aligned}
\tag{7.8}
\]

We could treat injury (I) as a response variable, and gender (G), location (L), and seat-belt use (S) as explanatory variables. You can check that this loglinear model implies a logistic model of the form

\[
\text{logit}[P(I = 1)] = \alpha + \beta^G_g + \beta^L_\ell + \beta^S_s \tag{7.9}
\]

Here, G, L, and S all affect I, but without interacting. The parameters in the two models are related by

\[
\beta^G_g = \lambda^{GI}_{g1} - \lambda^{GI}_{g2}, \qquad
\beta^L_\ell = \lambda^{IL}_{1\ell} - \lambda^{IL}_{2\ell}, \qquad
\beta^S_s = \lambda^{IS}_{1s} - \lambda^{IS}_{2s}
\]
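The check suggested in the text follows the same steps as the three-way case in Section 7.3.1. As a sketch, with the two injury levels indexed 1 and 2, take the difference of the log means in (7.8) at fixed g, ℓ, and s. Every term that does not involve I is identical in the two log means and cancels:

```latex
\[
\begin{aligned}
\text{logit}[P(I = 1)] &= \log \mu_{g1\ell s} - \log \mu_{g2\ell s} \\
&= (\lambda^I_1 - \lambda^I_2)
 + (\lambda^{GI}_{g1} - \lambda^{GI}_{g2})
 + (\lambda^{IL}_{1\ell} - \lambda^{IL}_{2\ell})
 + (\lambda^{IS}_{1s} - \lambda^{IS}_{2s})
\end{aligned}
\]
```

Matching terms with (7.9) gives the intercept as the difference of the two I main-effect terms, namely λ^I_1 − λ^I_2, together with the three relations stated above. The terms involving only G, L, and S, including the three-factor GLS term, drop out of the logit entirely, which is why they place no restriction on the logistic model.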
