13.11.2012 Views

Introduction to Categorical Data Analysis

Introduction to Categorical Data Analysis

Introduction to Categorical Data Analysis

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

288 MODELING CORRELATED, CLUSTERED RESPONSES<br />

Analyses when many data are missing should be made with caution. At a minimum,<br />

one should compare results of the analysis using all available cases for all clusters<br />

<strong>to</strong> the analysis using only clusters having no missing observations. If results differ<br />

substantially, conclusions should be tentative until the reasons for missingness can<br />

be studied.<br />

9.4 TRANSITIONAL MODELING, GIVEN THE PAST<br />

Let Yt denote the response at time t, t = 1, 2,...,in a longitudinal study. Some studies<br />

focus on the dependence of Yt on the previously observed responses {y1,y2,...,yt−1}<br />

as well as any explana<strong>to</strong>ry variables. Models that include past observations as<br />

predic<strong>to</strong>rs are called transitional models.<br />

A Markov chain is a transitional model for which, for all t, the conditional distribution<br />

of Yt,givenY1,...,Yt−1, is assumed identical <strong>to</strong> the conditional distribution of Yt<br />

given Yt−1 alone. That is, given Yt−1, Yt is conditionally independent of Y1,...,Yt−2.<br />

Knowing the most recent observation, information about previous observations before<br />

that one does not help with predicting the next observation. A Markov chain model<br />

is adequate for modeling Yt if the model with yt−1 as the only past observation used<br />

as a predic<strong>to</strong>r fits as well, for practical purposes, as a model with {y1,y2,...,yt−1}<br />

as predic<strong>to</strong>rs.<br />

9.4.1 Transitional Models with Explana<strong>to</strong>ry Variables<br />

Transitional models usually also include explana<strong>to</strong>ry variables other than past observations.<br />

With binary y and k such explana<strong>to</strong>ry variables, one might specify a logistic<br />

regression model for each t,<br />

logit[P(Yt = 1)] =α + βyt−1 + β1x1 +···+βkxk<br />

This Markov chain model is called a regressive logistic model. Given the predic<strong>to</strong>r<br />

values, the model treats repeated observations by a subject as independent. Thus, one<br />

can fit the model with ordinary GLM software, treating each observation separately.<br />

This model generalizes so a predic<strong>to</strong>r xj can take a different value for each t.<br />

For example, in a longitudinal medical study, a subject’s values for predic<strong>to</strong>rs such<br />

as blood pressure could change over time. A higher-order Markov model could also<br />

include in the predic<strong>to</strong>r set yt−2 and possibly other previous observations.<br />

9.4.2 Example: Respira<strong>to</strong>ry Illness and Maternal Smoking<br />

Table 9.8 is from the Harvard study of air pollution and health. At ages 7–10 children<br />

were evaluated annually on whether they had a respira<strong>to</strong>ry illness. Explana<strong>to</strong>ry variables<br />

are the age of the child t (t = 7, 8, 9, 10) and maternal smoking at the start of<br />

the study (s = 1 for smoking regularly, s = 0 otherwise).

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!