13.11.2012 Views

Introduction to Categorical Data Analysis

Introduction to Categorical Data Analysis

Introduction to Categorical Data Analysis

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

340 APPENDIX A: SOFTWARE FOR CATEGORICAL DATA ANALYSIS<br />

the δi parameters in equation (8.12). It takes a separate level for each cell on the<br />

main diagonal, and a common value for all other cells. The bot<strong>to</strong>m of Table A.12<br />

fits logit models for the data entered in the form of pairs of cell counts (nij ,nji).<br />

These six sets of binomial count are labeled as “above” and “below” with reference<br />

<strong>to</strong> the main diagonal. The variable defined as “score” is the distance<br />

(uj − ui) = j − i. The first model is symmetry and the second is ordinal quasi<br />

symmetry. Neither model contains an intercept (NOINT). The quasi-symmetry<br />

model can be fitted using the approach shown next for the equivalent Bradley–Terry<br />

model.<br />

TableA.13 uses GENMOD for logit fitting of the Bradley–Terry model <strong>to</strong> Table 8.9<br />

by forming an artificial explana<strong>to</strong>ry variable for each player. For a given observation,<br />

the variable for player i is 1 if he wins, −1 if he loses, and 0 if he is<br />

not one of the players for that match. Each observation lists the number of wins<br />

(“wins”) for the player with variate-level equal <strong>to</strong> 1 out of the number of matches<br />

(“n”) against the player with variate-level equal <strong>to</strong> −1. The model has these artificial<br />

variates, one of which is redundant, as explana<strong>to</strong>ry variables with no intercept<br />

term. The COVB option provides the estimated covariance matrix of parameter<br />

estima<strong>to</strong>rs.<br />

Table A.13. SAS Code for Fitting Bradley–Terry Model <strong>to</strong> Tennis <strong>Data</strong> in Table 8.9<br />

data tennnis;<br />

input win n agassi federer henman hewitt roddick ;<br />

datalines;<br />

0 6 1 -1 0 0 0<br />

0 0 1 0 -1 0 0<br />

...<br />

3 5 0 0 0 1 -1<br />

;<br />

proc genmod;<br />

model win/n = agassi federer henman hewitt roddick / dist=bin link=logit noint covb;<br />

CHAPTER 9: MODELING CORRELATED, CLUSTERED RESPONSES<br />

TableA.14 uses GENMOD <strong>to</strong> analyze the depression data in Table 9.1 using GEE. The<br />

REPEATED statement specifies the variable name (here, “case”) that identifies the<br />

subjects for each cluster. Possible working correlation structures are TYPE=EXCH<br />

for exchangeable, TYPE=AR for au<strong>to</strong>regressive, TYPE=INDEP for independence,<br />

and TYPE=UNSTR for unstructured. Output shows estimates and standard errors<br />

under the naive working correlation and incorporating the empirical dependence.<br />

Alternatively, the working association structure in the binary case can use the log<br />

odds ratio (e.g., using LOGOR=EXCH for exchangeability). The type3 option with<br />

the GEE approach provides score-type tests about effects. See S<strong>to</strong>kes et al. (2000,<br />

Section 15.11) for the use of GEE with missing data. See Table A.22 in Agresti (2002)<br />

for using GENMOD <strong>to</strong> implement GEE for a cumulative logit model for the insomnia<br />

data in Table 9.6.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!