
Brian S. Everitt A Handbook of Statistical Analyses using SPSS


Competing models in a logistic regression can be formally compared by a likelihood ratio (LR) test, a score test, or Wald’s test. Details of these tests are given in Hosmer and Lemeshow (2000).

The three tests are asymptotically equivalent but differ in finite samples. The likelihood ratio test is generally considered the most reliable and the Wald test the least (see Therneau and Grambsch, 2000, for reasons), although in many practical applications all three tests lead to the same conclusion.
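As a rough illustration of the likelihood ratio comparison described above, the following Python sketch computes the LR statistic for two nested models from their maximized log-likelihoods and refers it to a chi-square distribution. The function names (`chi2_sf`, `likelihood_ratio_test`) and the log-likelihood values in the usage lines are hypothetical and are not taken from the text; this is a sketch of the general idea, not the procedure SPSS uses internally.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for the two smallest cases (df = 1 and df = 2).
    if df % 2 == 1:
        sf, k = math.erfc(math.sqrt(half)), 1
    else:
        sf, k = math.exp(-half), 2
    # Step up two degrees of freedom at a time:
    # SF_{k+2}(x) = SF_k(x) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        sf += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return sf


def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Compare nested models; df is the number of extra parameters in the full model."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(statistic, df)


# Hypothetical log-likelihoods, chosen only to show the call.
stat, p_value = likelihood_ratio_test(loglik_reduced=-100.0, loglik_full=-97.0, df=2)
print(f"LR statistic = {stat:.3f}, p = {p_value:.4f}")
```

A small p-value here would suggest that the additional terms in the larger model improve the fit beyond what chance alone would be expected to produce.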

This is a convenient point to mention that logistic regression and the other modeling procedures used in earlier chapters, analysis of variance and multiple regression, can all be shown to be special cases of the generalized linear model formulation described in detail in McCullagh and Nelder (1989). This approach postulates a linear model for a suitable transformation of the expected value of a response variable and allows for a variety of different error distributions. The possible transformations are known as link functions. For multiple regression and analysis of variance, for example, the link function is simply the identity function, so the expected value is modeled directly, and the corresponding error distribution is normal. For logistic regression, the link is the logit function (the inverse of the logistic function) and the appropriate error distribution is the binomial. Many other possibilities are opened up by the generalized linear model formulation; see McCullagh and Nelder (1989) for full details and Everitt (2002b) for a less technical account.
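The role of the link function can be sketched in a few lines of Python. The helper names and the coefficient values below are hypothetical and serve only to show how a linear predictor is mapped back to an expected value under the identity link (multiple regression, analysis of variance) and under the logit link (logistic regression).

```python
import math


def linear_predictor(coefficients, x):
    """Intercept plus weighted sum of the explanatory variables."""
    intercept, slopes = coefficients[0], coefficients[1:]
    return intercept + sum(b * xi for b, xi in zip(slopes, x))


def identity_link(mu):
    """Identity link: the expected value is modeled directly."""
    return mu


def logit_link(p):
    """Logit link: log odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(eta):
    """Logistic function: maps a linear predictor back to a probability."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    # Algebraically identical form that avoids overflow for large negative eta.
    e = math.exp(eta)
    return e / (1.0 + e)


# Hypothetical coefficients (intercept, one slope) and one covariate value.
eta = linear_predictor([0.5, -1.0], [0.5])
print("linear predictor:", eta)
print("expected value under identity link:", identity_link(eta))
print("probability under logit link:", inverse_logit(eta))
```

With the identity link the linear predictor is itself the fitted mean, whereas with the logit link it is a log odds that must be transformed before it can be read as a probability.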

9.3 Analysis Using SPSS

Our analyses of the Titanic data in Table 9.1 will focus on establishing relationships between the binary passenger outcome survival (measured by the variable survived with “1” indicating survival and “0” death) and five passenger characteristics that might have affected the chances of survival, namely:

Passenger class (variable pclass, with “1” indicating a first class ticket holder, “2” second class, and “3” third class)

Passenger age (age recorded in years)

Passenger gender (sex, with females coded “1” and males coded “2”)

Number of accompanying parents/children (parch)

Number of accompanying siblings/spouses (sibsp)
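The coding scheme listed above can be summarized as a small lookup table. The Python sketch below is an illustration only: the dictionary `VALUE_LABELS`, the function `label_record`, and the example record are hypothetical (the record is made up and is not a row of Table 9.1); only the variable names and codes come from the text.

```python
# Codes for the categorical variables as described in the text.
VALUE_LABELS = {
    "survived": {0: "died", 1: "survived"},
    "pclass": {1: "first class", 2: "second class", 3: "third class"},
    "sex": {1: "female", 2: "male"},
}


def label_record(record):
    """Replace numeric codes by their labels; unlabeled variables pass through unchanged."""
    return {
        name: VALUE_LABELS.get(name, {}).get(value, value)
        for name, value in record.items()
    }


# Made-up passenger record, used purely to show the call.
example = {"survived": 1, "pclass": 3, "age": 30.0, "sex": 1, "parch": 0, "sibsp": 1}
print(label_record(example))
```

Age, parch, and sibsp are numeric variables without value labels, so they are returned as recorded.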

Our investigation of the determinants of passenger survival will proceed in three steps. First, we assess (unadjusted) relationships between survival

© 2004 by Chapman & Hall/CRC Press LLC
