01.06.2013 Views

Statistical Methods in Medical Research 4ed


Modelling categorical data

In this chapter we consider some of the properties of generalized linear models and illustrate their application with examples. Readers interested in a more detailed exposition are referred to the books by McCullagh and Nelder (1989) and Dobson (1990). Cox and Snell (1989), Hosmer and Lemeshow (1989) and Kleinbaum (1994) give details of logistic regression.

General theory

Consider a random variable $y$ distributed according to the probability density $f(y; \mu)$, where $\mu$ is the expected value of $y$; that is,

$$E(y) = \mu. \tag{14.1}$$

Suppose that for each observation $y$ there is a set of explanatory variables $x_1, x_2, \ldots, x_p$, and that $\mu$ depends on the values of these variables. Suppose further that, after some transformation of $\mu$, $g(\mu)$, the relationship is linear; that is,

$$\eta = g(\mu) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_p x_p. \tag{14.2}$$

Then the relationship between $y$ and the explanatory variables is a generalized linear model. The transformation, $g(\mu)$, is termed the link function, since it provides the link between the linear part of the model, $\eta$, and the random part represented by $\mu$. The linear function, $\eta$, is termed the linear predictor and the distribution of $y$, $f(y; \mu)$, is the error distribution.
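The three components named above (linear predictor, link function and mean) can be sketched in a few lines of Python. This is only an illustration: the function names, the coefficient values and the choice of example links (identity, log, logit) are assumptions made for this sketch and are not taken from the text.

```python
import math

def linear_predictor(beta, x):
    """eta = beta_0 + beta_1*x_1 + ... + beta_p*x_p (beta[0] is the intercept)."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))

# Each link is stored as a pair (g, g_inverse).
# g maps the mean mu to the linear scale eta; g_inverse maps eta back to mu.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
    "logit": (lambda mu: math.log(mu / (1.0 - mu)),
              lambda eta: 1.0 / (1.0 + math.exp(-eta))),
}

def expected_value(beta, x, link="identity"):
    """mu = g^{-1}(eta): the mean implied by the linear predictor under a link."""
    _, g_inverse = LINKS[link]
    return g_inverse(linear_predictor(beta, x))

# Hypothetical coefficients and explanatory values, chosen only for illustration.
beta = [0.5, 1.0, -2.0]
x = [2.0, 0.25]

eta = linear_predictor(beta, x)            # 0.5 + 1.0*2.0 - 2.0*0.25
mu_identity = expected_value(beta, x, "identity")
mu_log = expected_value(beta, x, "log")
mu_logit = expected_value(beta, x, "logit")
```

The same linear predictor gives a different mean under each link, which is the sense in which the link connects the linear part of the model to the mean of the error distribution.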

When the error distribution is normal, the classic linear regression model is

$$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_p x_p.$$

Thus from (14.1) and (14.2), $\eta = \mu$, and the link function is the identity, $g(\mu) = \mu$. The familiar multiple linear regression is therefore a member of the family of generalized linear models.
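As a small illustration of the identity-link case, the sketch below fits a straight line by ordinary least squares, which coincides with maximum likelihood when the errors are normal with constant variance. The data and the helper name `fit_simple_regression` are hypothetical, and only one explanatory variable is used to keep the arithmetic visible.

```python
def fit_simple_regression(xs, ys):
    """Least-squares estimates (b0, b1) for E(y) = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data lying exactly on the line y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

b0, b1 = fit_simple_regression(xs, ys)

# With the identity link, g(mu) = mu, so the fitted mean is the linear predictor itself.
eta = [b0 + b1 * x for x in xs]
mu = list(eta)
```

No back-transformation is needed here: the linear predictor and the fitted mean are the same quantity.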

When the basic observations are proportions, then the basic form of random variation might be expected to be binomial. Such data give rise to many of the difficulties described in §10.8. Regression curves are unlikely to be linear, because the scale of the proportion is limited by the values 0 and 1 and changes in any relevant explanatory variable at the extreme ends of its scale are unlikely to produce much change in the proportion. A sigmoid regression curve as in Fig. 14.1(a) is, in fact, likely to be found. An appropriate transformation may convert this to a linear relationship. In the context of the reasons for transformations discussed in §10.8, this is a linearizing transformation. The variance of an observed proportion depends on the expected proportion, as well as on the denominator of the fraction (see (3.16)). Finally, the binomial distribution of random error is likely to be skew in opposite directions as the proportion approaches 0 or 1.
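The following sketch illustrates two of these points numerically. It assumes the logit as the linearizing transformation, which is one common choice and is not named in the passage above, and it uses the standard binomial result that an observed proportion based on $n$ observations has variance $\pi(1 - \pi)/n$, taken here to be the formula that (3.16) refers to. The coefficients and the grid of $x$ values are hypothetical.

```python
import math

def expit(eta):
    """Inverse logit: maps the linear scale to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def logit(p):
    """Logit transformation of a proportion p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))

def proportion_variance(pi, n):
    """Binomial variance of an observed proportion with denominator n."""
    return pi * (1.0 - pi) / n

# Hypothetical sigmoid curve: expected proportion = expit(b0 + b1*x).
b0, b1 = -3.0, 1.0
xs = [0, 1, 2, 3, 4, 5, 6]
props = [expit(b0 + b1 * x) for x in xs]

# On the proportion scale, equal steps in x give unequal changes:
# small near 0 and 1, largest around 0.5.
steps_prop = [props[i + 1] - props[i] for i in range(len(props) - 1)]

# On the logit scale, the same steps in x give a constant change (b1).
logits = [logit(p) for p in props]
steps_logit = [logits[i + 1] - logits[i] for i in range(len(logits) - 1)]

# The variance depends on both the expected proportion and the denominator.
var_mid = proportion_variance(0.5, 100)
var_extreme = proportion_variance(0.1, 100)
var_mid_larger_n = proportion_variance(0.5, 400)
```

The unequal entries of `steps_prop` reflect the sigmoid shape, the constant entries of `steps_logit` show the linearization, and the three variances show the dependence on both the expected proportion and the denominator.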
