01.06.2013 Views

Statistical Methods in Medical Research 4ed


polynomials can be used for generalized linear models (§4.1) and, indeed, wherever a linear predictor is used to summarize the information from covariates, such as a Cox regression (§17.8).

Polynomials, even when they are extended to include fractional polynomials, have limitations as a source of smooth curves for the description and interpretation of data. Their fitting is global rather than local, so changes in coefficient values to accommodate the response in one part of the data can have consequences (often unwanted) in other parts. Moreover, their explanatory power is low: applications where there is background theory to guide the choice of curve will rarely point to the use of polynomials. Methods for smoothing that avoid many of the problems associated with polynomials are discussed in the next section and more general aspects are considered in §12.4.
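The global nature of polynomial fitting can be illustrated numerically. The following is a minimal sketch, not taken from the text: the simulated data, the cubic degree, the size of the perturbation and the helper name `polynomial_fit_values` are all assumptions chosen for illustration. Only responses with x above 0.9 are altered, yet the fitted curve also moves for x below 0.5.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def polynomial_fit_values(x, y, degree):
    """Fit a polynomial of the given degree by least squares and
    return the fitted values at the observed x (hypothetical helper)."""
    coefs = P.polyfit(x, y, degree)  # coefficients, lowest order first
    return P.polyval(x, coefs)


# Simulated data (an assumption for illustration only)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=x.size)

# Alter the response only at the right-hand end of the range
y_shifted = y.copy()
y_shifted[x > 0.9] += 1.0

fit_before = polynomial_fit_values(x, y, degree=3)
fit_after = polynomial_fit_values(x, y_shifted, degree=3)

# The data with x < 0.5 are unchanged, but the fitted curve there is not
left = x < 0.5
max_change_left = np.max(np.abs(fit_after[left] - fit_before[left]))
print("largest change in fitted values for x < 0.5:", max_change_left)
```

A local smoother would leave the fit in the left half of the range essentially unaffected by such a change, which is the motivation for the methods of the next section.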

12.2 Smoothing and non-parametric regression

It is often useful to think of data as having been generated by a process that could be represented by the equation:

$$\text{Data} = \text{Signal} + \text{Noise}.$$

The statistician's task is often to prise apart these two components of the data: quantification of the 'Signal' may be of primary interest but measurement of the amount of noise is usually a necessary adjunct in an analysis. Indeed, the usual multiple regression model (11.1) can be thought of in these terms, with 'Signal' associated with $x_1\beta_1 + x_2\beta_2 + \dots + x_p\beta_p$ and 'Noise' with the random term $\varepsilon$. However, simple linear regression techniques, even with the extensions described in §12.1, are not always the most suitable method. Even if an approach based on §12.1 is ultimately adopted, it is useful to be able to employ a number of approaches to an analysis, to find the most appropriate method and to assess the robustness of any inferences being made.
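The signal-plus-noise reading of multiple regression can be made concrete with a small simulation. This is a sketch under stated assumptions rather than anything prescribed by the text: the coefficient values, the noise standard deviation, the sample size and the omission of an intercept term are all hypothetical choices. Least squares splits each observation into a fitted value (the estimated signal) and a residual (the estimated noise), and the residuals give an estimate of the noise standard deviation.

```python
import numpy as np

# Hypothetical simulation settings (not from the text)
rng = np.random.default_rng(1)
n, p = 200, 2
beta = np.array([1.5, -0.7])  # assumed "true" coefficients
sigma = 0.5                   # assumed noise standard deviation

X = rng.normal(size=(n, p))               # covariates x_1, ..., x_p
signal = X @ beta                         # linear predictor
noise = rng.normal(scale=sigma, size=n)   # random term
y = signal + noise                        # Data = Signal + Noise

# Ordinary least squares (no intercept, to match the linear predictor above)
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

signal_hat = X @ beta_hat     # estimated signal (fitted values)
residuals = y - signal_hat    # estimated noise

# Residual standard deviation on n - p degrees of freedom
sigma_hat = np.sqrt(residuals @ residuals / (n - p))

print("estimated coefficients:", beta_hat)
print("estimated noise SD:", sigma_hat)
```

The decomposition is exact by construction, since the fitted values and residuals always sum to the data; what varies between methods is how the signal part is modelled.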

There has been considerable progress in recent years in the theoretical and practical development of smoothing techniques and much fuller treatments can be found in, for example, Hastie and Tibshirani (1990), Green and Silverman (1994) and Bowman and Azzalini (1997).

Moving averages, lines and kernel smoothing


Suppose that data of the form $(x_i, y_i)$, $i = 1, \dots, n$ have been observed and it is desired to produce some estimate of the signal, which might be thought of in this case to be the relationship between $y$ and $x$. Suppose the signal at the $i$th data point is $S_i$ and the noise is $\varepsilon_i$, so

$$y_i = S_i + \varepsilon_i,$$
