01.06.2013 Views

Statistical Methods in Medical Research 4ed


392 Further regression models

$$SS = \sum_{i=1}^{n} [y_i - s(x_i)]^2 \tag{12.9}$$

over functions defined by a few parameters, such as $s(x) = a + bx$, or a higher polynomial as in §12.1. An alternative approach to smoothing starts from attempting to minimize (12.9) for a general smooth function $s(x)$ (i.e. a curve with a continuous slope). If the $x_i$ are distinct, it is clear that there will be many smooth functions that pass through all the points, thereby giving $SS = 0$. The polynomial of degree $n - 1$ fitted by regression is an example of a function that interpolates the data in this way. If the $x_i$ are not all distinct, then $SS$ will be minimized by any of the curves that interpolate the means of the $y_i$ for each distinct $x_i$. However, it is clear that such curves, while smooth, will generally have to oscillate rapidly in order to pass through each point. Such curves are of limited interest to the statistician, not only because such oscillatory responses are intrinsically implausible as a measure of a signal in the data but also because any measure of error in the data, which would naturally be estimated by a quantity proportional to the minimized value of (12.9), is necessarily zero (or, for non-distinct abscissae, a value unaffected by the fitted curve).
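The contrast between a few-parameter fit and an interpolating curve can be illustrated numerically. The following is a minimal sketch, not taken from the text: the eight data values are hypothetical, the helper names `residual_ss` and `slope_sign_changes` are invented for illustration, and NumPy's `Polynomial.fit` is assumed as the regression routine.

```python
import numpy as np
from numpy.polynomial import Polynomial


def residual_ss(y, fitted):
    """Residual sum of squares, SS = sum_i [y_i - s(x_i)]^2, as in (12.9)."""
    return float(np.sum((y - fitted) ** 2))


def slope_sign_changes(poly, lo, hi, num=2001):
    """Count how often the slope of `poly` changes sign on a grid over [lo, hi]."""
    grid = np.linspace(lo, hi, num)
    signs = np.sign(poly.deriv()(grid))
    return int(np.sum(signs[1:] != signs[:-1]))


# Hypothetical data: n = 8 distinct abscissae, an upward trend with zigzag noise.
x = np.linspace(0.0, 1.0, 8)
y = np.array([1.1, 1.0, 1.6, 1.4, 2.3, 2.0, 2.9, 2.6])

# Straight line s(x) = a + bx: two adjustable parameters.
line = Polynomial.fit(x, y, deg=1)
# Polynomial of degree n - 1: n parameters, enough to pass through all n points.
interp = Polynomial.fit(x, y, deg=x.size - 1)

ss_line = residual_ss(y, line(x))
ss_interp = residual_ss(y, interp(x))
changes_line = slope_sign_changes(line, 0.0, 1.0)
changes_interp = slope_sign_changes(interp, 0.0, 1.0)

print(f"line:        SS = {ss_line:.4f}, slope sign changes = {changes_line}")
print(f"interpolant: SS = {ss_interp:.2e}, slope sign changes = {changes_interp}")
```

For these data the interpolating polynomial drives SS to zero (up to rounding error) only by reversing its slope repeatedly between neighbouring points, whereas the straight line leaves a positive SS and never changes direction.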

As noted above, curves which interpolate the data will generally have to oscillate rapidly: in other words, as $x$ increases, $s(x)$ will have to change rapidly from sloping upwards to sloping downwards and back again many times. The rate at which the slope of a curve changes is the rate of change of the rate of change of the curve. The rate of change of the curve is the derivative of $s(x)$, so the rate of change of the rate of change is the derivative of this, or the second derivative of $s(x)$, which is denoted by $s''(x)$. Consequently, curves that oscillate rapidly will have large numerical values for $s''(x)$, whether positive or negative. If this is true for a large portion of the curve, $\int [s''(u)]^2 \, du$ will also be large; functions for which this is true are often described as rough. It may help to note that in this area 'smoothness' is often used in two subtly different ways and care is sometimes needed. Smoothness can refer to the slope of $s(x)$ changing smoothly, that is, without any abrupt changes or discontinuities: this is a common mathematical use of the term. On the other hand, it can refer to functions that change smoothly in the sense of not oscillating rapidly: here smooth is being used in its colloquial sense of the opposite of rough, i.e. to describe a function with low $\int [s''(u)]^2 \, du$. Functions that are smooth in the mathematical sense can be very rough, and the sense in which smooth is being used often needs to be judged from the context.
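The distinction between the two senses of smoothness can be made concrete with a small numerical sketch. The example is an illustration chosen here, not one from the text: it takes $s(x) = \sin(kx)$ on $[0, 2\pi]$, which is smooth in the mathematical sense for every $k$, and approximates the roughness $\int [s''(u)]^2 \, du$ with a hand-written trapezoidal rule. The function names are hypothetical.

```python
import numpy as np


def roughness(second_derivative, a, b, num=20001):
    """Approximate the integral of [s''(u)]^2 over [a, b] by the trapezoidal rule."""
    u = np.linspace(a, b, num)
    v = second_derivative(u) ** 2
    h = u[1] - u[0]
    return float(h * (v.sum() - 0.5 * (v[0] + v[-1])))


def sine_second_derivative(k):
    """Second derivative of s(x) = sin(kx), i.e. s''(x) = -k^2 sin(kx)."""
    return lambda u: -(k ** 2) * np.sin(k * u)


# Both curves are infinitely differentiable; only the oscillation rate differs.
rough_slow = roughness(sine_second_derivative(1), 0.0, 2.0 * np.pi)
rough_fast = roughness(sine_second_derivative(10), 0.0, 2.0 * np.pi)

print(f"k = 1:  roughness = {rough_slow:.4f}")
print(f"k = 10: roughness = {rough_fast:.1f}")
print(f"ratio = {rough_fast / rough_slow:.1f}")
```

Analytically the roughness of $\sin(kx)$ over $[0, 2\pi]$ is $k^4 \pi$, so a tenfold increase in the oscillation rate multiplies the roughness by $10^4$ even though the curve remains smooth in the mathematical sense.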

An approach to smoothing that is intermediate between methods based on minimizing (12.9), subject to $s(x)$ being determined by a few adjustable parameters, and the locally based methods described above, is to combine the measures of fit to the data, $SS$, and of roughness. Suppose a function $s(x)$ is chosen which minimizes
