01.06.2013 Views

Statistical Methods in Medical Research 4ed


This discussion is rather more mathematical than many other parts of the book, although it is hoped that appreciation of the more detailed parts of the argument is not necessary to gain a general understanding of this topic.

It is most convenient for the following discussion to assume that the data have been written in the succinct form of (12.39), the longitudinal aspect of the data being represented by the block diagonal nature of the dispersion matrix of the residual term. The true dispersion matrix, which is, of course, unknown, is denoted by $V_T$. If the data are analysed by ordinary least squares, that is, the longitudinal aspect is ignored, then the estimate of the parameters of interest, $\beta$, is

$$\hat{\beta}_O = (X^T X)^{-1} X^T y, \tag{12.48}$$

where the subscript O indicates that the estimator uses ordinary least squares.

Despite the implicit misspecification of the dispersion matrix, this estimator is unbiased and its variance is

$$(X^T X)^{-1} X^T V_T X (X^T X)^{-1}. \tag{12.49}$$
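The estimator (12.48) and the variance formula (12.49) can be checked numerically. The following is a minimal sketch in Python with NumPy, not the book's own computation: the design matrix (intercept plus time, 4 subjects measured on 3 occasions), the exchangeable form chosen for the true dispersion matrix and all function names are assumptions made purely for illustration. When the blocks are uncorrelated with common variance, (12.49) coincides with the usual least squares variance; with within-subject correlation it does not.

```python
import numpy as np

def ols_estimate(X, y):
    """Ordinary least squares estimate, as in (12.48)."""
    # Solve the normal equations rather than forming an explicit inverse.
    return np.linalg.solve(X.T @ X, X.T @ y)

def ols_true_variance(X, V_T):
    """Variance of the OLS estimator when var(y) = V_T, as in (12.49)."""
    bread = np.linalg.inv(X.T @ X)
    return bread @ X.T @ V_T @ X @ bread

def exchangeable_block_diag(n_subjects, n_times, sigma2, rho):
    """Hypothetical V_T: one identical exchangeable block per subject."""
    block = sigma2 * ((1 - rho) * np.eye(n_times) + rho * np.ones((n_times, n_times)))
    return np.kron(np.eye(n_subjects), block)

# Illustrative design (an assumption): intercept plus time, rows ordered by subject.
n_subjects, n_times = 4, 3
time = np.tile(np.arange(n_times, dtype=float), n_subjects)
X = np.column_stack([np.ones_like(time), time])

sigma2 = 2.0
V_indep = exchangeable_block_diag(n_subjects, n_times, sigma2, rho=0.0)
V_corr = exchangeable_block_diag(n_subjects, n_times, sigma2, rho=0.6)

naive = sigma2 * np.linalg.inv(X.T @ X)   # valid only if V_T = sigma^2 I
var_indep = ols_true_variance(X, V_indep)  # agrees with `naive`
var_corr = ols_true_variance(X, V_corr)    # differs from `naive`

print("naive variances:      ", np.diag(naive))
print("variances, correlated:", np.diag(var_corr))
```

The sketch uses `np.linalg.solve` for the estimate itself and explicit inverses only where the matrix product in (12.49) requires them.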

If ordinary least squares were valid and $V_T = \sigma^2 I$, then (12.49) would reduce to the familiar formula (11.51). However, in the general case, application of (11.51) would be incorrect, but (12.49) could be used directly if a suitable estimator of $V_T$ were available. Provided that the mean, $X\beta$, is correctly specified, then a suitable estimator of the $i$th block within $V_T$ is simply:

$$(y_i - X_i \hat{\beta}_O)(y_i - X_i \hat{\beta}_O)^T, \tag{12.50}$$

and, if these $N$ estimators are collected together in $\hat{V}_T$ and this is used in place of $V_T$ in (12.49), then a valid estimator of the variance of $\hat{\beta}_O$ is obtained. The estimator is valid because, regardless of the true dispersion matrix, the variance of $y$ is correctly estimated by the collection of matrices in (12.50). This estimator of variance is occasionally called a robust estimator because it is valid independently of model assumptions; the estimator of the variance of $\hat{\beta}_O$ is referred to as a 'sandwich estimator' because the estimator of the variance of $y$ is 'sandwiched' between other matrices in (12.49) (see Royall, 1986).
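The robust calculation described above can be sketched as follows. This is an illustrative Python/NumPy sketch under stated assumptions, not a reference implementation: the function name `robust_ols_variance`, its argument names and the simulated data (a random intercept per subject) are all hypothetical. Because the estimated dispersion matrix is block diagonal, the middle of the sandwich can be accumulated one subject at a time without ever forming the full matrix.

```python
import numpy as np

def robust_ols_variance(X_blocks, y_blocks):
    """OLS estimate and its sandwich variance, using (12.50) in place of V_T in (12.49).

    X_blocks, y_blocks: lists holding each subject's design matrix X_i and
    response vector y_i (hypothetical argument names).
    """
    X = np.vstack(X_blocks)
    y = np.concatenate(y_blocks)
    bread = np.linalg.inv(X.T @ X)
    beta_hat = bread @ X.T @ y

    # With a block-diagonal estimate of V_T, X^T V_T X is the sum over
    # subjects of X_i^T r_i r_i^T X_i, where r_i = y_i - X_i beta_hat.
    p = X.shape[1]
    meat = np.zeros((p, p))
    for X_i, y_i in zip(X_blocks, y_blocks):
        score = X_i.T @ (y_i - X_i @ beta_hat)
        meat += np.outer(score, score)

    return beta_hat, bread @ meat @ bread

# Simulated data (an assumption for illustration): a random intercept
# per subject induces correlation among that subject's observations.
rng = np.random.default_rng(0)
n_subjects, n_times = 200, 4
times = np.arange(n_times, dtype=float)
X_blocks = [np.column_stack([np.ones(n_times), times]) for _ in range(n_subjects)]
y_blocks = [1.0 + 0.5 * times + rng.normal() + rng.normal(size=n_times)
            for _ in range(n_subjects)]

beta_hat, robust_var = robust_ols_variance(X_blocks, y_blocks)
print("estimate:", beta_hat)
print("robust standard errors:", np.sqrt(np.diag(robust_var)))
```

Each single-subject matrix in (12.50) is a very noisy estimate of its block; the calculation relies on summing over many subjects, so the sketch is most sensible when the number of subjects is reasonably large.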

Although using (12.49) in conjunction with (12.50) yields a valid estimate of error, the estimates may be inefficient in the sense that the variances of elements of $\hat{\beta}_O$ are larger than they would have been using an analysis which used the correct variance matrix. The unbiasedness of (12.48) does not depend on using ordinary least squares and an alternative would be to use

$$\hat{\beta}_G = (X^T V_W^{-1} X)^{-1} X^T V_W^{-1} y, \tag{12.51}$$

with the associated estimator of variance being

$$(X^T V_W^{-1} X)^{-1} X^T V_W^{-1} V_T V_W^{-1} X (X^T V_W^{-1} X)^{-1}.$$
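The estimator (12.51) and its variance can be illustrated in the same way. The sketch below is written in Python with NumPy and rests on assumptions that are not in the text: $V_W$ is taken to be any positive definite matrix chosen by the analyst (this page does not define it), the true dispersion matrix is given a hypothetical AR(1) block structure, and the function names are invented for the example. Two special cases serve as checks: with $V_W = I$ the variance reduces to (12.49), and with $V_W = V_T$ it collapses to $(X^T V_T^{-1} X)^{-1}$.

```python
import numpy as np

def weighted_estimate(X, y, V_W):
    """Estimator (12.51) for a chosen matrix V_W (assumed positive definite)."""
    W = np.linalg.inv(V_W)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

def weighted_variance(X, V_W, V_T):
    """Variance of (12.51) when the true dispersion matrix is V_T."""
    W = np.linalg.inv(V_W)
    bread = np.linalg.inv(X.T @ W @ X)
    return bread @ X.T @ W @ V_T @ W @ X @ bread

def ar1_block_diag(n_subjects, n_times, sigma2, rho):
    """Hypothetical V_T: one AR(1) block per subject."""
    idx = np.arange(n_times)
    lags = np.abs(np.subtract.outer(idx, idx))
    return np.kron(np.eye(n_subjects), sigma2 * rho ** lags)

# Illustrative design (an assumption): intercept plus time, rows ordered by subject.
n_subjects, n_times = 6, 4
time = np.tile(np.arange(n_times, dtype=float), n_subjects)
X = np.column_stack([np.ones_like(time), time])
V_T = ar1_block_diag(n_subjects, n_times, sigma2=1.5, rho=0.7)
identity = np.eye(n_subjects * n_times)

var_ols = weighted_variance(X, identity, V_T)  # V_W = I: same as (12.49)
var_correct = weighted_variance(X, V_T, V_T)   # V_W = V_T: (X^T V_T^{-1} X)^{-1}

print("variances with V_W = I:  ", np.diag(var_ols))
print("variances with V_W = V_T:", np.diag(var_correct))
```

Comparing the two printed diagonals shows the efficiency point made in the text: using the correct matrix never gives larger variances than ignoring the correlation, although for some designs and correlation structures the two can coincide.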

12.6 Longitudinal data
