01.06.2013 Views

Statistical Methods in Medical Research 4ed


10.8 Transformations

that the logarithm of the MPD does closely follow a normal distribution (MacKenzie, 1983).

              Arithmetic mean,   Standard deviation,   Geometric mean,
              logged values      logged values         original scale (J/cm²)
Fairer skin   0.225              0.414                 1.68
Darker skin   0.268              0.478                 1.85

The antilogs of the arithmetic means of the logged values, i.e. the geometric means, are shown above and can be seen to be noticeably smaller than the arithmetic means in the previous table. The standard deviations on the log scale are similar and the null hypothesis that the population geometric means are equal can be tested using the usual two-sample t test (see §4.3). Applying the t test to the logged values gives t = -0.47, P = 0.64. The difference in the arithmetic means of the logged values is -0.043, and a 95% confidence interval for the difference is (-0.224, 0.139). Taking antilogs of the difference, and of the limits of the confidence interval, gives 0.91 for the ratio of geometric means (confirmed by direct calculation because 1.68/1.85 = 0.91) and the 95% confidence interval for the ratio is (0.60, 1.38).
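The back-transformation described above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the summary figures quoted in the text. It assumes the logs are to base 10, which is an inference from the table (10 raised to 0.225 is about 1.68) rather than something stated in this passage, and the function name `antilog10` is introduced here purely for illustration.

```python
import math

def antilog10(value):
    """Back-transform a base-10 log value to the original scale."""
    return 10 ** value

# Summary statistics on the log scale, as quoted in the text.
# Assumption: logs are base 10 (consistent with 10**0.225 being about 1.68).
mean_log_fairer = 0.225
mean_log_darker = 0.268
ci_log_diff = (-0.224, 0.139)  # 95% CI for the difference of log means

# Geometric means are the antilogs of the means of the logged values.
gm_fairer = antilog10(mean_log_fairer)
gm_darker = antilog10(mean_log_darker)

# A difference on the log scale becomes a ratio on the original scale.
diff_log = mean_log_fairer - mean_log_darker
ratio = antilog10(diff_log)
ratio_ci = tuple(antilog10(limit) for limit in ci_log_diff)

print(f"geometric means: {gm_fairer:.2f}, {gm_darker:.2f}")
print(f"ratio of geometric means: {ratio:.2f}")
print(f"95% CI for ratio: ({ratio_ci[0]:.2f}, {ratio_ci[1]:.2f})")

# The ratio of the two geometric means equals the antilog of the difference.
assert math.isclose(ratio, gm_fairer / gm_darker)
```

Because the antilog is a monotone function, the limits of the interval for the difference map directly onto the limits of the interval for the ratio; no separate standard error on the original scale is needed.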

Of course, a logarithmic transformation can only be applied to a positive number, and associated quantities, such as geometric means, are only defined for positive variables. Even if the sample values are generally positive, the application of a log transformation can be ruled out because of occasional zero values. Even if a value of zero is physiologically impossible, recorded zeros can arise because actual levels are below limits of detection. There are no simple solutions to this difficulty. Ad hoc solutions, such as replacing zero values by the limit of detection (or perhaps half the limit of detection) if the limit is known, may prove satisfactory. Adding a constant k to each x_i before taking logs is an alternative, although it can be difficult to decide on the value for k and the results of the analysis can be disconcertingly sensitive to the choice of k. Treating k as a parameter to be estimated from the data is beset by theoretical problems.
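The two ad hoc approaches mentioned above, substituting a fraction of the limit of detection for recorded zeros and adding a constant k before taking logs, can be sketched as follows. The data values, the detection limit and the helper names are all hypothetical and chosen only for illustration; the point is to show how strongly a back-transformed summary can depend on k.

```python
import math

def substitute_below_lod(values, lod, fraction=0.5):
    """Replace recorded zeros by fraction * lod (an ad hoc substitution)."""
    return [fraction * lod if v == 0 else v for v in values]

def mean_log10(values):
    """Arithmetic mean of base-10 logs; fails if any value is not positive."""
    return sum(math.log10(v) for v in values) / len(values)

def shifted_geometric_mean(values, k):
    """Antilog of the mean of log10(x + k), shifted back by k."""
    return 10 ** mean_log10([v + k for v in values]) - k

# Hypothetical measurements containing one recorded zero,
# with a hypothetical limit of detection of 0.2.
sample = [0.0, 0.4, 1.2, 2.5, 6.0]
lod = 0.2

# A plain log transformation is impossible while a zero is present.
try:
    mean_log10(sample)
except ValueError:
    print("log transformation fails: sample contains a zero")

# Option 1: replace the zero by half the limit of detection.
substituted = substitute_below_lod(sample, lod)
print("geometric mean after substitution:", 10 ** mean_log10(substituted))

# Option 2: add a constant k; the summary moves noticeably with k.
for k in (0.01, 0.1, 1.0):
    print(f"k = {k}: shifted geometric mean = {shifted_geometric_mean(sample, k):.3f}")
```

With these hypothetical values, the summary for k = 1 is more than double that for k = 0.01, which illustrates the sensitivity to the choice of k that the text warns about.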
