01.06.2013 Views

Statistical Methods in Medical Research 4ed


Analysing non-normal data

              High protein
Low protein     83   97  104  107  113  119  123  124  129  134  146  161
70              13   27   34   37   43   49   53   54   59   64   76   91
85              -2   12   19   22   28   34   38   39   44   49   61   76
94             -11    3   10   13   19   25   29   30   35   40   52   67
101            -18   -4    3    6   12   18   22   23   28   33   45   60
107            -24  -10   -3    0    6   12   16   17   22   27   39   54
118            -35  -21  -14  -11   -5    1    5    6   11   16   28   43
132            -49  -35  -28  -25  -19  -13   -9   -8   -3    2   14   29

For values outside the range of Table A7, an approximation to the number of differences to be excluded from each end of the ordered set is given by the integer part of

\[
\tfrac{1}{2} n_1 n_2 - z \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}
\]
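As a rough illustration of the approximation, the sketch below applies it to the high- and low-protein values tabulated above, even though samples of this size may well fall inside the range of Table A7. It assumes that z is the two-sided standard normal deviate for the chosen confidence level (about 1.96 for 95%); the function names are hypothetical and chosen only for this example.

```python
import math
from statistics import NormalDist

# Values taken from the row and column headings of the table of differences.
high = [83, 97, 104, 107, 113, 119, 123, 124, 129, 134, 146, 161]
low = [70, 85, 94, 101, 107, 118, 132]


def excluded_count(n1, n2, confidence=0.95):
    """Integer part of n1*n2/2 - z*sqrt(n1*n2*(n1 + n2 + 1)/12)."""
    # Assumption: z is the two-sided standard normal deviate.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    value = n1 * n2 / 2 - z * math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return max(0, math.floor(value))


def displacement_limits(y, x, confidence=0.95):
    """Drop k differences from each end of the ordered y - x differences."""
    diffs = sorted(yj - xi for xi in x for yj in y)
    k = excluded_count(len(x), len(y), confidence)
    # diffs[k] is the (k+1)th smallest, diffs[-(k+1)] the (k+1)th largest.
    return k, diffs[k], diffs[-(k + 1)]


k, lower, upper = displacement_limits(high, low)
print(k, lower, upper)
```

For these data the sketch excludes 18 differences from each end, leaving limits of -3 and 40 for the displacement.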

The method of estimation described above could be applied to the data of Example 10.3. The 95% confidence limits are calculated as 66 and 2, in the units of percentage change used in Table 10.2. However, the method is inappropriate here, since the hypothesis of a constant displacement in the distributions is quite unrealistic in view of the bunching of observations at the lower bound of -100%. Other parameters need to be used to describe the difference between the groups. For instance, one might report the difference in the proportions of observations at the lower bound, or less than -90%.

An alternative approach, which is available for any application of the Wilcoxon/Mann–Whitney test, is to note that the statistic $U_{XY}/(n_1 n_2)$ is an estimate of the probability that a randomly chosen value of x is less than a randomly chosen value of y. However, the variance of this statistic is difficult to evaluate since it depends on the precise way in which the two distributions differ; it is important to realize that the usual binomial variance for a proportion with $n_1 n_2$ observations is wholly inappropriate here, since the $n_1 n_2$ differences are not independent.
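To make the probability interpretation concrete, the following sketch counts the pairs with x less than y and divides by the number of pairs. It uses the high- and low-protein values from the table of differences purely as illustrative data, treating the low-protein values as x and the high-protein values as y. That labelling, the half-weight given to tied pairs, and the function name are assumptions of this example rather than details stated in the text.

```python
# Illustrative data: values from the table of differences earlier on the page.
low = [70, 85, 94, 101, 107, 118, 132]                       # treated as x
high = [83, 97, 104, 107, 113, 119, 123, 124, 129, 134, 146, 161]  # treated as y


def prob_x_less_than_y(x, y):
    """Estimate P(x < y) as U / (n1 * n2), counting ties as one half."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi < yj:
                u += 1.0
            elif xi == yj:
                u += 0.5  # assumed convention for tied pairs
    return u / (len(x) * len(y))


p = prob_x_less_than_y(low, high)
print(round(p, 3))
```

The sketch deliberately reports no standard error: as the text warns, a binomial variance based on the number of pairs would be inappropriate because the pairs share observations.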

Normal scores

An alternative approach to the two-sample distribution-free problem is provided by the Fisher–Yates normal scores. Instead of using ranks, the observations are transformed to a different set of scores, which depend purely on the ranks in the combined sample of size n. The score for the observation of rank number r is, in fact, numerically equal to the mean value of the rth smallest observation in a sample of n from a standardized normal distribution, N(0, 1). The scores are tabulated for various sample sizes by Fisher and Yates (1963, Table XX).
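Where the published table is not to hand, the scores can be approximated numerically. The sketch below evaluates the expected value of the rth smallest of n standard normal observations by integrating the order-statistic density with Simpson's rule. The integration range, the step count, and the function names are assumptions made for this illustration and are not taken from Fisher and Yates.

```python
import math
from statistics import NormalDist


def normal_score(r, n, lo=-8.0, hi=8.0, steps=4000):
    """Approximate E[rth smallest of n draws from N(0, 1)] by Simpson's rule."""
    nd = NormalDist()
    coef = math.comb(n, r) * r  # equals n! / ((r - 1)! (n - r)!)

    def integrand(z):
        p = nd.cdf(z)
        return z * nd.pdf(z) * p ** (r - 1) * (1 - p) ** (n - r)

    h = (hi - lo) / steps  # steps must be even for Simpson's rule
    total = integrand(lo) + integrand(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lo + i * h)
    return coef * total * h / 3


def normal_scores(n):
    """Scores for ranks 1..n in a combined sample of size n."""
    return [normal_score(r, n) for r in range(1, n + 1)]


print([round(s, 3) for s in normal_scores(3)])
```

The scores are symmetric about zero, so the score for rank r is the negative of the score for rank n + 1 - r.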
