01.06.2013 Views

Statistical Methods in Medical Research 4ed


Analysing non-normal data

1 It should lie between −1 and +1, taking the value +1 when the individuals are ranked in exactly the same order by x as by y, and −1 when the order is reversed.

2 For large samples in which the distribution of x is independent of y (and conversely), the value should be zero.

A satisfactory rank correlation coefficient can be obtained from Kendall's S statistic, described in §10.3, with a generalization of the definition used there. The total number of pairs of individuals is $\tfrac{1}{2}n(n-1)$. Let P be the number of pairs which are ranked in the same order by x and by y, and Q the number of pairs in which the rankings are in the opposite order. Then

$$S = P - Q.$$

(The previous definition is a particular case of this in which one variable represents a dichotomy into the two groups, with group X being ranked before group Y, and the other variable represents the measurement under test, both $x_i$ and $y_j$ as previously defined being values of this second variable.)

The rank correlation coefficient, $\tau$, is now defined by

$$\tau = \frac{S}{\tfrac{1}{2}n(n-1)}. \tag{10.11}$$
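As an illustration of the definitions of P, Q, S and the coefficient in (10.11), the following is a minimal Python sketch, not a reference implementation. The function names `kendall_s` and `kendall_tau` are hypothetical, chosen for this example only. A pair is counted in P when the x and y differences have the same sign and in Q when the signs differ; pairs tied on either variable count in neither, which is an assumption of this sketch.

```python
from itertools import combinations


def kendall_s(x, y):
    """Return (P, Q, S) for paired observations x and y.

    P counts pairs ranked in the same order by x and by y,
    Q counts pairs ranked in opposite orders, and S = P - Q.
    Pairs tied on x or on y contribute to neither count
    (an assumption of this sketch).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    p = q = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        product = (xi - xj) * (yi - yj)
        if product > 0:
            p += 1
        elif product < 0:
            q += 1
    return p, q, p - q


def kendall_tau(x, y):
    """Rank correlation coefficient: S divided by n(n - 1)/2, as in (10.11)."""
    n = len(x)
    if n < 2:
        raise ValueError("at least two observations are needed")
    _, _, s = kendall_s(x, y)
    return s / (0.5 * n * (n - 1))


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    y = [2, 1, 4, 3, 5]
    print(kendall_s(x, y))    # (8, 2, 6)
    print(kendall_tau(x, y))  # 0.6
```

In this small example two of the ten pairs are ranked in opposite orders, so S = 8 − 2 = 6 and the coefficient is 6/10 = 0.6. The double loop over pairs is quadratic in n, which is adequate for illustrating the definition but not for very large samples.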

It is fairly easy to see that for complete agreement of rankings $\tau = 1$, and for complete reversal $\tau = -1$. A significance test of the null hypothesis that the x ranking is independent of the y ranking can conveniently be done on S. The null expectation of S (as of $\tau$) is zero, and, in the absence of ties,

$$\operatorname{var}(S) = n(n-1)(2n+5)/18. \tag{10.12}$$

If there are ties, (10.12) is modified to give

$$
\begin{aligned}
\operatorname{var}(S) ={}& \frac{1}{18}\Bigl[n(n-1)(2n+5) - \sum_t t(t-1)(2t+5) - \sum_u u(u-1)(2u+5)\Bigr] \\
&+ \frac{1}{9n(n-1)(n-2)}\Bigl[\sum_t t(t-1)(t-2)\Bigr]\Bigl[\sum_u u(u-1)(u-2)\Bigr] \\
&+ \frac{1}{2n(n-1)}\Bigl[\sum_t t(t-1)\Bigr]\Bigl[\sum_u u(u-1)\Bigr],
\end{aligned}
$$

where the summations are over groups of ties, t being the number of tied individuals in a group of tied x values and u the number in a group of tied y values.
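The tie-corrected variance of S can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than a definitive implementation: the function names `tie_group_sizes`, `var_s` and `z_statistic` are hypothetical, and referring S divided by its null standard deviation to the standard normal distribution is an assumption of this example (the text says only that the test can be done on S), with no continuity correction applied.

```python
from collections import Counter
from math import sqrt


def tie_group_sizes(values):
    """Sizes of the groups of tied values (groups of size 1 are not ties)."""
    return [count for count in Counter(values).values() if count > 1]


def var_s(x, y):
    """Null variance of Kendall's S, allowing for ties in x and in y.

    With no ties every sum is zero and the result reduces to
    n(n - 1)(2n + 5)/18, as in (10.12).
    """
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")
    if n < 3:
        raise ValueError("at least three observations are needed")
    t = tie_group_sizes(x)  # tied groups among the x values
    u = tie_group_sizes(y)  # tied groups among the y values

    term1 = (
        n * (n - 1) * (2 * n + 5)
        - sum(k * (k - 1) * (2 * k + 5) for k in t)
        - sum(k * (k - 1) * (2 * k + 5) for k in u)
    ) / 18
    term2 = (
        sum(k * (k - 1) * (k - 2) for k in t)
        * sum(k * (k - 1) * (k - 2) for k in u)
    ) / (9 * n * (n - 1) * (n - 2))
    term3 = (
        sum(k * (k - 1) for k in t) * sum(k * (k - 1) for k in u)
    ) / (2 * n * (n - 1))
    return term1 + term2 + term3


def z_statistic(s, variance):
    """Standardized S (assumed normal approximation, no continuity correction)."""
    return s / sqrt(variance)


if __name__ == "__main__":
    x = [1, 1, 2, 3]  # one tied pair in x
    y = [1, 2, 2, 3]  # one tied pair in y
    print(var_s(x, y))
```

For small samples the normal approximation assumed here may be poor, and exact tables for S are the safer route.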

For further discussion of rank correlation, including tables for the significance of S, see Kendall and Gibbons (1990).
