01.06.2013 Views

Statistical Methods in Medical Research 4ed

Multivariate methods

Allocation is then to the group with the highest score. The method is also equivalent to allocating an individual to the group with the nearest mean, in the sense of generalized distance. Computer programs exist for the whole procedure.
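The equivalence between "highest score" and "nearest mean" can be checked numerically. The sketch below is an illustration only: it assumes equal prior probabilities, a common covariance matrix, and linear scores of the form $L_j = m_j' S^{-1} x - \tfrac{1}{2} m_j' S^{-1} m_j$. The function names, group means and covariance matrix are hypothetical and are not taken from the text.

```python
import numpy as np

def linear_scores(x, means, cov):
    """Scores L_j = m_j' S^-1 x - 0.5 * m_j' S^-1 m_j (equal priors assumed)."""
    coef = np.linalg.solve(cov, means.T).T          # row j holds S^-1 m_j
    return x @ coef.T - 0.5 * np.sum(means * coef, axis=1)

def generalized_distances(x, means, cov):
    """Squared generalized distance from each row of x to each group mean."""
    diff = x[:, None, :] - means[None, :, :]        # shape (n, k, p)
    inv = np.linalg.inv(cov)
    return np.einsum("nkp,pq,nkq->nk", diff, inv, diff)

# Hypothetical group means (k = 3, p = 2) and common covariance matrix.
means = np.array([[0.0, 0.0], [2.0, 1.0], [1.0, 3.0]])
cov = np.array([[1.0, 0.4], [0.4, 2.0]])

rng = np.random.default_rng(1)
x = rng.normal(scale=2.0, size=(200, 2))            # individuals to allocate

scores = linear_scores(x, means, cov)
d2 = generalized_distances(x, means, cov)
by_score = scores.argmax(axis=1)                    # group with the highest score
by_distance = d2.argmin(axis=1)                     # group with the nearest mean
print(np.array_equal(by_score, by_distance))
```

Under these assumptions the squared distance to group $j$ equals $x' S^{-1} x - 2 L_j$, so the two allocations should agree for every individual. With unequal prior probabilities the scores would carry an extra term and the two rules could differ.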

This likelihood ratio approach is more appropriate than that of canonical variates if the main purpose is to form an allocation rule. If the purpose is to gain insight into the way in which groups differ, using as few dimensions as possible, then canonical variates are the more appropriate method.

Although the two methods do not usually give identical results, there is a close relationship between them. Using the first canonical variate gives the same discrimination as the likelihood method if the group means are collinear in $p$-dimensional space. Thus, as noted earlier, the two methods are identical for $k = 2$ because the two group means may be regarded as collinear. If, for example, $k = 3$ and $p > 1$, there will be $k - 1 = 2$ canonical variates, $W_1$ and $W_2$, and two independent likelihood ratio discriminators, say $Z_1 = L_1 - L_2$ and $Z_2 = L_2 - L_3$. (Note that $L_3 - L_1 = -(Z_1 + Z_2)$.) If the observations are plotted (i) with $W_1$ and $W_2$ as axes, and (ii) with $Z_1$ and $Z_2$ as axes, it will be found that the scatter diagrams are essentially similar, differing only in the orientation and scaling of the axes. Thus, the use of both canonical variates is equivalent to the likelihood approach. In general, using $k - 1$ canonical variates gives the same discrimination as the likelihood approach. However, the idea behind using canonical variates is to reduce the dimensionality, and the approach is advantageous only if just a few variates can be used without any material loss in discriminating ability. For further discussion of this point, see Marriott (1974).
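The relationship for $k = 3$ can be illustrated with simulated data. The following sketch rests on stated assumptions: equal priors, a pooled within-groups covariance matrix, and linear scores $L_j$ of the same form as is usual for a common-covariance normal model. The data, the group means and all variable names are hypothetical. It checks that $Z_1$ and $Z_2$ are exact linear functions (plus a constant) of $W_1$ and $W_2$, which is why the two scatter diagrams differ only in the orientation and scaling of the axes.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n_per_group = 4, 50
true_means = np.array([[0.0, 0.0, 0.0, 0.0],
                       [2.0, 1.0, 0.0, 0.5],
                       [0.5, 2.5, 1.0, 0.0]])       # hypothetical, not collinear
groups = np.repeat(np.arange(3), n_per_group)
x = true_means[groups] + rng.normal(size=(3 * n_per_group, p))

# Sample group means, pooled within-groups covariance S, between-groups SSCP B.
means = np.array([x[groups == j].mean(axis=0) for j in range(3)])
resid = x - means[groups]
S = resid.T @ resid / (len(x) - 3)
counts = np.bincount(groups)
centred = means - x.mean(axis=0)
B = (centred * counts[:, None]).T @ centred

# Canonical variates: solve B v = lambda S v via the Cholesky factor of S.
C = np.linalg.cholesky(S)
M = np.linalg.solve(C, np.linalg.solve(C, B).T)     # C^-1 B C^-T
eigvals, U = np.linalg.eigh((M + M.T) / 2)          # eigenvalues in ascending order
V = np.linalg.solve(C.T, U[:, ::-1][:, :2])         # coefficients of W1 and W2
W = x @ V

# Likelihood ratio discriminators Z1 = L1 - L2 and Z2 = L2 - L3.
coef = np.linalg.solve(S, means.T).T
L = x @ coef.T - 0.5 * np.sum(means * coef, axis=1)
Z = np.column_stack([L[:, 0] - L[:, 1], L[:, 1] - L[:, 2]])

# Regress (Z1, Z2) on (1, W1, W2); the residuals should vanish up to rounding.
design = np.column_stack([np.ones(len(x)), W])
beta, *_ = np.linalg.lstsq(design, Z, rcond=None)
max_resid = np.abs(Z - design @ beta).max()
print(max_resid)
```

The residual should be at rounding-error level because, with three groups, the coefficient vectors of $Z_1$ and $Z_2$ lie in the two-dimensional space spanned by the two canonical variates.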

Scoring systems, using canonical variates

Suppose that a variable is measured on a $p$-point scale from 1 to $p$. For example, a patient's response to treatment may be graded in categories ranging from 'much worse' (scored 1) to 'much better' (scored $p$). This sequence of equidistant integers may not be the best scale on which to analyse the data. The best method of scoring will depend on the criterion we wish to optimize. Suppose, for example, that patients were classified into $k$ treatment groups. It might be reasonable to choose a system of scoring which maximized differences between treatments as compared with those within treatments. For this purpose consider a set of $p$ dummy variables, $x_1, x_2, \dots, x_p$, such that, for a patient in category $j$, the value of $x_j$ is 1 and that of the other $x_i$s is 0. Now choose the first canonical variate of the $x$s, say,

$$g = l_1 x_1 + l_2 x_2 + \dots + l_p x_p.$$
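As a rough illustration of such a scoring system, the sketch below derives category scores from a hypothetical $k \times p$ table of counts (treatments by response categories). Because the dummy variables always sum to 1, their dispersion matrices are singular, so the sketch does not solve the canonical variate equations directly. Instead it uses a singular value decomposition of the standardized table, as in correspondence analysis, which maximizes the between-treatments sum of squares relative to the total and therefore also relative to the within-treatments sum of squares. This substitution, the counts and the function names are assumptions made for illustration, not the procedure given in the text.

```python
import numpy as np

# Hypothetical counts: rows are k = 3 treatments, columns are p = 5 categories.
table = np.array([[12, 18, 25, 10,  5],
                  [ 6, 12, 24, 18, 10],
                  [ 3,  8, 20, 22, 17]], dtype=float)

def between_total_ratio(scores, table):
    """Between-treatments SS divided by total SS for given category scores."""
    n = table.sum()
    row_tot = table.sum(axis=1)
    col_tot = table.sum(axis=0)
    grand_mean = col_tot @ scores / n
    group_means = table @ scores / row_tot
    between = np.sum(row_tot * (group_means - grand_mean) ** 2)
    total = np.sum(col_tot * (scores - grand_mean) ** 2)
    return between / total

def canonical_scores(table):
    """Category scores l_1..l_p maximizing the between/total ratio (via SVD)."""
    P = table / table.sum()
    r = P.sum(axis=1)                                # row proportions
    c = P.sum(axis=0)                                # column proportions
    std = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    _, sing, Vt = np.linalg.svd(std, full_matrices=False)
    scores = Vt[0] / np.sqrt(c)
    if scores[-1] < scores[0]:                       # sign is arbitrary; orient upwards
        scores = -scores
    return scores, sing[0] ** 2

scores, best_ratio = canonical_scores(table)
p = table.shape[1]
integer_ratio = between_total_ratio(np.arange(1.0, p + 1), table)

# Scores are defined only up to location and scale; map them onto the range 1..p.
rescaled = 1 + (p - 1) * (scores - scores.min()) / (scores.max() - scores.min())
print(np.round(rescaled, 2), best_ratio, integer_ratio)
```

Comparing `best_ratio` with `integer_ratio` shows how much discrimination between treatments is lost, if any, by using the equidistant integer scores in place of the derived ones.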
