Statistical Methods in Medical Research 4ed

Multivariate methods

In the discussion so far it has been assumed that it was known which
variables should be included in the discriminant function (13.4). Usually a set
of possible discriminatory variables is available, and it is required to find the
minimum number of such variables which contribute to the discrimination.
Many computer programs include a stepwise procedure, similar to that in multiple
regression (see p. 357), to achieve this.

The likelihood rule was derived assuming that the xs had a multivariate
normal distribution. Although the method may prove adequate in some cases,
when this assumption breaks down an alternative approach may be preferable.
In particular, if the xs contain categorical variables, then the use of logistic
regression (see §14.2) would be appropriate. The data of Example 13.2 are
analysed using this approach on p. 492.

In most examples of discriminant analysis, there will be little point in testing
the null hypothesis that the two samples are drawn from identical populations.
The important question is how the populations differ. There are, though, some
studies, particularly those in which two treatments are compared by multivariate
data, in which a preliminary significance test is useful in indicating whether there
is much point in further exploration of the data.

Multiple testing

Before discussing multivariate significance tests, we consider the problem in
terms of multiple testing. Suppose each of the p variables is tested by a univariate
method, such as a two-sample t test. Then some variables may differ significantly
between groups whilst others may not. A difficulty in interpretation is that p tests
have been carried out so that, even if the overall null hypothesis that none of the
variables differ between groups is true, the probability of finding at least one
significant difference is higher than the nominal significance level adopted. There
are similarities with the multiple comparisons problem discussed in §8.4.
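To give a feel for the size of this inflation, the following minimal sketch computes the chance of at least one significant result when every null hypothesis is true. It assumes, purely for illustration, that the p tests are independent; the function name `familywise_error` and the chosen values of p are hypothetical and not taken from the text.

```python
def familywise_error(alpha, p):
    """Probability of at least one significant result among p tests.

    Assumes the p tests are independent, each carried out at level
    alpha, and that every null hypothesis is true.
    """
    return 1.0 - (1.0 - alpha) ** p


if __name__ == "__main__":
    # Illustrative numbers of variables tested at the 5% level.
    for p in (1, 5, 10, 20):
        print(p, round(familywise_error(0.05, p), 3))
```

With a single test the probability equals the nominal level; it rises steadily as more variables are tested.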

The Bonferroni procedure is sometimes used to correct for this. If p independent
comparisons are carried out at a significance level of $\alpha'$, then the probability
that one or more are significant is $1 - (1 - \alpha')^p$, and when $\alpha'$ is small this is
approximately $p\alpha'$. Therefore setting $\alpha' = \alpha/p$ ensures that the probability of
finding one or more significant effects does not exceed $\alpha$. One problem with this
method is that it takes no account of the multivariate correlations and it is
conservative in the usual multivariate situation when the comparisons are not
independent. A modification due to Simes (1986a) is to order the p values such
that $P(1) \le P(2) \le \dots \le P(p)$. Then the null hypothesis is rejected at level $\alpha$ if any
$P(j) \le j\alpha/p$. This modification is more powerful than the Bonferroni procedure
whilst still ensuring that the overall significance level does not exceed $\alpha$ for
positively correlated variables (Sarkar & Chang, 1997); that is, the test remains
conservative.
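The two rules can be sketched in a few lines of code. This is only an illustration of the overall (global) test described above, not a full multiple-comparison procedure; the function names and the example p values are hypothetical.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject the overall null hypothesis if any p value is <= alpha / p."""
    p = len(p_values)
    return min(p_values) <= alpha / p


def simes_reject(p_values, alpha=0.05):
    """Reject the overall null hypothesis if any ordered p value P(j)
    satisfies P(j) <= j * alpha / p, with j = 1, ..., p."""
    p = len(p_values)
    ordered = sorted(p_values)  # P(1) <= P(2) <= ... <= P(p)
    return any(pj <= j * alpha / p
               for j, pj in enumerate(ordered, start=1))


if __name__ == "__main__":
    # Hypothetical p values from four univariate tests.
    example = [0.02, 0.024, 0.04, 0.20]
    print("Bonferroni rejects:", bonferroni_reject(example))
    print("Simes rejects:", simes_reject(example))
```

In this made-up example the smallest p value exceeds $\alpha/p = 0.0125$, so the Bonferroni rule does not reject, whereas the second ordered p value falls below $2\alpha/p = 0.025$, so the Simes rule does. Because the Simes threshold for $j = 1$ is the Bonferroni threshold, the Simes rule rejects whenever the Bonferroni rule does.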
