25.12.2013 Views

CRANFIELD UNIVERSITY Eleni Anthippi Chatzimichali ...


4.3.3 Permutation Tests

Although thorough model validation and evaluation methods were applied to ensure that the performance metrics are representative of real-world application, “accuracy estimates are usually meaningless without a confidence interval” (Kohavi, 1995; Brereton, 2006; Harrington, 2006).

Permutation tests were therefore applied to indicate the statistical significance of the results. The background of permutation testing was described in Sections 1.7 and 2.2.4. Randomising the data with respect to the sensory scores (classes) destroys any prior association between the data and the classes, while preserving their distributional properties (Wu et al., 2002; Westerhuis et al., 2008). Repeating the permutation a large number of times yields a reference distribution for the null hypothesis. The 95% confidence interval (C.I.), taken here as two standard deviations either side of the mean, is calculated from the distribution of permuted classification results. If the observed non-permuted value exceeds the upper 95% confidence bound, the initial result is considered significant. Metrics such as the p-value are also frequently reported in permutation testing; the p-value is equal to the proportion of permuted values that are at least as good as the observed statistic (Hubert and Schultz, 1976).
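The procedure described above can be sketched in a few lines. This is only an illustrative outline, not the thesis implementation: the function names `permutation_test` and `centroid_accuracy`, the nearest-centroid scoring rule, and the synthetic data are all assumptions introduced for the example. The p-value counts the observed value as one of the draws, and the bounds are the mean of the permuted scores plus or minus two standard deviations.

```python
import numpy as np


def permutation_test(score_fn, X, y, n_permutations=99, seed=0):
    """Return the observed score, the permuted scores, a p-value and
    the mean +/- 2 SD bounds of the permuted scores."""
    rng = np.random.default_rng(seed)
    observed = score_fn(X, y)
    # Shuffle the labels only: the association between X and y is destroyed,
    # while the class proportions and the data distribution are unchanged.
    null = np.array(
        [score_fn(X, rng.permutation(y)) for _ in range(n_permutations)]
    )
    # The observed value is treated as one more draw under the null hypothesis.
    p_value = (1 + np.sum(null >= observed)) / (n_permutations + 1)
    spread = 2 * null.std(ddof=1)
    return observed, null, p_value, (null.mean() - spread, null.mean() + spread)


def centroid_accuracy(X, y):
    """Hypothetical score: resubstitution accuracy of a nearest-centroid rule."""
    classes = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return float(np.mean(classes[distances.argmin(axis=1)] == y))


# Synthetic two-class data with a clear class separation (assumed, not from the thesis).
data_rng = np.random.default_rng(1)
y = np.repeat([0, 1], 30)
X = data_rng.normal(size=(60, 5)) + 1.5 * y[:, None]

observed, null, p_value, (lower, upper) = permutation_test(centroid_accuracy, X, y)
print(f"observed={observed:.3f}, upper bound={upper:.3f}, p={p_value:.2f}")
```

Any score function with the signature `score_fn(X, y)` can be substituted, for instance a cross-validated accuracy, provided that a larger value means a better result.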

In the context of this work, each permutation constitutes a single classification ensemble, which consists of 100 individual classifiers; each of these classifiers includes 100 bootstrapping iterations for hyperparameter optimisation. The permutation tests were executed 100 times for each dataset under study, which results in a total of one million iterations per dataset (100 × 100 × 100). Under the null hypothesis, the original non-permuted value is considered another random case. Thus, only 99 actual permutations are required in addition to the observed value, leading to 100 permutations in total; for this number of iterations, the lowest possible p-value is 1/100 = 0.01.
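The counts in this paragraph can be checked with a short calculation. The variable names are illustrative, and the minimum p-value assumes that the observed value is counted among the 100 runs, as described above.

```python
# Counts taken from the text; the variable names are hypothetical.
runs_per_dataset = 100           # 99 permuted runs plus the observed one
classifiers_per_ensemble = 100   # one ensemble per run
bootstraps_per_classifier = 100  # hyperparameter optimisation

total_iterations = (
    runs_per_dataset * classifiers_per_ensemble * bootstraps_per_classifier
)
# Smallest attainable p-value: only the observed value itself is
# "at least as good" as the observed statistic.
min_p_value = 1 / runs_per_dataset

print(total_iterations)  # 1000000
print(min_p_value)       # 0.01
```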

Finally, all the permuted samples were drawn in a separate step prior to analysis, to ensure that the outcome of the randomisation is not biased in any way.
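One way to draw the permutations up front is to generate every permuted label vector from a single seeded random generator before any model is fitted. The sketch below assumes NumPy; the function name `draw_permutations`, the seed, and the example labels are hypothetical and do not come from the thesis.

```python
import numpy as np


def draw_permutations(labels, n_permutations=99, seed=2013):
    """Return an array with one permuted copy of `labels` per row."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    # Each row is an independent shuffle of the same labels, so the class
    # frequencies are identical in every permuted set.
    return np.stack([rng.permutation(labels) for _ in range(n_permutations)])


labels = np.array([0, 0, 0, 1, 1, 2, 2, 2])  # illustrative class labels
permuted = draw_permutations(labels)
print(permuted.shape)
```

Because the permuted sets are fixed before the analysis begins, the same sets can be reused across runs, and the randomisation cannot be influenced by intermediate results.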
