25.12.2013 Views

CRANFIELD UNIVERSITY Eleni Anthippi Chatzimichali ...


4.4 Conclusion

In this chapter, the multivariate analysis pipeline was extended once more, this time to incorporate data integration techniques. Several approaches to fusing data from the analytical instruments were evaluated in order to determine whether different instruments provide complementary information which, when brought together in an integrated analysis, yields a more reliable method for spoilage detection than any single instrument. Generalised Procrustes Analysis (GPA) was the first data fusion technique to be investigated. The algorithm attempts to minimise the dissimilarities between the heterogeneous datasets by applying geometric transformations and superimposing the resulting shapes simultaneously, so as to build a single consensus configuration. The alternative technique of consensus PCA (CPCA) was also implemented within the pipeline. This algorithm outputs a consensus scores matrix on the “super level”.
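As an illustration of the GPA idea described above, the following is a minimal NumPy sketch and not the pipeline's actual implementation. It assumes that every block describes the same samples in the same row order, that blocks with fewer variables can be zero-padded to a common width, and that scaling is handled by normalising each block once to unit norm rather than by iteratively updated scale factors. The function names (`generalised_procrustes`, `_normalise`, `_rotate_onto`) are hypothetical.

```python
import numpy as np


def _normalise(X):
    """Centre a configuration column-wise and scale it to unit Frobenius norm."""
    Xc = X - X.mean(axis=0)
    return Xc / np.linalg.norm(Xc)


def _rotate_onto(X, target):
    """Orthogonal matrix R minimising ||X @ R - target||_F (SVD solution)."""
    U, _, Vt = np.linalg.svd(X.T @ target)
    return U @ Vt


def generalised_procrustes(blocks, n_iter=100, tol=1e-10):
    """Align several (samples x variables) blocks to a common consensus.

    Assumption: all blocks share the same samples in the same row order.
    Returns the consensus, the aligned blocks and the final residual loss.
    """
    width = max(b.shape[1] for b in blocks)
    configs = []
    for b in blocks:
        b = np.asarray(b, dtype=float)
        # Assumption: zero-pad narrower blocks so all share one dimensionality.
        padded = np.pad(b, ((0, 0), (0, width - b.shape[1])))
        configs.append(_normalise(padded))

    consensus = configs[0].copy()  # initial reference configuration
    prev_loss = np.inf
    for _ in range(n_iter):
        # Rotate/reflect every block onto the current consensus ...
        configs = [c @ _rotate_onto(c, consensus) for c in configs]
        # ... then update the consensus as the mean of the aligned blocks.
        consensus = np.mean(configs, axis=0)
        loss = sum(np.sum((c - consensus) ** 2) for c in configs)
        if prev_loss - loss < tol:
            break
        prev_loss = loss
    return consensus, configs, loss
```

In this sketch the returned consensus plays the role of the single fused configuration that would be passed on to a classifier; the residual loss gives a rough indication of how well the blocks agree after alignment.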

Prior to permutation testing, no optimal classification method could be determined, since the results of all the different classifiers appeared to be equally good. However, in addition to verifying the statistical significance of the obtained results, permutation testing clearly established SVMs as more powerful and robust than PLS-DA, since they consistently produced higher generalisation accuracies.
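The logic of a label permutation test can be sketched as follows. This is an illustrative example under stated assumptions, not the procedure used in the pipeline: a simple nearest-centroid classifier stands in for the SVM and PLS-DA models, a single fixed train/test split is used, and the names `nearest_centroid_accuracy` and `permutation_test` are hypothetical.

```python
import numpy as np


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Accuracy of a nearest-centroid classifier (stand-in for SVM / PLS-DA)."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    # Distance from every test sample to every class centroid.
    dists = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    pred = classes[np.argmin(dists, axis=1)]
    return float(np.mean(pred == y_test))


def permutation_test(X, y, n_permutations=200, test_fraction=0.3, seed=0):
    """Compare the observed accuracy with accuracies under shuffled labels."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)

    # Assumption: one fixed hold-out split, reused for every permutation.
    order = rng.permutation(len(y))
    n_test = max(1, int(round(test_fraction * len(y))))
    test_idx, train_idx = order[:n_test], order[n_test:]

    def score(labels):
        return nearest_centroid_accuracy(
            X[train_idx], labels[train_idx], X[test_idx], labels[test_idx]
        )

    observed = score(y)
    # Null distribution: shuffling labels breaks any real link between X and y.
    null = np.array([score(rng.permutation(y)) for _ in range(n_permutations)])
    # Proportion of permuted runs at least as good as the observed one
    # (with the usual +1 correction so the p-value is never exactly zero).
    p_value = (1 + np.sum(null >= observed)) / (1 + n_permutations)
    return observed, null, p_value
```

A small p-value indicates that the observed accuracy is unlikely to arise by chance. Comparing how far each classifier's observed accuracy sits above its own null distribution is one way in which models that look equally good beforehand can be told apart.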

The results obtained by GPA and CPCA were found to be very similar, with the latter taking precedence in the cases where Procrustes performed weakly. The accuracy values obtained by the fused models were compared to the analysis results of the standalone datasets presented in Chapter 2. For case study 1, HPLC as a standalone technique produced the best overall accuracy, equal to 80%; in this instance, neither data integration technique improved the overall classification accuracy, since neither exceeded the accuracy of standalone HPLC. However, these findings may hold true only for this particular case study, and thus the pipeline will be applied to further real-world case studies as a means of establishing its generalisability.

