CRANFIELD UNIVERSITY Eleni Anthippi Chatzimichali ...
4.4 Conclusion
In this chapter, the functionality of the constructed multivariate analysis pipeline was
extended once more to incorporate data integration techniques. Various approaches
for fusing data from the analytical instruments were evaluated in order to
determine whether different instruments provide complementary information which,
when brought together in an integrated analysis, yields a more reliable method
for assessing spoilage than any single instrument. Generalised Procrustes Analysis (GPA) was the
first data fusion technique to be investigated. The algorithm attempts to minimise the
dissimilarities between the heterogeneous datasets by applying geometric transformations and
simultaneous shape superimposition in order to build a single consensus
configuration. The alternative technique of consensus PCA (CPCA) was also
implemented within the pipeline. This algorithm outputs a consensus
scores matrix on the “super level”.
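
The superimposition idea behind GPA can be illustrated with a minimal sketch. This is not the pipeline's implementation. It assumes NumPy is available, that every dataset has already been brought to the same shape (samples by dimensions, for example by taking the same number of PCA scores or by zero-padding), and it handles only translation, a one-off scaling to unit norm, and rotation or reflection. The function names `gpa`, `_normalise` and `_rotate_onto` are hypothetical.

```python
import numpy as np

def _normalise(X):
    """Centre a configuration and scale it to unit Frobenius norm."""
    X = X - X.mean(axis=0)
    return X / np.linalg.norm(X)

def _rotate_onto(X, target):
    """Orthogonal matrix R minimising ||X @ R - target|| (via SVD)."""
    U, _, Vt = np.linalg.svd(X.T @ target)
    return U @ Vt

def gpa(configs, tol=1e-10, max_iter=100):
    """Align equally shaped (n_samples, n_dims) arrays to a consensus."""
    aligned = [_normalise(np.asarray(X, dtype=float)) for X in configs]
    consensus = aligned[0]          # initial reference configuration
    prev_loss = np.inf
    for _ in range(max_iter):
        # Rotate every configuration onto the current consensus.
        aligned = [X @ _rotate_onto(X, consensus) for X in aligned]
        # The new consensus is the mean of the aligned configurations.
        consensus = np.mean(aligned, axis=0)
        # Residual dissimilarity between the configurations and the consensus.
        loss = sum(np.linalg.norm(X - consensus) ** 2 for X in aligned)
        if prev_loss - loss < tol:
            break
        prev_loss = loss
    return aligned, consensus

# Usage: three views of the same samples, differing by rotation, scale and offset.
rng = np.random.default_rng(0)
base = rng.normal(size=(10, 3))
rot, _ = np.linalg.qr(rng.normal(size=(3, 3)))
aligned, consensus = gpa([base, 2.0 * base @ rot + 1.0, 0.5 * base - 3.0])
```

With real instrument data the configurations will not coincide exactly, so the residual loss gives a rough indication of how much the blocks disagree after alignment.
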
Prior to permutation testing, no optimal classification method could be determined,
since the classification results of all the classifier types appeared to be
equally good. However, in addition to verifying the statistical significance of the
obtained results, the outcome of permutation testing clearly established SVMs as
more powerful and robust techniques than PLS-DA, since they consistently produced
higher generalisation accuracies.
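
The logic of a label permutation test can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the thesis: a simple nearest-centroid classifier with a fixed hold-out split stands in for the SVM and PLS-DA models, and the names `holdout_accuracy` and `permutation_test` are hypothetical. The observed accuracy is compared against the accuracies obtained after repeatedly shuffling the class labels, which approximates the distribution expected when there is no real relationship between data and labels.

```python
import random

def holdout_accuracy(X, y):
    """Nearest-centroid accuracy: train on even rows, test on odd rows."""
    train = [(x, t) for i, (x, t) in enumerate(zip(X, y)) if i % 2 == 0]
    test = [(x, t) for i, (x, t) in enumerate(zip(X, y)) if i % 2 == 1]
    centroids = {}
    for label in {t for _, t in train}:
        rows = [x for x, t in train if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        # Assign the class whose centroid is closest in squared Euclidean distance.
        return min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

    return sum(predict(x) == t for x, t in test) / len(test)

def permutation_test(X, y, score_fn, n_permutations=200, seed=0):
    """Return the observed score, the null scores and an empirical p-value."""
    rng = random.Random(seed)
    observed = score_fn(X, y)
    null_scores = []
    for _ in range(n_permutations):
        shuffled = list(y)
        rng.shuffle(shuffled)       # break the link between samples and labels
        null_scores.append(score_fn(X, shuffled))
    # The +1 terms count the observed labelling as one of the permutations.
    exceed = sum(s >= observed for s in null_scores)
    p_value = (1 + exceed) / (1 + n_permutations)
    return observed, null_scores, p_value

# Usage on a toy two-class dataset with well-separated groups.
X = [[0.1 * i, 0.0] for i in range(10)] + [[5.0 + 0.1 * i, 5.0] for i in range(10)]
y = [0] * 10 + [1] * 10
observed, null_scores, p_value = permutation_test(X, y, holdout_accuracy)
```

Two classifiers that look equally good on the original labels can separate under this test if one of them also scores highly on shuffled labels, which would suggest it is fitting noise.
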
The results obtained by GPA and CPCA were found to be very similar to each
other, with the latter taking precedence in the cases where Procrustes performed weakly.
The values obtained by the fused models were compared to the analysis results
of the standalone datasets as presented in Chapter 2. For case study 1, HPLC as a
standalone technique produced the best overall accuracy, equal to 80%; in this instance,
neither data integration technique improved
the overall classification accuracy, since neither exceeded the accuracy of
standalone HPLC. However, these findings may only hold true for this particular case
study, and thus the pipeline will be further applied to new real-world case studies as a
means of establishing its generalisability.