
Appendix F. Experiments on soft consensus clustering

        CSPA    EAC     HGPA    MCLA    VMA     BC      CC      PC      SC

CSPA    ———     0.0002  0.0001  0.0345  0.0001  ×       0.0001  0.0016  0.0017
EAC     0.0001  ———     0.0001  ×       0.0001  ×       0.0001  0.0001  0.0001
HGPA    0.0001  0.0001  ———     0.0002  0.0001  0.0001  0.0001  ×       ×
MCLA    ×       0.0001  0.0001  ———     0.0001  ×       0.0001  0.002   0.002
VMA     0.0001  0.0001  0.0001  0.0001  ———     0.0001  0.0001  0.0001  0.0001
BC      0.0233  0.0001  0.0001  0.0001  0.0025  ———     0.0001  0.0037  0.0038
CC      0.0247  0.0001  0.0001  0.0001  0.0022  ×       ———     0.0001  0.0001
PC      0.0001  0.0001  0.0001  0.0001  ×       0.0092  0.0084  ———     ×
SC      0.0001  0.0001  0.0001  0.0001  ×       0.0064  0.0058  ×       ———

Table F.3: Significance levels p corresponding to the pairwise comparison of soft consensus functions using a paired t-test on the Glass data set. The upper and lower triangular sections of the table correspond to the comparison in terms of CPU time and φ(NMI), respectively. Statistically non-significant differences (p > 0.05) are denoted by the symbol ×.
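As an illustration of how such a significance matrix can be obtained, the following minimal sketch pairs the per-run measurements of every two consensus functions and applies a paired t-test to each pair. This is not the thesis code: the number of runs, the placeholder score values and the pairwise_pvalues helper are hypothetical, and the table's upper and lower triangles would result from running the same procedure twice, once on the CPU-time measurements and once on the φ(NMI) scores.

import numpy as np
from scipy.stats import ttest_rel

methods = ["CSPA", "EAC", "HGPA", "MCLA", "VMA", "BC", "CC", "PC", "SC"]

# Hypothetical per-run measurements for each consensus function
# (placeholder random values; the real experiments use paired runs
# on the same cluster ensembles of the Glass data set).
rng = np.random.default_rng(0)
scores = {m: rng.uniform(0.0, 1.0, size=10) for m in methods}

def pairwise_pvalues(scores, methods):
    # Paired t-test p-value for every pair of consensus functions;
    # the diagonal is left undefined (NaN).
    n = len(methods)
    p = np.full((n, n), np.nan)
    for i, a in enumerate(methods):
        for j, b in enumerate(methods):
            if i != j:
                p[i, j] = ttest_rel(scores[a], scores[b]).pvalue
    return p

p_matrix = pairwise_pvalues(scores, methods)
# Entries with p > 0.05 correspond to the non-significant differences
# marked with the symbol x in table F.3.
print(np.round(p_matrix, 4))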

F.4 Ionosphere data set

In the following paragraphs, the results of the soft consensus clustering experiments conducted on the Ionosphere data collection are described.

First, figure F.4 displays the φ(NMI) vs CPU time mean ± 2-standard deviation regions corresponding to the nine consensus functions compared in this experiment. It can be observed that all clustering combiners yield rather low quality consensus clustering solutions (φ(NMI) < 0.1). The highest φ(NMI) scores are obtained by CSPA, BC and CC, whose performance is statistically significantly better than that of the remaining six consensus functions (see table F.4 for the statistical significance analysis of this experiment).
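For reference, the sketch below shows how the quantities plotted in figure F.4 could be gathered, assuming that φ(NMI) is the standard normalized mutual information between a consensus partition and the ground-truth labels, and that the mean ± 2-standard deviation regions are computed over repeated runs. The consensus_fn callable and the evaluate_runs helper are hypothetical names introduced only for this example.

import time
import numpy as np
from sklearn.metrics import normalized_mutual_info_score

def evaluate_runs(consensus_fn, cluster_ensembles, true_labels):
    # Apply a consensus function to several cluster ensembles, recording
    # phi(NMI) against the ground truth and the CPU time of each run.
    nmis, times = [], []
    for ensemble in cluster_ensembles:
        t0 = time.process_time()
        labels = consensus_fn(ensemble)  # hypothetical consensus function call
        times.append(time.process_time() - t0)
        nmis.append(normalized_mutual_info_score(true_labels, labels))
    nmis, times = np.asarray(nmis), np.asarray(times)
    # Centre and half-width of the mean +/- 2*std regions plotted in figure F.4.
    return (nmis.mean(), 2 * nmis.std()), (times.mean(), 2 * times.std())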

As regards time complexity, VMA is the most computationally efficient option, closely followed by HGPA. The proposed PC and SC consensus functions are comparable to CSPA and MCLA in computational terms, while the positional voting based BC and CC consensus functions are, together with EAC, the most time consuming alternatives. The differences between these three groups are statistically significant, as can be inferred from the significance levels presented in table F.4.

F.5 WDBC data set

This section describes the results of the soft consensus clustering experiments conducted on the WDBC data set.

The φ(NMI) vs CPU time mean ± 2-standard deviation regions of the consensus functions are depicted in figure F.5. Once again, VMA is the most computationally efficient consensus function (which, as mentioned earlier, is due to the simultaneity of its cluster disambiguation and voting processes), closely followed by HGPA. However, the confidence voting based consensus functions (PC and SC) are very close to VMA in CPU time terms, being slightly faster than CSPA and MCLA. As already noticed in the previous experiments, positional voting makes the BC and CC consensus functions more computationally costly (in this case, CC is slightly faster than BC, because the low number of clusters
