TESI DOCTORAL - La Salle


Appendix F. Experiments on soft consensus clustering

[Figure F.5 plot, WDBC: vertical axis φ(NMI), 0 to 1; horizontal axis CPU time (sec.), 0 to 6; legend: CSPA, EAC, HGPA, MCLA, VMA, BC, CC, PC, SC]

Figure F.5: φ(NMI) vs CPU time mean ± 2-standard deviation regions of the soft consensus functions on the WDBC data collection.
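As a rough illustration of how the regions in such a diagram can be obtained, the following sketch computes the mean ± 2 standard deviation interval along each axis from repeated runs of one consensus function. The measurements are hypothetical (not taken from the thesis), and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev

def two_sigma_region(samples):
    """Return (low, centre, high) for the mean ± 2 standard deviation interval."""
    centre = mean(samples)
    spread = 2 * stdev(samples)  # assumption: sample std (n - 1 denominator)
    return centre - spread, centre, centre + spread

# Hypothetical per-run measurements for one consensus function (not thesis data).
cpu_times = [0.9, 1.1, 1.0, 1.2, 0.8]
nmi_scores = [0.50, 0.54, 0.52, 0.56, 0.48]

# Each consensus function is then drawn as a region spanning these two intervals.
cpu_region = two_sigma_region(cpu_times)
nmi_region = two_sigma_region(nmi_scores)
print("CPU time region:", cpu_region)
print("NMI region:", nmi_region)
```

Plotting the two intervals as the horizontal and vertical extents of a region gives one region per consensus function, which is what allows a qualitative comparison of speed against quality.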

        CSPA    EAC     HGPA    MCLA    VMA     BC      CC      PC      SC
CSPA    ---     0.0001  0.0001  ×       0.0001  0.0001  0.0002  ×       ×
EAC     0.0001  ---     0.0001  0.0001  0.0001  0.0001  0.0001  0.0001  0.0001
HGPA    0.0001  0.0001  ---     0.0026  0.0001  0.0001  0.0001  0.0267  0.0249
MCLA    0.0001  0.0001  0.0001  ---     0.0001  0.0002  0.0004  ×       ×
VMA     0.0001  0.0001  0.0001  0.0057  ---     0.0001  0.0001  0.0001  0.0001
BC      0.0001  0.0001  0.0001  ×       ×       ---     ×       0.0001  0.0001
CC      0.0001  0.0001  0.0001  ×       ×       ×       ---     0.0001  0.0001
PC      0.0001  0.0001  0.0001  0.0025  ×       ×       ×       ---     ×
SC      0.0001  0.0001  0.0001  0.0103  ×       ×       ×       ×       ---

Table F.5: Significance levels p corresponding to the pairwise comparison of soft consensus functions using a paired t-test on the WDBC data set. The upper and lower triangular sections of the table correspond to the comparison in terms of CPU time and φ(NMI), respectively. Statistically non-significant differences (p > 0.05) are denoted by the symbol ×.
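The entries of such a table can be reproduced in spirit with a paired t-test on the per-run measurements of two consensus functions. The sketch below is a minimal standard-library version, not the procedure actually used in the thesis: it computes the t statistic of the paired differences and obtains the two-sided p-value by numerically integrating the Student's t density. The function names and the sample data are hypothetical.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coeff = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coeff = math.exp(log_coeff) / math.sqrt(df * math.pi)
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value: 1 minus the density mass on [-|t|, |t|] (Simpson's rule)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # integral of the density over [0, |t|]
    return max(0.0, 1 - 2 * area)

def paired_t_test(a, b):
    """Return (t statistic, two-sided p-value) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t_stat = mean(diffs) / (sd / math.sqrt(n))
    return t_stat, two_sided_p(t_stat, n - 1)

# Hypothetical CPU times (sec.) of two consensus functions over the same runs.
times_a = [2.0, 4.0, 6.0, 8.0, 10.0]
times_b = [1.0, 2.0, 3.0, 4.0, 5.0]
t_stat, p = paired_t_test(times_a, times_b)
cell = "×" if p > 0.05 else f"{p:.4f}"  # same convention as the table
print(t_stat, p, cell)
```

Running this for every pair of consensus functions, once on CPU times and once on φ(NMI) scores, would fill the upper and lower triangles of a table laid out like the one above.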

F.6 Balance data set<br />

In this section, the performance of the soft consensus functions is compared through a set of consensus clustering experiments conducted on the Balance data set.

Figure F.6 depicts the diagram that qualitatively compares the nine consensus functions in terms of the CPU time required for their execution and the φ(NMI) of the consensus clustering solutions they yield.
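For readers unfamiliar with the quality measure, the sketch below shows one common way to compute a normalized mutual information score between a consensus clustering and a reference labeling. It assumes hard (crisp) label vectors and normalization by the geometric mean of the two entropies; this page does not restate the exact definition of φ(NMI) used in the thesis, so both choices are assumptions.

```python
import math
from collections import Counter

def nmi(labels_a, labels_b):
    """Normalized mutual information between two hard labelings of the same objects."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    # Mutual information from the contingency counts (natural logarithm).
    mutual_info = sum(
        (n_ab / n) * math.log(n * n_ab / (count_a[a] * count_b[b]))
        for (a, b), n_ab in joint.items()
    )
    entropy_a = -sum((c / n) * math.log(c / n) for c in count_a.values())
    entropy_b = -sum((c / n) * math.log(c / n) for c in count_b.values())

    if entropy_a == 0 or entropy_b == 0:
        return 0.0  # convention chosen here for single-cluster labelings
    # Assumption: geometric-mean normalization of the mutual information.
    return mutual_info / math.sqrt(entropy_a * entropy_b)

# Hypothetical labelings: identical up to a renaming of the clusters.
consensus = [0, 0, 1, 1, 2, 2]
reference = [2, 2, 0, 0, 1, 1]
score = nmi(consensus, reference)
print(score)
```

The score does not depend on how clusters are named, so a consensus clustering that matches the reference partition exactly attains the maximum value regardless of label permutation.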

As regards the former aspect, VMA, PC and SC are the most efficient consensus functions, and the differences between them, though small, are statistically significant according to the results of the paired t-tests presented in Table F.6. Moreover, the BC and CC consensus functions achieve mid-range running times, being slower than MCLA and HGPA, but faster than CSPA and EAC.

Last, as far as the quality of the consensus clustering solutions is concerned, the consensus functions perform very similarly. In fact, the differences between the top
