29.04.2013 Views

TESI DOCTORAL - La Salle



[Figure F.10 plot, panel "BBC": φ(NMI) on the vertical axis (0 to 1) against CPU time (sec.) on the horizontal axis (ticks at 10^0 and 10^2). Legend: CSPA, EAC, HGPA, MCLA, VMA, BC, CC, PC, SC.]

Figure F.10: φ(NMI) vs CPU time mean ± 2-standard deviation regions of the soft consensus functions on the BBC data collection.
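The "mean ± 2-standard deviation regions" named in the caption above can be illustrated with a minimal sketch. It assumes each consensus function has been run several times, giving one (CPU time, φ(NMI)) pair per run, and that the sample standard deviation is used; the function names and the run values are hypothetical and are not taken from the thesis.

```python
from statistics import mean, stdev


def two_sigma_interval(values):
    """Return (mean - 2*sd, mean + 2*sd) using the sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return m - 2 * s, m + 2 * s


def two_sigma_region(runs):
    """Summarize (cpu_seconds, nmi) runs as one interval per axis.

    The two intervals bound a rectangular region in the
    CPU time vs NMI plane for a single consensus function.
    """
    cpu_times = [cpu for cpu, _ in runs]
    nmi_scores = [nmi for _, nmi in runs]
    return {
        "cpu_time": two_sigma_interval(cpu_times),
        "nmi": two_sigma_interval(nmi_scores),
    }


# Hypothetical runs of one consensus function (illustrative values only).
example_runs = [(1.2, 0.71), (1.5, 0.69), (1.1, 0.74), (1.4, 0.70)]
print(two_sigma_region(example_runs))
```

A narrow interval on either axis indicates that the consensus function behaves consistently across runs on that measure.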

        CSPA    EAC     HGPA    MCLA    VMA     BC      CC      PC      SC
CSPA    ———     0.0001  0.0001  0.0001  0.0001  0.0002  0.0012  0.0001  0.0001
EAC     0.0001  ———     0.0001  0.0001  0.0001  0.0001  0.0001  0.0001  0.0001
HGPA    0.0001  0.0001  ———     ×       0.0001  0.0001  0.0001  ×       ×
MCLA    0.0001  0.0001  0.0001  ———     0.0001  0.0001  0.0001  ×       ×
VMA     0.0456  0.0001  0.0001  0.0001  ———     0.0001  0.0001  0.0001  0.0001
BC      0.004   0.0001  0.0001  0.0001  ×       ———     0.0001  0.0002  0.0002
CC      0.004   0.0001  0.0001  0.0001  ×       ×       ———     0.0001  0.0001
PC      ×       0.0001  0.0001  0.0001  ×       ×       ×       ———     ×
SC      ×       0.0001  0.0001  0.0001  ×       0.0279  0.0279  ×       ———

Table F.10: Significance levels p corresponding to the pairwise comparison of soft consensus functions using a paired t-test on the BBC data set. The upper and lower triangular sections of the table correspond to the comparison in terms of CPU time and φ(NMI), respectively. Statistically non-significant differences (p > 0.05) are denoted by the symbol ×.
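A pairwise significance table of this kind can be sketched as follows. This is a minimal illustration using only the Python standard library, not the procedure used in the thesis: the two-sided p-value is approximated by Simpson integration of the Student t density, and all function names and sample values are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for a paired t-test on samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


def t_pdf(x, dof):
    """Density of the Student t distribution with dof degrees of freedom."""
    log_c = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
             - 0.5 * math.log(dof * math.pi))
    return math.exp(log_c - (dof + 1) / 2 * math.log1p(x * x / dof))


def two_sided_p(t, dof, steps=2000):
    """Approximate the two-sided p-value by Simpson's rule on [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, dof) + t_pdf(upper, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    central_mass = 2 * total * h / 3  # P(-|t| < T < |t|), by symmetry
    return max(0.0, 1.0 - central_mass)


def significance_table(scores, alpha=0.05):
    """Map each pair of methods to its p-value, or to '×' when p > alpha."""
    names = list(scores)
    table = {}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            t, dof = paired_t_statistic(scores[first], scores[second])
            p = two_sided_p(t, dof)
            table[(first, second)] = "×" if p > alpha else round(p, 4)
    return table


# Hypothetical per-run scores for three methods (illustrative values only).
demo_scores = {
    "A": [0.70, 0.72, 0.69, 0.71, 0.73],
    "B": [0.40, 0.45, 0.38, 0.41, 0.44],
    "C": [0.71, 0.70, 0.70, 0.72, 0.71],
}
print(significance_table(demo_scores))
```

Running the sketch once on CPU times and once on φ(NMI) scores would give the entries for the upper and lower triangles, respectively.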

F.10 BBC data set<br />

This section presents the results of the soft consensus clustering experiments conducted on the BBC data set. A qualitative description of these results is provided by the φ(NMI) vs CPU time diagram of figure F.10, and the results of the statistical significance study of the differences between consensus functions are presented in table F.10.
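For readers unfamiliar with the quality measure, the following minimal sketch computes a normalized mutual information score between two hard label vectors. It assumes the geometric-mean normalization, I(A;B) / sqrt(H(A) H(B)), which is a common choice; whether φ(NMI) in this work uses exactly that normalization is an assumption here, and the function name is hypothetical.

```python
import math
from collections import Counter


def normalized_mutual_information(labels_a, labels_b):
    """NMI between two labelings, normalized by sqrt(H(A) * H(B))."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("label vectors must be non-empty and of equal length")

    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    # Mutual information from the contingency counts.
    mutual_info = sum(
        (n_ab / n) * math.log(n * n_ab / (count_a[a] * count_b[b]))
        for (a, b), n_ab in joint.items()
    )
    # Entropies of the two labelings.
    entropy_a = -sum((c / n) * math.log(c / n) for c in count_a.values())
    entropy_b = -sum((c / n) * math.log(c / n) for c in count_b.values())

    if entropy_a == 0 or entropy_b == 0:
        return 0.0  # convention assumed here for single-cluster labelings
    return mutual_info / math.sqrt(entropy_a * entropy_b)


# Identical partitions up to a relabeling of the clusters.
print(normalized_mutual_information([0, 0, 1, 1], [1, 1, 0, 0]))
```

The score does not depend on how clusters are named, which is why a consensus clustering can be compared directly against a reference labeling.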

It can be observed that VMA is again the fastest consensus function. The confidence voting consensus functions (PC and SC) are, in statistical terms, as fast as MCLA and HGPA. The positional voting consensus functions (BC and CC) are slower than those; BC is faster than CSPA, whereas CC is slower than it.

As regards the quality of the consensus clustering solutions obtained, CSPA, PC and SC yield the highest φ(NMI) scores and are equivalent from a statistical significance viewpoint. The BordaConsensus and CondorcetConsensus clustering combiners also perform well, together with the VMA consensus function, being notably better than MCLA (and far better than EAC and HGPA, which yield extremely poor consensus clustering
