29.04.2013 Views

TESI DOCTORAL - La Salle


Chapter 4. Self-refining consensus architectures

Consensus function
CSPA    EAC     HGPA    MCLA    ALSAD   KMSAD   SLSAD
54.5    28.2    10.3    69.6    81.8    82      65.4

Table 4.12: Percentage of self-refining experiments in which one of the self-refined consensus clustering solutions is better than the selected cluster ensemble component reference λref, averaged across the twelve data collections.

Consensus function
CSPA    EAC     HGPA    MCLA    ALSAD   KMSAD   SLSAD
26.9    9.1     20.8    15.2    15.5    11.4    7.8

Table 4.13: Relative φ (NMI) gain percentage between the top quality self-refined consensus clustering solutions with respect to the maximum φ (ANMI) cluster ensemble component, averaged across the twelve data collections.

creation uses a previously derived consensus clustering solution (it was 83%). This is because the cluster ensemble component selection usually yields a reference clustering λref of higher quality than the consensus clustering solution λc.
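The selection step mentioned above can be illustrated with a minimal sketch. It assumes, following the caption of table 4.13, that λref is the ensemble component with the highest average φ (NMI) with respect to the remaining components (its φ (ANMI)). The geometric-mean normalization of NMI, the function names `nmi` and `select_reference`, and the toy ensemble are assumptions made for illustration, not taken from the thesis.

```python
import math
from collections import Counter


def nmi(a, b):
    """NMI between two labelings, normalized by sqrt(H(a) * H(b)) (assumed form)."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    ha = -sum(c / n * math.log(c / n) for c in ca.values())
    hb = -sum(c / n * math.log(c / n) for c in cb.values())
    if ha == 0.0 or hb == 0.0:
        return 0.0  # convention chosen here for single-cluster labelings
    mi = sum(c / n * math.log(c * n / (ca[x] * cb[y]))
             for (x, y), c in cab.items())
    return mi / math.sqrt(ha * hb)


def select_reference(ensemble):
    """Index of the component with the highest average NMI to the other components."""
    def anmi(i):
        scores = [nmi(ensemble[i], lab)
                  for j, lab in enumerate(ensemble) if j != i]
        return sum(scores) / len(scores)
    return max(range(len(ensemble)), key=anmi)


# Hypothetical toy ensemble of three clusterings over six objects.
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same partition as the first, with labels swapped
    [0, 0, 1, 1, 0, 1],
]
ref_index = select_reference(ensemble)
```

Because NMI is invariant to label permutations, the first two components agree perfectly, so one of them is selected as the reference in this toy case.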

Secondly, in those experiments where self-refined consensus solutions are better than λref, we have measured the relative degree of improvement achieved (quantified as the relative percentage increase in φ (NMI)). The results, presented in table 4.13, show notable quality improvements, averaging a 15.2% relative φ (NMI) gain across all data sets and consensus functions. These quality gains are much smaller than those obtained in the self-refining experiments based on a previously derived consensus clustering solution (see section 4.2), again due to the superior quality of the reference clustering the self-refining procedure is based upon.
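As a small worked sketch of the quantity discussed above, the relative percentage φ (NMI) gain can be computed as shown below. The helper name `relative_gain_pct` is hypothetical. The overall figure is computed here as the unweighted mean of the seven per-function values in table 4.13, which is an assumption about how the average was obtained, although it is consistent with the 15.2% quoted in the text.

```python
def relative_gain_pct(nmi_refined, nmi_reference):
    """Relative percentage NMI increase of a refined solution over the reference."""
    return 100.0 * (nmi_refined - nmi_reference) / nmi_reference


# Per-consensus-function values as reported in table 4.13.
table_4_13 = {
    "CSPA": 26.9, "EAC": 9.1, "HGPA": 20.8, "MCLA": 15.2,
    "ALSAD": 15.5, "KMSAD": 11.4, "SLSAD": 7.8,
}

# Unweighted mean across the seven consensus functions (assumed aggregation).
overall_gain = sum(table_4_13.values()) / len(table_4_13)
print(f"{overall_gain:.1f}")  # prints 15.2
```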

Next, the maximum and median φ (NMI) components of the cluster ensemble E, referred to as BEC (best ensemble component) and MEC (median ensemble component), respectively, are compared to either the top quality self-refined consensus clustering solution or λref, depending on which has the largest φ (NMI) with respect to the ground truth. As in the previous section, we have evaluated i) the percentage of experiments where the maximum φ (NMI) consensus clustering solution attains a higher quality than that of the BEC and MEC, and ii) the relative percentage φ (NMI) variation between them and the top quality consensus clustering solution. Once more, all the results presented correspond to an average across all the experiments conducted on the twelve unimodal data collections.
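The comparison just described can be sketched for a single experiment as follows. The function name `compare_to_ensemble` and all NMI values are hypothetical and serve only to illustrate the two measurements. The candidates are λref and the self-refined consensus solutions, each scored by its φ (NMI) with respect to the ground truth.

```python
from statistics import median


def compare_to_ensemble(component_nmi, candidate_nmi):
    """Compare the best candidate clustering against the BEC and the MEC.

    component_nmi: NMI (vs. ground truth) of each cluster ensemble component.
    candidate_nmi: NMI (vs. ground truth) of the reference clustering and of
                   the self-refined consensus solutions.
    """
    bec = max(component_nmi)     # best ensemble component
    mec = median(component_nmi)  # median component (mean of the two middle
                                 # values when the ensemble size is even)
    best = max(candidate_nmi)    # top quality candidate
    return {
        "beats_bec": best > bec,
        "beats_mec": best > mec,
        "variation_vs_bec_pct": 100.0 * (best - bec) / bec,
        "variation_vs_mec_pct": 100.0 * (best - mec) / mec,
    }


# Hypothetical NMI values for one experiment.
component_nmi = [0.30, 0.40, 0.50, 0.60, 0.70]
candidate_nmi = [0.65, 0.68]
result = compare_to_ensemble(component_nmi, candidate_nmi)
```

Averaging the boolean flags over all experiments would give the percentages of type i), and averaging the variations would give the figures of type ii).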

On the one hand, table 4.14 presents the aforementioned magnitudes with respect to the best cluster ensemble component. On average, the highest quality clustering (either λref or one of the self-refined consensus solutions) is better than the BEC in 14.1% of the conducted experiments, achieving an average relative percentage φ (NMI) gain of 1.6%. Notably, these results are quite similar to those obtained when self-refining is based on a previously derived consensus clustering solution (see section 4.2), where the corresponding figures were 10.6% and 1.8%, respectively.

On the other hand, table 4.15 presents the results of the same experiment, but with respect to the median ensemble component (MEC). In this case, the selection and self-refining procedure yields clusterings better than the MEC in 98% of the cases, attaining an aver-
