
Chapter 4. Self-refining consensus architectures

Consensus      Consensus function
architecture   CSPA    EAC     HGPA    MCLA    ALSAD   KMSAD   SLSAD
flat           90.4    63.6    70      77.25   80      76.9    90
RHCA           85.7    83.2    97      94.2    73.4    76.5    82.4
DHCA           91.1    78.5    98      88.5    89.8    88.1    67.8

Table 4.2: Percentages of self-refining experiments in which one of the self-refined consensus clustering solutions is better than its non-refined counterpart, averaged across the twelve data collections.

Consensus      Consensus function
architecture   CSPA    EAC     HGPA        MCLA    ALSAD   KMSAD   SLSAD
flat           16.5    273.3   53.3        14.9    9.8     16.1    433.4
RHCA           10.9    157.1   294779.8    200.8   14.1    15.1    205.6
DHCA           24.5    66.4    152450.9    79.9    38.4    30.9    224.9

Table 4.3: Relative φ (NMI) gain percentage of the top quality self-refined consensus clustering solution with respect to its non-refined counterpart, averaged across the twelve data collections.

As shown in table 4.2, the proposed self-refining procedure performs successfully, giving rise to at least one self-refined consensus clustering solution that improves the consensus clustering available prior to refining in an average of 83% of the experiments conducted.
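For reference, each entry of table 4.2 boils down to counting, per consensus architecture and consensus function, the experiments in which at least one self-refined candidate attains a higher φ (NMI) than the non-refined consensus. The following minimal Python sketch illustrates this computation; the function name, array names and data layout are illustrative assumptions of this sketch, not the actual experimental code:

    import numpy as np

    def pct_improved(phi_nonrefined, phi_refined):
        """Percentage of experiments in which at least one self-refined
        consensus clustering beats its non-refined counterpart.

        phi_nonrefined: (n_experiments,) phi(NMI) of the consensus
                        solution prior to refining (illustrative layout).
        phi_refined:    (n_experiments, n_candidates) phi(NMI) of the
                        self-refined candidate solutions.
        """
        improved = phi_refined.max(axis=1) > phi_nonrefined
        return 100.0 * improved.mean()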

Moreover, we have also computed the relative φ (NMI) percentage gain between the non-refined and the top quality self-refined consensus clustering solution, considering only those experiments where self-refining yields a better clustering solution (i.e. 83% of the total). The results presented in table 4.3, which again correspond to an average across all the data sets for each consensus architecture and consensus function, reveal that the proposed self-refining procedure performs in an overwhelmingly successful manner, giving rise to an average relative percentage φ (NMI) gain of 21386% across all the experiments conducted. This exceptionally large figure is due to the fact that, although seldom, extremely poor quality consensus clustering solutions are available prior to self-refining in some cases. In particular, this situation is found when the HGPA consensus function is employed for refining the consensus clustering solutions yielded by hierarchical consensus architectures on the WDBC and BBC data collections (see, for instance, figure D.10 in appendix D). Despite being exceptional, this situation introduces a large bias in the averaged φ (NMI) gains. However, if this kind of artifact is ignored, relative φ (NMI) gains between 10% and 430% are consistently obtained in all cases, which indicates the suitability of the proposed self-refining procedure for improving consensus clustering solutions.
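In computational terms, the per-experiment gain is simply 100 · (φ_refined − φ_non-refined) / φ_non-refined, averaged over the experiments where refining wins. A hedged sketch under the same illustrative data layout as above:

    def mean_relative_gain(phi_nonrefined, phi_refined):
        """Average relative phi(NMI) gain (%) of the top self-refined
        solution over the non-refined one, restricted to the experiments
        where self-refining actually improves the consensus clustering."""
        best = phi_refined.max(axis=1)
        wins = best > phi_nonrefined          # the ~83% of improving experiments
        gains = 100.0 * (best[wins] - phi_nonrefined[wins]) / phi_nonrefined[wins]
        return gains.mean()

A non-refined φ (NMI) close to zero in the denominator is precisely what produces the extreme HGPA entries of table 4.3 discussed above.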

Besides comparing the top quality self-refined consensus clustering solution with its non-refined counterpart, we have also contrasted its quality with that of the highest and median φ (NMI) components of the cluster ensemble E, referred to as the BEC (best ensemble component) and the MEC (median ensemble component), respectively. Using the quality of these two components as a reference, we have evaluated i) the percentage of experiments where the maximum φ (NMI) consensus clustering solution (either refined or non-refined) attains a higher quality than that of the BEC and the MEC, and ii) the relative percentage φ (NMI) variation between them and the top quality consensus clustering solution.
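A sketch of this reference comparison, under the same illustrative layout as the earlier snippets (phi_ensemble holding the φ (NMI) of each component of the cluster ensemble E; the function name is an assumption of this sketch):

    def bec_mec_comparison(phi_ensemble, phi_consensus_best):
        """Compare the top quality consensus solution (refined or not)
        against the best (BEC) and median (MEC) ensemble components.

        phi_ensemble:       (n_experiments, ensemble_size) phi(NMI) of
                            the cluster ensemble components.
        phi_consensus_best: (n_experiments,) phi(NMI) of the top quality
                            consensus clustering solution.
        """
        bec = phi_ensemble.max(axis=1)         # best ensemble component
        mec = np.median(phi_ensemble, axis=1)  # median ensemble component
        pct_beat_bec = 100.0 * (phi_consensus_best > bec).mean()
        pct_beat_mec = 100.0 * (phi_consensus_best > mec).mean()
        # relative percentage phi(NMI) variation w.r.t. each reference
        var_bec = 100.0 * (phi_consensus_best - bec) / bec
        var_mec = 100.0 * (phi_consensus_best - mec) / mec
        return pct_beat_bec, pct_beat_mec, var_bec.mean(), var_mec.mean()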

