TESI DOCTORAL - La Salle

4.3. Selection-based self-refining

                          Consensus function
                          CSPA   EAC   HGPA   MCLA   ALSAD   KMSAD   SLSAD
% of experiments           9.1   9.1      0   16.7    36.4    27.6       0
relative % φ (NMI) gain    2.1   0.2      –    2.6     1.8     1.3       –

Table 4.14: Percentage of experiments in which either the top quality self-refined consensus clustering solution or λref outperforms the best cluster ensemble component, and relative φ (NMI) gain percentage with respect to it, averaged across the twelve data collections.

                          Consensus function
                          CSPA    EAC    HGPA   MCLA   ALSAD   KMSAD   SLSAD
% of experiments           100   95.4      95    100     100     100    95.4
relative % φ (NMI) gain  118.7  100.7   118.3  114.9   116.1   112.5   107.4

Table 4.15: Percentage of experiments in which either the top quality self-refined consensus clustering solution or λref outperforms the median cluster ensemble component, and relative φ (NMI) gain percentage with respect to it, averaged across the twelve data collections.

age relative φ (NMI) gain of 112.7%. These figures indicate that the selection-based consensus self-refining yields better results (when compared to the MEC) than its consensus-based counterpart, where the two aforementioned percentages are reduced to 67.7% and 91%, respectively.
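The relative φ (NMI) gain figures reported in these tables can be illustrated with a short sketch. In the following Python snippet the labelings are hypothetical (not taken from the thesis experiments), and `nmi` implements the standard square-root-normalized mutual information:

```python
import math
from collections import Counter

def nmi(a, b):
    """phi (NMI): mutual information normalized by sqrt(H(a) * H(b))."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    # Mutual information between the two labelings, in nats.
    mi = sum(c / n * math.log(c * n / (pa[x] * pb[y]))
             for (x, y), c in pab.items())
    # Entropies of each labeling.
    ha = -sum(c / n * math.log(c / n) for c in pa.values())
    hb = -sum(c / n * math.log(c / n) for c in pb.values())
    if ha == 0.0 or hb == 0.0:
        return 1.0 if ha == hb else 0.0
    return mi / math.sqrt(ha * hb)

# Hypothetical labelings over six objects (for illustration only).
truth   = [0, 0, 1, 1, 2, 2]   # ground truth used by phi (NMI)
best    = [0, 0, 1, 2, 2, 2]   # best cluster ensemble component
refined = [0, 0, 1, 1, 2, 2]   # top quality self-refined consensus clustering

# Relative phi (NMI) gain (in %) of the refined solution w.r.t. the best component.
gain = 100 * (nmi(refined, truth) - nmi(best, truth)) / nmi(best, truth)
```

Here a positive `gain` corresponds to the self-refined solution bettering the reference component, as in the percentages reported above.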

To summarize this analysis of the selection-based consensus self-refining proposal, we can conclude, firstly, that it constitutes a fairly good approach as far as obtaining a high quality partition of the data is concerned. When compared to the consensus-based self-refining procedure put forward in section 4.1, it can be observed that, while the relative quality gains introduced by the self-refining process itself are smaller in selection-based consensus self-refining, the top quality clustering results obtained are superior to those yielded by consensus-based self-refining. We believe that both phenomena are due to the differences in the quality of the clustering solution that constitutes the starting point of the self-refining process: in the case of consensus-based self-refining, this reference is a previously derived consensus clustering λc, which is typically a poorer data partition than the maximum φ (ANMI) cluster ensemble component λref (see the figures in appendices D.1 and D.2 for a quick visual comparison). This fact makes selection-based self-refining an even more attractive alternative, all the more so since no previous consensus process execution is required, with the obvious computational savings this implies.
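The selection of λref as the ensemble component with maximum φ (ANMI) can be sketched as follows. The ensemble shown is hypothetical, and `nmi` is assumed to be the square-root-normalized mutual information:

```python
import math
from collections import Counter

def nmi(a, b):
    """phi (NMI): mutual information normalized by sqrt(H(a) * H(b))."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum(c / n * math.log(c * n / (pa[x] * pb[y]))
             for (x, y), c in pab.items())
    ha = -sum(c / n * math.log(c / n) for c in pa.values())
    hb = -sum(c / n * math.log(c / n) for c in pb.values())
    if ha == 0.0 or hb == 0.0:
        return 1.0 if ha == hb else 0.0
    return mi / math.sqrt(ha * hb)

def anmi(candidate, ensemble):
    """Average NMI of a candidate labeling against every ensemble component."""
    return sum(nmi(candidate, p) for p in ensemble) / len(ensemble)

# Hypothetical cluster ensemble over six objects (not the thesis data).
ensemble = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 2],
    [0, 1, 1, 2, 2, 2],
]

# lambda_ref: the component with maximum phi (ANMI) w.r.t. the whole ensemble.
lambda_ref = max(ensemble, key=lambda p: anmi(p, ensemble))
```

Note that no consensus function is executed in this selection step, which is the source of the computational savings mentioned above.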

4.3.2 Evaluation of the supraconsensus process<br />

As regards the performance of the supraconsensus function proposed by Strehl and Ghosh (2002), we have conducted a twofold evaluation. On one hand, we have analyzed the percentage of experiments in which the highest quality clustering solution and the one selected via supraconsensus coincide. On the other hand, we have measured the relative percentage φ (NMI) loss derived from a suboptimal consensus solution selection, using the top quality clustering solution as a reference (i.e. the one that should be selected by an ideal supracon-

