TESI DOCTORAL - La Salle

Chapter 4. Self-refining consensus architectures

Consensus     |                Consensus function
architecture  |  CSPA   EAC    HGPA   MCLA   ALSAD   KMSAD   SLSAD
--------------+---------------------------------------------------
flat          |  67.7   41.7   62.1   25.0   75.0    75.0    66.7
RHCA          |  72.3   46.4   69.8   81.9   82.0    83.3    66.6
DHCA          |  83.3   33.3   69.2   88.2   83.3    83.1    66.7

Table 4.6: Percentage of experiments in which the best (non-refined or self-refined) consensus clustering solution is better than the median cluster ensemble component, averaged across the twelve data collections.

Consensus     |                Consensus function
architecture  |  CSPA   EAC    HGPA   MCLA   ALSAD   KMSAD   SLSAD
--------------+---------------------------------------------------
flat          |  107.4  33.1   82.4   91.1   113.8   109.1   73.3
RHCA          |  96.6   24.0   98.7   109.5  118.4   114.3   70.4
DHCA          |  113.3  25.6   100.2  108.8  118.9   114.7   87.8

Table 4.7: Relative percentage φ (NMI) gain between the best (non-refined or self-refined) consensus clustering solution and the median cluster ensemble component, averaged across the twelve data collections.

relative percentage φ (NMI) increase, a notable 91% relative φ (NMI) gain is obtained on average (see table 4.7 for a detailed view across consensus functions and architectures). Again, the beneficial effect of self-refining becomes evident when this result is compared to the one obtained from the analysis of the non-refined consensus clustering solutions since, in that case, the observed relative φ (NMI) gain was 59% (see table 3.13).
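The relative percentage φ (NMI) gain reported in table 4.7 can be sketched as follows. This is a minimal illustration, not code from the thesis: the helper name and the φ (NMI) scores are hypothetical, and it simply assumes the gain is computed as the increase of the best consensus solution's score over the median ensemble component's score.

```python
import statistics

def relative_nmi_gain(best_consensus_nmi, ensemble_nmis):
    """Relative percentage phi(NMI) gain of the best (non-refined or
    self-refined) consensus solution over the median cluster ensemble
    component. Hypothetical helper, for illustration only."""
    median_nmi = statistics.median(ensemble_nmis)
    return 100.0 * (best_consensus_nmi - median_nmi) / median_nmi

# Illustrative phi(NMI) scores only (not taken from the thesis experiments):
gain = relative_nmi_gain(0.72, [0.30, 0.41, 0.38, 0.52, 0.45])
print(round(gain, 1))  # gain over the median component (0.41) -> 75.6
```

In the experiments, this quantity is averaged over all experiments and data collections to obtain each cell of table 4.7.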

Furthermore, we have also measured the ability of the self-refining procedure to homogenize the quality of the consensus clustering solutions output by the distinct consensus architectures. To evaluate this, we have computed, for each individual experiment, the average variance of the φ (NMI) values of the non-refined consensus solutions yielded by the RHCA, DHCA and flat consensus architectures (the smaller the variance, the more similar the φ (NMI) values). This procedure has been repeated for the top quality (either refined or non-refined) consensus clustering solutions obtained in each experiment. The results of this analysis are presented in table 4.8. Except for the EAC consensus function, we observe a notable reduction in the variance between the φ (NMI) values of the consensus solutions output by the three considered consensus architectures, which stays below 0.01 in most cases. In global average terms, the variance is dramatically reduced by a factor of approximately 20, from 0.105 to 0.0056. For this reason, it can be conjectured that, besides improving the quality of consensus clustering solutions as already reported, the proposed self-refining procedure also helps to make the quality of the self-refined consensus clustering solution more independent of the consensus architecture employed, so that the architecture can be selected on computational criteria alone.
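The variance analysis described above can be sketched as follows. Again this is a hypothetical illustration with made-up scores, assuming each experiment yields one φ (NMI) value per architecture (flat, RHCA, DHCA) and that the per-experiment variances are then averaged:

```python
import statistics

def mean_architecture_variance(experiments):
    """Average, over experiments, of the variance of the phi(NMI) values
    produced by the flat, RHCA and DHCA architectures for that experiment.
    Hypothetical helper; population variance is assumed for illustration."""
    return statistics.mean(statistics.pvariance(e) for e in experiments)

# Illustrative (flat, RHCA, DHCA) phi(NMI) triples, one per experiment:
non_refined = [(0.35, 0.72, 0.80), (0.40, 0.65, 0.77)]
refined     = [(0.78, 0.80, 0.81), (0.74, 0.76, 0.75)]

# Self-refining should shrink the spread across architectures:
print(mean_architecture_variance(non_refined)
      > mean_architecture_variance(refined))  # -> True
```

A smaller average variance after self-refining indicates, as argued above, that the quality of the final solution depends less on which consensus architecture was used.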

In conclusion, it can be asserted that the proposed consensus self-refining procedure is reasonably successful: in general terms, it introduces a quality increase that brings self-refined consensus clustering solutions closer to the best individual components available in the cluster ensemble, which ultimately constitutes the goal of robust clustering systems based on consensus clustering.

It is of paramount importance to note that, in the analysis of all the previous results,