29.04.2013 Views

TESI DOCTORAL - La Salle


Chapter 3. Hierarchical consensus architectures<br />

[Figure 3.24 panels: CSPA, EAC, HGPA, MCLA, ALSAD, KMSAD and SLSAD. Each panel plots φ (NMI), on an axis from 0 to 1, for the cluster ensemble E and the RHCA, DHCA and flat consensus architectures.]<br />

Figure 3.24: φ (NMI) of the consensus solutions yielded by the computationally optimal<br />

RHCA, DHCA and flat consensus architectures on the Zoo data collection for the diversity<br />

scenario corresponding to a cluster ensemble of size l = 1596.<br />

given consensus architecture, the HGPA, MCLA and KMSAD consensus functions have the<br />

highest quality variability, due to the existence of some random underlying process in their<br />

consensus generation procedures (e.g. the random initialization of k-means in the KMSAD<br />

consensus function). In contrast, the qualities of the consensus clusterings output by<br />

those consensus architectures based on CSPA, EAC, ALSAD and SLSAD show very small<br />

(or even null) variations. As regards the inter-consensus architecture quality divergences,<br />

those based on the HGPA and ALSAD consensus functions show the most disparate results,<br />

whereas in other cases (e.g. CSPA, EAC, MCLA or KMSAD) statistically equivalent qualities<br />

are yielded by the three consensus architectures. Lastly, as far as the robustness of<br />

the consensus clustering solutions is concerned, notice that the EAC-, ALSAD-, KMSAD- and<br />

SLSAD-based consensus architectures yield the highest quality clustering results, coming<br />

close to the top-quality component of the cluster ensemble E and being, in most cases,<br />

better than 75% of the clusterings contained in it.<br />
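As an illustration of the quality measure discussed above, the following minimal sketch computes a normalized mutual information score between two labelings and the fraction of ensemble components that a consensus clustering outperforms. It assumes the geometric-mean normalization, I(a;b) / sqrt(H(a) H(b)); the function names `nmi` and `fraction_beaten` are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from math import log, sqrt


def entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(a, b):
    """NMI between two labelings, assuming geometric-mean normalization."""
    if len(a) != len(b):
        raise ValueError("labelings must cover the same objects")
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    # Mutual information from the contingency counts.
    mi = sum(
        (nij / n) * log(n * nij / (count_a[i] * count_b[j]))
        for (i, j), nij in joint.items()
    )
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0.0 or h_b == 0.0:
        # Convention chosen here: a single-cluster labeling scores 0.
        return 0.0
    return mi / sqrt(h_a * h_b)


def fraction_beaten(consensus_score, ensemble_scores):
    """Fraction of ensemble components scoring strictly below the consensus."""
    beaten = sum(1 for s in ensemble_scores if s < consensus_score)
    return beaten / len(ensemble_scores)
```

Under these assumptions, a consensus clustering that is "better than 75% of the clusterings" in E corresponds to `fraction_beaten(...) >= 0.75`, where each score is the NMI of a clustering with respect to the ground-truth labels.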

Comparison across diversity scenarios and data collections<br />

So as to provide the reader with a global comparative view of the consensus architectures in<br />

terms of the quality of the consensus clustering solutions they yield, we have compiled the<br />

φ (NMI) values obtained across all the experiments conducted on the twelve unimodal data<br />

collections in each diversity scenario, representing them in the boxplots depicted in figure<br />

3.25. Recall that, when comparing boxplots, non-overlapping box notches indicate that<br />

the medians of the compared magnitudes differ at the 5% significance level.<br />
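The notch comparison can be sketched as follows. This is an illustrative example that assumes the common notch convention, median ± 1.57 · IQR / sqrt(n), with linearly interpolated quartiles; the plotting tool used for figure 3.25 may compute notches differently, and the function names are hypothetical.

```python
from math import sqrt


def percentile(sorted_vals, q):
    """Linearly interpolated percentile of a sorted list, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def notch_interval(values):
    """Notch limits around the median (assumed 1.57 * IQR / sqrt(n) rule)."""
    s = sorted(values)
    median = percentile(s, 0.5)
    iqr = percentile(s, 0.75) - percentile(s, 0.25)
    half_width = 1.57 * iqr / sqrt(len(s))
    return median - half_width, median + half_width


def medians_differ(a, b):
    """True when the notches of the two samples do not overlap."""
    lo_a, hi_a = notch_interval(a)
    lo_b, hi_b = notch_interval(b)
    return hi_a < lo_b or hi_b < lo_a
```

For instance, `medians_differ(phi_flat, phi_dhca)` would be applied to two lists of φ (NMI) values (hypothetical variable names) gathered for the same consensus function.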

A twofold qualitative analysis can be made in view of these results. The first aspect of<br />

study is an intra-consensus function comparison among consensus architectures. A quick<br />

inspection of any of the rows of figure 3.25 reveals that the optimality of consensus architectures<br />

is a property that is local to the consensus function applied. When the clustering<br />

combination process is based on the CSPA consensus function, the three consensus architectures<br />

yield consensus solutions of very similar quality (as the boxes overlap notably),<br />

although DHCA tends to attain slightly higher φ (NMI) values; a similar pattern is observed<br />

in the boxplots presented in the column corresponding to the EAC consensus function. In<br />

contrast, flat consensus architectures yield higher quality consensus solutions than their hierarchical<br />

counterparts when they are based on the HGPA clustering combiner. The analysis of the<br />
