
Appendix C. Experiments on hierarchical consensus architectures

behaviour, as flat consensus is the fastest option regardless of the diversity scenario. That is, as already observed in sections C.2.5 and C.3.5, the time complexity behaviour of consensus architectures is local to the consensus function employed for combining the clusterings.

As regards the computational complexity of the parallel implementations of the hierarchical consensus architectures (see figure C.33), it can be observed that they become much faster than flat consensus as soon as the size of the cluster ensemble is increased. As in the previous data collection, the running times of parallel RHCA and DHCA are very similar.
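Purely as a hedged illustration of this idea, the following Python sketch runs the intermediate consensus stages of a two-stage hierarchy in parallel; the consensus_function placeholder, the branching factor and the use of multiprocessing are assumptions made for the example, and do not reproduce the actual RHCA or DHCA implementations.

```python
# Illustrative sketch (not the thesis code): running the intermediate
# consensus stages of a hierarchical architecture in parallel.
from multiprocessing import Pool

import numpy as np


def consensus_function(mini_ensemble):
    # Placeholder consensus function: a real system would plug in EAC,
    # MCLA, HGPA, ALSAD, etc. Here we simply return the first labelling.
    return mini_ensemble[0]


def parallel_hierarchical_consensus(ensemble, branch=10, processes=4):
    # Stage 1: split the ensemble into mini-ensembles of `branch` clusterings
    # and derive an intermediate consensus for each one in parallel.
    mini_ensembles = [ensemble[i:i + branch]
                      for i in range(0, len(ensemble), branch)]
    with Pool(processes) as pool:
        intermediate = pool.map(consensus_function, mini_ensembles)
    # Stage 2: a final consensus over the intermediate solutions.
    return consensus_function(intermediate)


if __name__ == "__main__":
    # Toy cluster ensemble: 29 random labellings of 214 objects.
    rng = np.random.default_rng(0)
    ensemble = [rng.integers(0, 6, size=214) for _ in range(29)]
    print(parallel_hierarchical_consensus(ensemble))
```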

Consensus quality comparison

As far as the quality of the consensus clustering solutions obtained by the distinct consensus architectures is concerned, figure C.34 depicts the corresponding φ(NMI) boxplots. Again, performances are highly local to the consensus function employed: in this case, the consensus architectures based on the EAC, HGPA and SLSAD consensus functions give rise to the lowest quality consensus clusterings. If the three consensus architectures are compared, it can be observed that RHCA and flat consensus tend to perform quite similarly, while worse clustering solutions are generally obtained from DHCA. Notice that the highest robustness to clustering indeterminacies (i.e. consensus clustering solutions of a quality comparable to that of the cluster ensemble components of highest φ(NMI)) is obtained from the RHCA and flat consensus architectures based on MCLA, ALSAD and KMSAD.
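To make the robustness criterion concrete, the following hedged sketch compares the φ(NMI) of a consensus clustering with the highest φ(NMI) attained by any cluster ensemble component; using scikit-learn's normalized_mutual_info_score as a stand-in for φ(NMI), together with toy random labellings, is an assumption of the example.

```python
# Illustrative sketch: is the consensus clustering of comparable quality to
# the best individual component of the cluster ensemble?
import numpy as np
from sklearn.metrics import normalized_mutual_info_score

rng = np.random.default_rng(0)
ground_truth = rng.integers(0, 6, size=214)            # toy labels (6 classes)
ensemble = [rng.integers(0, 6, size=214) for _ in range(29)]
consensus = rng.integers(0, 6, size=214)               # stand-in consensus solution

phi_consensus = normalized_mutual_info_score(ground_truth, consensus)
phi_best_component = max(normalized_mutual_info_score(ground_truth, labels)
                         for labels in ensemble)

# Robustness to clustering indeterminacies: the consensus quality should be
# comparable to that of the best ensemble component.
print(f"phi(NMI) consensus: {phi_consensus:.3f}  "
      f"best component: {phi_best_component:.3f}")
```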

C.4.3 Glass data set

In this section, we present the running times and quality evaluation (by means of φ(NMI) values) of the consensus clustering processes implemented by means of the serial and parallel RHCA and DHCA implementations and flat consensus on the Glass data collection. The cluster ensemble sizes corresponding to the four diversity scenarios in which our experiments are conducted are l = 29, 290, 551 and 812.

Running time comparison

Figure C.35 presents the boxplot charts that represent the running times of the three implemented consensus architectures, considering the entirely serial implementation of the hierarchical ones. As in the previous data collections, flat consensus is the fastest option in the lowest diversity scenario, whereas hierarchical consensus architectures become more computationally efficient as soon as the size of the cluster ensemble increases (for all but the EAC consensus function), which again highlights the interest of structuring consensus processes in a hierarchical manner as a means of i) reducing their time complexity when they are to be conducted on large cluster ensembles, and ii) obtaining a consensus clustering solution when the execution of flat consensus becomes unfeasible (e.g. when the MCLA consensus function is employed in the highest diversity scenario).
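As a rough, hedged account of why the hierarchy pays off, the sketch below compares flat and two-stage hierarchical consensus under the simplifying assumption that the cost of a consensus function grows quadratically with the number of clusterings it combines; both the quadratic cost model and the branching factor of 29 are assumptions of the example, not measurements from these experiments.

```python
# Illustrative cost model (an assumption, not measured data): suppose that
# combining l clusterings with a given consensus function costs ~ l**2.
def flat_cost(l):
    return l ** 2

def hierarchical_cost(l, branch):
    # Stage 1: l // branch mini-consensus problems of size `branch`;
    # stage 2: a final consensus over the l // branch intermediate solutions.
    stages = l // branch
    return stages * branch ** 2 + stages ** 2

for l in (29, 290, 551, 812):          # ensemble sizes used for Glass
    print(l, flat_cost(l), hierarchical_cost(l, branch=29))
```

Under this toy model the hierarchy offers no advantage for the smallest ensemble but becomes markedly cheaper as l grows, which is consistent with the behaviour observed in figure C.35.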

The computational complexity of the consensus architectures presents a very similar behaviour when the parallel implementation of the hierarchical versions is studied (see figure C.36). In this case, though, the differences between the running times of flat and hierarchical consensus architectures are even larger.

