29.04.2013 Views

TESI DOCTORAL - La Salle


3.4. Flat vs. hierarchical consensus

tends to be slightly quicker to execute than DHCA in high diversity scenarios, although the larger mini-ensemble sizes usually employed in the fastest random architecture variants, compared with those employed in DHCA, are penalized in consensus architectures based on the MCLA consensus function. Finally, as already observed in the serial case, RHCA tends to be slightly more efficient than DHCA when clustering combination is conducted by means of consensus functions based on treating hierarchical clustering similarity measures as data (i.e. ALSAD and SLSAD).

To sum up, there is a strong dependence between the computationally optimal type of consensus architecture, the size of the cluster ensemble upon which consensus is built, and the consensus function employed. From a practical standpoint, when facing a specific consensus clustering problem (i.e. a cluster ensemble of a given size l and a particular configuration of computational resources), the user should take into account how these factors interact when deciding which type of consensus architecture to implement. However, this decision should not be made on computational efficiency grounds alone: it should also take into account the quality of the consensus clustering solution obtained, as quickly obtaining a poor consensus data grouping would be of little use in practice. For this reason, the next section evaluates the quality of the consensus label vectors output by the same consensus architectures that have just been analyzed in computational terms.
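As a rough illustration of weighing efficiency against quality, the following minimal sketch picks the fastest architecture among those whose consensus quality is close to the best one observed. It is only one possible way to operationalize the trade-off described above: the names `Candidate` and `pick_architecture`, the `tolerance` parameter and the numbers in `example` are hypothetical placeholders, not values or procedures taken from the experiments.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """One consensus architecture evaluated on a given cluster ensemble."""
    name: str        # e.g. "flat", "RHCA", "DHCA"
    run_time: float  # measured running time (any consistent unit)
    quality: float   # consensus quality score, higher is better


def pick_architecture(candidates, tolerance=0.05):
    """Return the fastest candidate whose quality is within `tolerance`
    of the best quality observed among all candidates (hypothetical rule)."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    best_quality = max(c.quality for c in candidates)
    good_enough = [c for c in candidates
                   if c.quality >= best_quality - tolerance]
    return min(good_enough, key=lambda c: c.run_time)


# Placeholder numbers for illustration only; they are NOT experimental results.
example = [
    Candidate("flat", run_time=120.0, quality=0.62),
    Candidate("RHCA", run_time=35.0, quality=0.60),
    Candidate("DHCA", run_time=20.0, quality=0.41),
]

chosen = pick_architecture(example)
print(chosen.name)
```

With a very large `tolerance` the rule degenerates into choosing purely on running time, while `tolerance=0.0` chooses purely on quality; intermediate values express how much quality the user is willing to trade for speed.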

3.4.2 Consensus quality comparison

In this section, we evaluate the quality of the consensus clustering solutions yielded by the fastest DHCA and RHCA variants and by flat consensus architectures, which constitutes an indicator of their suitability for conducting robust clustering. The experiments conducted to this end follow the design described next.

Experimental design

– What do we want to measure?

i) The suitability of the allegedly fastest DHCA and RHCA variants and of flat consensus for obtaining clustering results robust to the inherent indeterminacies of clustering.

ii) A further goal of this section is to determine whether certain consensus architectures tend to outperform others as regards the quality of the consensus clusterings they obtain.

– How do we measure it?

i) We analyze the quality of the consensus clustering solutions obtained by these consensus architectures, comparing it with that of the individual clusterings contained in the cluster ensemble E upon which consensus is built. The more similar the qualities of the consensus clustering solution and of the top quality cluster ensemble components, the greater the robustness to clustering indeterminacies attained. As mentioned in section 1.2.2, in this work we evaluate clustering solutions by means of an external cluster validity index, i.e. we compare the consensus clustering solution embodied in the labeling vector λc with a predefined
