
3.4 Flat vs. hierarchical consensus

In sections 3.2 and 3.3, two specific implementations of hierarchical consensus architectures were proposed, alongside a methodology for determining a priori which is the fastest (random or deterministic) HCA variant and for deciding whether it is computationally advantageous with respect to classic flat consensus. In this section, we present a direct twofold comparison between flat consensus and the DHCA and RHCA variants deemed the most computationally efficient by the proposed running time estimation methodologies. Firstly, we compare them in terms of computational complexity. Such a comparison could be drawn from the results presented in sections 3.2 and 3.3, but we think that restricting it to the allegedly best performing variants simplifies the process of drawing meaningful conclusions. Secondly, these least time consuming hierarchical consensus architecture variants are compared with flat consensus in terms of the quality of the consensus clustering solutions they yield. By doing so, we intend to present a comprehensive picture of our hierarchical consensus architecture proposals in terms of the two main factors that condition robust clustering by consensus: time complexity and quality.

3.4.1 Running time comparison

This section compares the real execution times of the allegedly fastest DHCA and RHCA variants and flat consensus. The experiments conducted follow the design outlined next.

Experimental design

– What do we want to measure? The time complexity of the allegedly fastest DHCA and RHCA variants and flat consensus.

– How do we measure it? We measure the CPU time required for the execution of the aforementioned consensus architectures (see the timing sketch after this list).

– How are the experiments designed? The comparison entails the running times of ten independent runs of each one of the compared consensus architectures. So as to evaluate their computational efficiency under distinct experimental conditions, the consensus processes involved have been conducted by means of the seven consensus functions for hard cluster ensembles employed in this work (see appendix A.5). Moreover, experiments have been replicated on the four diversity scenarios described in appendix A.4; recall that they differ in the algorithmic diversity factor, as a set of |dfA| ∈ {1, 10, 19, 28} randomly chosen clustering algorithms is employed for creating the cluster ensemble in each diversity scenario.

– How are results presented? The measured execution times are presented by means of boxplot charts, so as to provide the reader with a notion of the degree of dispersion and asymmetry of the running times of each consensus architecture. When comparing boxplots, notice that non-overlapping box notches indicate that the medians of the compared running times differ at the 5% significance level, which allows a quick inference of the statistical significance of the results (a minimal plotting sketch follows this list).

– Which data sets are employed? A detailed description of the results of this comparison on the Zoo data collection is presented in the following paragraphs. Recall