29.04.2013 Views

TESI DOCTORAL - La Salle


Appendix C. Experiments on hierarchical consensus architectures

C.3 Estimation of the computationally optimal DHCA

The methodology for selecting the most computationally efficient implementation variant of deterministic hierarchical consensus architectures (DHCA), presented in section 3.3, has been applied to the Iris, Wine, Glass, Ionosphere, WDBC, Balance and MFeat unimodal data sets (see appendix A.2.1 for a description of these collections). The methodology consists of estimating the running time of several DHCA variants, which differ in the order in which diversity factors are associated with the stages of the hierarchical consensus architecture, and selecting the variant that yields the minimum estimated running time, which is the only one actually executed.
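The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation from section 3.3: it assumes that each stage of a DHCA variant runs consensus processes on mini-ensembles whose size equals the cardinality of the diversity factor assigned to that stage, that the number of processes in a stage equals the product of the cardinalities assigned to later stages, and that the cost of one consensus process is given by a user-supplied function. The names `estimate_serial_running_time`, `select_fastest_variant` and `consensus_time` are hypothetical.

```python
from itertools import permutations
from math import prod


def estimate_serial_running_time(order, consensus_time):
    """Estimate the serial running time of one DHCA variant.

    order: tuple of diversity factor cardinalities, one per stage
           (assumed stage model, see the lead-in).
    consensus_time: hypothetical callable mapping a mini-ensemble size
           to the time of a single consensus clustering process.
    """
    total = 0.0
    for stage, size in enumerate(order):
        # Assumed number of consensus processes in this stage: the product
        # of the cardinalities of the factors assigned to later stages
        # (the product over an empty tuple is 1, i.e. the final stage).
        n_processes = prod(order[stage + 1:])
        total += n_processes * consensus_time(size)
    return total


def select_fastest_variant(cardinalities, consensus_time):
    """Return the stage ordering with the minimum estimated serial time."""
    # Each distinct ordering of the diversity factors is one DHCA variant.
    variants = sorted(set(permutations(cardinalities)))
    return min(
        variants,
        key=lambda order: estimate_serial_running_time(order, consensus_time),
    )


if __name__ == "__main__":
    # Three hypothetical diversity factors with cardinalities 2, 3 and 4,
    # and a purely illustrative quadratic cost per consensus process.
    best = select_fastest_variant((2, 3, 4), lambda size: float(size ** 2))
    print(best, estimate_serial_running_time(best, lambda size: float(size ** 2)))
```

In practice the cost function would be replaced by running times measured on the data set at hand. Under this toy model, the ordering that wins depends on how the cost of a single consensus process grows with the mini-ensemble size, which is why estimating the running times can give a different answer from a fixed ordering rule.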

In these experiments, the fully serial and parallel implementations of the DHCA variants have been considered across the four experimental diversity scenarios employed in this work (see appendix A.4). The objective of this experiment is twofold. Firstly, we seek to verify whether the proposed strategy succeeds in predicting the most computationally efficient DHCA variant. Secondly, we intend to analyze the conditions under which deterministic hierarchical consensus architectures are computationally advantageous compared to flat consensus clustering. We have followed the experimental design outlined next.

– What do we want to measure?

i) The time complexity of deterministic hierarchical consensus architectures.

ii) The ability of the proposed methodology to predict the computationally optimal DHCA variant, in both the fully serial and parallel implementations.

iii) The predictive power of the proposed methodology, which is based on running time estimation, versus the computational optimality criterion based on designing the DHCA according to a decreasing diversity factor cardinality order, in both the fully serial and parallel implementations.

– How do we measure it?

i) The time complexity of the implemented serial and parallel DHCA variants is measured in terms of the CPU time required for their execution: serial running time (SRT_DHCA) and parallel running time (PRT_DHCA).

ii) The estimated running times of the same DHCA variants, namely the serial estimated running time (SERT_DHCA) and the parallel estimated running time (PERT_DHCA), are computed by means of the proposed running time estimation methodology, which is based on the measured running time of c = 1 consensus clustering process. A prediction of the computationally optimal DHCA variant is considered successful if both the real and the estimated running times are minimized by the same DHCA variant, and the percentage of experiments in which the prediction is successful is given as a measure of its performance. In order to measure the impact of incorrect predictions, we also measure the execution time differences (in both absolute and relative terms) between the truly and the allegedly fastest DHCA variants whenever the prediction fails. This evaluation process is replicated for a range of values of c ∈ [1, 20], so as to measure the influence of this factor on the prediction accuracy of the proposed methodology.

iii) Both approaches for predicting the computationally optimal DHCA variant are compared in terms of the percentage of experiments in which prediction is successful,
