29.04.2013 Views

TESI DOCTORAL - La Salle


3.4. Flat vs. hierarchical consensus<br />

of the consensus function employed, and that such differences are, in all cases, statistically<br />

significant. As far as the hierarchical consensus architectures are concerned, notice that the<br />

fastest DHCA variant (DRA) is more computationally efficient than its RHCA counterpart<br />

(which has s = 2 stages and mini-ensembles of size b = 28), except when consensus processes<br />

are conducted by means of the EAC and SLSAD consensus functions; in this case, statistically<br />

equivalent running times are attained by both HCAs. If these results are contrasted<br />

with the predicted computationally optimal consensus architectures presented in tables 3.4<br />

and 3.10, a single prediction error is detected (flat consensus turns out to be faster than the<br />

DRA DHCA variant when consensus is conducted by MCLA), which reinforces the notion<br />

that the proposed computationally optimal consensus architecture prediction methodology<br />

performs well.<br />

Figure 3.15(b) presents the running times of the fully parallel optimal hierarchical consensus<br />

architectures and flat consensus. As in the serial case, it can be noticed that flat<br />

consensus tends to be more efficient than RHCA and DHCA. The only exception occurs<br />

when consensus is conducted by means of the MCLA consensus function, because it is<br />

the only combiner whose computational complexity increases<br />

quadratically with the size of the cluster ensemble. Last but not least, note that,<br />

as opposed to what was observed in the serial implementation, the fastest RHCA is less<br />

time consuming than the most efficient DHCA variant, and the running time differences<br />

between them are statistically significant. The reason for this lies in the fact that this<br />

specific RHCA variant has s = 2 stages and consensus is conducted on mini-ensembles of<br />

size b = 7, whereas the DHCA variant consists of three consensus stages, in one of which<br />

consensus is built on larger mini-ensembles of size |dfD| = 14, which is responsible for the<br />

higher computational cost of parallel DHCA in this case.<br />
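
The stage-count argument above can be made concrete with a toy cost model. The sketch below is an illustration under stated assumptions, not the measurement setup of the thesis: it assumes that one consensus process on b clusterings costs a fixed overhead plus b raised to an exponent (1 for a linearly scaling combiner, 2 for a combiner such as MCLA that scales quadratically), that a serial run pays for every consensus process, and that a fully parallel run pays only for the slowest process of each stage. The ensemble size (56), the overhead (100) and the mini-ensemble layouts are hypothetical values chosen for readability.

```python
from typing import Callable, List

# One inner list of mini-ensemble sizes per consensus stage.
Stages = List[List[int]]


def make_cost(overhead: float, exponent: int) -> Callable[[int], float]:
    """Hypothetical cost of one consensus process on b clusterings."""
    return lambda b: overhead + float(b) ** exponent


def serial_time(stages: Stages, cost: Callable[[int], float]) -> float:
    """Serial run: every consensus process is executed one after another."""
    return sum(cost(b) for stage in stages for b in stage)


def parallel_time(stages: Stages, cost: Callable[[int], float]) -> float:
    """Fully parallel run: each stage lasts as long as its slowest process."""
    return sum(max(cost(b) for b in stage) for stage in stages)


# Toy architectures over 56 clusterings (illustrative, not the thesis ensembles).
flat = [[56]]                    # one-step consensus
rhca = [[7] * 8, [8]]            # s = 2 stages, mini-ensembles of size b = 7
dhca = [[14] * 4, [2, 2], [2]]   # three stages, one of them with size-14 mini-ensembles

linear = make_cost(100.0, 1)     # assumed linearly scaling combiner
quadratic = make_cost(100.0, 2)  # assumed quadratically scaling combiner

if __name__ == "__main__":
    for name, arch in (("flat", flat), ("RHCA", rhca), ("DHCA", dhca)):
        print(name,
              "serial/linear:", serial_time(arch, linear),
              "parallel/linear:", parallel_time(arch, linear),
              "parallel/quadratic:", parallel_time(arch, quadratic))
```

With these made-up constants, the three-stage layout is cheaper than the two-stage one in serial but more expensive in parallel, and flat consensus only loses once the per-consensus cost grows quadratically. Different constants can reverse any of these orderings.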

Diversity scenario |dfA| = 10

The results corresponding to the experiments conducted in the second diversity scenario<br />

(i.e. cluster ensembles generated by the compilation of the clusterings output by |dfA| = 10<br />

randomly selected clustering algorithms, giving rise to cluster ensembles of size l = 570) are<br />

presented in figure 3.16. In particular, figure 3.16(a) depicts the execution time boxplots<br />

of the serial implementation of hierarchical consensus architectures. The first noticeable<br />

situation is that, in contrast to what was observed in the lowest diversity scenario, the<br />

computationally optimal RHCA variant (s = 2 and b = 20 or b = 285, depending on the<br />

consensus function employed) is faster than its DHCA counterpart (DAR). This is due to the<br />

fact that the rise of the algorithmic diversity factor (from |dfA| = 1 to |dfA| = 10) entails an<br />

increase of the computational cost of one of the DHCA stages that exceeds the increment<br />

of the complexity of the RHCA caused by the same factor. Meanwhile, regarding the<br />

computational efficiency of flat consensus, two opposed behaviours are observed depending<br />

on the consensus function employed: while being faster than any hierarchical architecture<br />

when consensus is built using the CSPA and EAC consensus functions, one-step consensus<br />

is slower when the remaining clustering combiners are employed. Moreover, the differences<br />

between the running times of these consensus architectures are statistically significant at the<br />

5% significance level in all cases.<br />
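
This passage does not say which statistical test backs the 5% significance claims. As a stand-in, the sketch below uses a two-sided permutation test on the difference of mean running times, which is one common way to compare two samples of execution times without distributional assumptions. The function name, the number of permutations and the running times are all hypothetical and only serve to show the mechanics.

```python
import random
from typing import Sequence


def permutation_p_value(a: Sequence[float], b: Sequence[float],
                        n_perm: int = 2000, seed: int = 0) -> float:
    """Two-sided permutation test on the difference of sample means."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        pa, pb = pooled[:len(a)], pooled[len(a):]
        # Count relabelings at least as extreme as the observed difference.
        if abs(sum(pa) / len(pa) - sum(pb) / len(pb)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical running times (seconds) of two consensus architectures.
times_a = [10.0, 11.0, 12.0, 11.0, 10.0, 12.0, 11.0, 10.0]
times_b = [20.0, 21.0, 19.0, 22.0, 20.0, 21.0, 19.0, 20.0]

if __name__ == "__main__":
    p = permutation_p_value(times_a, times_b)
    print("significant at the 5% level:", p < 0.05)
```

Under this stand-in test, a difference would be reported as significant at the 5% level whenever the returned p-value falls below 0.05.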

Figure 3.16(b) presents the results corresponding to the fully parallel implementation<br />

of consensus architectures. In this case, flat consensus is at least four times more computationally<br />

costly than the DHCA and RHCA variants. Moreover, the optimal RHCA<br />
