29.04.2013 Views

TESI DOCTORAL - La Salle


3.2. Random hierarchical consensus architectures

observed (figures 3.5(a) and 3.5(b)), it can be noticed that i) SERTRHCA is again a fairly accurate estimate of SRTRHCA, and ii) for several consensus functions (in fact, all but EAC) there exists at least one RHCA variant that is more computationally efficient than flat consensus. In general, the difference between the running times of the fastest RHCA variant and flat consensus is small, although for the MCLA consensus function the execution of flat consensus (i.e. b = l = 570) is six times as costly as the fastest RHCA variant (the one with b = 20).

Three main conclusions can be drawn at this point. Firstly, increasing the size of the cluster ensemble makes hierarchical consensus architectures a computationally competitive alternative to flat consensus. Secondly, it is necessary to predict accurately which is the fastest RHCA variant (i.e. the specific value of the mini-ensemble size b) so as to obtain significant execution time savings. Thirdly, the computational optimality of a particular RHCA variant depends on the consensus function employed: the value of b that is fastest for one consensus function need not be fastest for another.
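The second conclusion amounts to choosing the value of b that minimises the estimated running time and comparing it with flat consensus (b = l). The following is a minimal sketch of that selection step, not the procedure used in this work: the function name `pick_fastest_variant` and the timing values in the usage note are hypothetical placeholders, not measurements from the experiments.

```python
def pick_fastest_variant(estimated_times, flat_b):
    """Pick the mini-ensemble size b with the lowest estimated running time.

    estimated_times: dict mapping each candidate b to an estimated running
                     time (any consistent unit); must include flat_b.
    flat_b:          the value of b that corresponds to flat consensus (b = l).

    Returns (best_b, ratio), where ratio is the estimated time of flat
    consensus divided by the estimated time of the selected variant.
    """
    if flat_b not in estimated_times:
        raise KeyError("estimated_times must contain an entry for flat consensus")
    # The candidate whose estimated time is smallest.
    best_b = min(estimated_times, key=estimated_times.get)
    # How many times costlier flat consensus is estimated to be.
    ratio = estimated_times[flat_b] / estimated_times[best_b]
    return best_b, ratio
```

For instance, with the arbitrary placeholder estimates `{20: 1.0, 100: 3.0, 570: 6.0}` and `flat_b=570`, the function selects b = 20 and reports a ratio of 6.0. Since the optimum depends on the consensus function, such a selection would have to be repeated per consensus function.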

As regards the estimated and real running times of the fully parallel RHCA implementation, depicted in figures 3.5(c) and 3.5(d), we can conclude that, again, PERTRHCA is a good indicator of the most computationally efficient RHCA variant. Furthermore, notice that the differences between the running times of flat consensus and the optimal RHCA exceed one order of magnitude for most consensus functions, which highlights the appeal of RHCA in computational terms, as well as the need to be able to predict which is the least time-consuming consensus architecture.

Diversity scenario |dfA| = 19

The results corresponding to the third diversity scenario (i.e. cluster ensembles of size l = 1083 using |dfA| = 19 randomly chosen clustering algorithms) are presented in figure 3.6. In this case, the mini-ensemble size sweep is b = {2, 3, 4, 5, 6, 8, 9, 26, 27, 541, 1083}.
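To give a feel for how the mini-ensemble size shapes the hierarchy, the sketch below counts the consensus processes per stage under a simplifying assumption: at every stage the current clusterings are split into groups of at most b, and each group yields one consensus clustering. This grouping rule and the name `hierarchy_shape` are assumptions made for illustration; the exact construction of the RHCA stages used in this work may differ.

```python
import math


def hierarchy_shape(l, b):
    """Number of consensus processes at each stage of a simple hierarchy.

    Assumes (for illustration only) that each stage splits the current
    clusterings into ceil(n / b) groups of at most b members, and that
    each group is merged into a single consensus clustering.
    """
    if l < 1 or b < 2:
        raise ValueError("need l >= 1 and b >= 2")
    runs_per_stage = []
    remaining = l
    while remaining > 1:
        groups = math.ceil(remaining / b)  # consensus runs in this stage
        runs_per_stage.append(groups)
        remaining = groups                 # their outputs feed the next stage
    return runs_per_stage
```

Under this assumption, `hierarchy_shape(1083, 1083)` gives a single stage with one consensus process (flat consensus), `hierarchy_shape(1083, 27)` gives three stages with 41, 2 and 1 processes, and `hierarchy_shape(1083, 2)` gives eleven stages. The number of stages is the quantity that matters in a fully parallel implementation, whereas a serial implementation executes every process in the list one after another; the cost of each process also depends on b and on the consensus function, which this count does not capture.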

As regards the serial implementation of the RHCA, whose estimated and real running times are presented in figures 3.6(a) and 3.6(b), respectively, a few observations must be made. Firstly, notice that the curves in figure 3.6(a) closely resemble the ones in figure 3.6(b), which indicates that SERTRHCA is a notably accurate predictor of SRTRHCA. Again, we would like to highlight that our main interest is that the former is a good predictor of the location of the minima of the latter, a goal which is largely achieved in this case. Secondly, notice the influence of the consensus function employed for conducting the clustering combination on the running time of the RHCA. Whereas most of them yield a similar running time pattern (i.e. they have a more or less pronounced minimum around b = 26 or b = 27), two consensus functions stand out for their particular behaviour: i) when the EAC consensus function is employed, flat consensus is faster than any serial RHCA variant, and ii) when consensus is created by means of the MCLA consensus function, the space complexity requirements of MCLA make flat consensus not executable, as this is the only consensus function (among the ones employed in this work) whose complexity scales quadratically with the cluster ensemble size (see appendix A.5).
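As a rough worked figure for this quadratic growth, assume that the cost is exactly proportional to the square of the ensemble size, ignoring constants and lower-order terms (a simplification introduced here, not a result from appendix A.5). Moving from the previous scenario (l = 570) to this one (l = 1083) then multiplies the requirement of flat consensus by:

```latex
\left(\frac{1083}{570}\right)^{2} = 1.9^{2} = 3.61
```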

If the estimated and real parallel RHCA implementation running times are evaluated (see figures 3.6(c) and 3.6(d)), it can be observed that, whatever the consensus function employed, there always exists at least one parallel RHCA variant which performs more efficiently than flat consensus. Moreover, notice that just like in the previous diversity
