29.04.2013 Views

TESI DOCTORAL - La Salle


C.3. Estimation of the computationally optimal DHCA

Figure C.20 depicts the estimated and real execution times of fully parallel DHCA variants and flat consensus. The patterns presented in both columns of this figure (estimated running times on the left column, real execution times on the right) reveal the same behaviour observed on the previous data sets. That is, all DHCA variants have comparable running times, which would make running time estimation unnecessary as far as the selection of the fastest DHCA variant is concerned. However, this estimation is necessary to decide whether hierarchical consensus is faster than its flat alternative, which occurs in all the diversity scenarios except the lowest diversity one.

C.3.4 Ionosphere data set

In this section, the estimated execution times of DHCA are compared to their real counterparts across four diversity scenarios on the Ionosphere data collection. The cluster ensemble sizes corresponding to these diversity scenarios are l = 97, 970, 1843 and 2716, respectively.

Firstly, figure C.21 depicts the estimated and real execution times of the fully serial implementation of deterministic hierarchical consensus architectures. In this case, SERTDHCA is a fairly good estimator of SRTDHCA, and it constitutes a good basis for predicting the least time-consuming consensus architecture. Notice that, when consensus clusterings are built by means of the MCLA consensus function, flat consensus execution becomes impossible (given the computational resources employed in our experiments, see appendix A.6), so its hierarchical counterpart becomes a feasible alternative. Moreover, if hierarchical consensus is implemented by means of the DHCA variant defined by an ordered list of diversity factors arranged in decreasing cardinality order, notable computation time savings can be obtained.
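
The selection step described above can be illustrated with a minimal sketch. It assumes that an estimated running time is already available for each candidate architecture, and that an architecture that cannot be executed (such as flat consensus with MCLA in this scenario) is marked with `None`. The function name `pick_fastest`, the architecture labels and all the timing values are hypothetical and are not taken from the experiments.

```python
import math


def pick_fastest(estimated_times):
    """Return (name, time) of the feasible architecture with the smallest estimate.

    `estimated_times` maps an architecture label to its estimated running
    time in seconds, or to None when the architecture cannot be executed.
    """
    feasible = {
        name: t
        for name, t in estimated_times.items()
        if t is not None and math.isfinite(t)
    }
    if not feasible:
        raise ValueError("no feasible consensus architecture")
    best = min(feasible, key=feasible.get)
    return best, feasible[best]


# Hypothetical estimates (seconds); flat consensus is marked as not executable.
estimates = {
    "flat": None,
    "DHCA, decreasing cardinality order": 120.0,
    "DHCA, increasing cardinality order": 310.0,
}

best_name, best_time = pick_fastest(estimates)
print(best_name, best_time)
```

Treating infeasible architectures as missing values, rather than assigning them an arbitrarily large time, keeps the comparison restricted to architectures that can actually be run.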

Secondly, as far as the fully parallel DHCA implementation is concerned (see figure C.22), the following observations can be made: i) PERTDHCA is a fairly accurate estimator of PRTDHCA, ii) there are no significant differences between the running times of the distinct DHCA variants, and iii) flat consensus is more computationally costly than its hierarchical counterpart in all but one of the diversity scenarios considered.

C.3.5 WDBC data set

This section analyzes the results corresponding to the WDBC data collection. In this case, the four diversity scenarios correspond to cluster ensembles of size l = 113, 1130, 2147 and 3164, respectively. In the first place, the left and right columns of figure C.23 present the estimated and real running times of the variants of the serial implementation of DHCA on this data set across the four diversity scenarios.

It can be observed that the proposed methodology yields a fairly good estimate of the real running time of the DHCA variants. This allows the user to make well-grounded decisions regarding the most efficient hierarchical consensus architectures. For this data set, flat consensus is the computationally optimal architecture except in the highest diversity scenario (except when the EAC consensus function is employed).
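
One way to check whether an estimate is good enough to support such decisions is sketched below. It assumes that estimated and real running times are available for the same set of architectures. The sketch reports the relative estimation error and tests whether the architecture chosen from the estimates coincides with the one that is really fastest. The helper names and all the timing values are hypothetical and do not reproduce the thesis results.

```python
def relative_error(estimated, real):
    """Relative deviation of an estimated running time from the real one."""
    return abs(estimated - real) / real


def same_choice(estimated_times, real_times):
    """True if the fastest architecture by estimate is also the fastest in reality."""
    best_estimated = min(estimated_times, key=estimated_times.get)
    best_real = min(real_times, key=real_times.get)
    return best_estimated == best_real


# Hypothetical running times in seconds for one diversity scenario.
estimated = {"flat": 40.0, "DHCA variant A": 55.0, "DHCA variant B": 60.0}
real = {"flat": 44.0, "DHCA variant A": 50.0, "DHCA variant B": 66.0}

errors = {name: relative_error(estimated[name], real[name]) for name in real}
print(errors)
print(same_choice(estimated, real))
```

An estimator can show a noticeable relative error and still lead to the right decision, since only the ordering of the architectures matters for selecting the fastest one.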

In the second place, figure C.24 depicts the results corresponding to the parallel implementation of DHCA. The same conclusions drawn for the previous data collections also apply to the WDBC data set. That is, running times are almost independent of the
