29.04.2013 Views

TESI DOCTORAL - La Salle


Appendix C. Experiments on hierarchical consensus architectures

experiment is conducted.

To begin with, figure C.7 presents the estimated and real execution times of several variants of the fully serial implementation of RHCA. Comparing the estimated and real running times shows that the real execution time of the serial RHCA variants can be predicted accurately, which in turn allows the most computationally efficient RHCA variant to be identified precisely, the ultimate goal of the proposed methodology.
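
The selection step described above can be illustrated with a minimal sketch. It assumes that the prediction amounts to choosing the variant with the smallest estimated running time; the variant names, the values of b and every timing figure are hypothetical and are not taken from figure C.7.

```python
def fastest_variant(times):
    """Return the name of the variant with the smallest running time."""
    return min(times, key=times.get)


# Hypothetical running times in seconds (illustrative only, not thesis data).
estimated = {"flat": 12.0, "RHCA b=2": 9.5, "RHCA b=5": 6.1, "RHCA b=10": 7.4}
real = {"flat": 13.2, "RHCA b=2": 10.1, "RHCA b=5": 6.6, "RHCA b=10": 7.0}

predicted = fastest_variant(estimated)   # choice made before running anything
truly_fastest = fastest_variant(real)    # choice that hindsight would make

# Ratio between the real time of the predicted variant and the real time of
# the truly fastest one; 1.0 means the prediction incurred no penalty.
slowdown = real[predicted] / real[truly_fastest]

print(predicted, truly_fastest, slowdown)
```

The slowdown ratio is one possible way of quantifying the cost of a wrong prediction; it is an illustrative choice, not a measure defined in this appendix.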

Figure C.8 depicts the estimated and real running times of the parallel RHCA implementation across the sweep of values of b for all four diversity scenarios. Compared with what is observed in other data sets, PERTRHCA is a better estimator of PRTRHCA in this case. Moreover, as the diversity of the cluster ensemble grows, the computational savings derived from employing the fastest RHCA instead of flat consensus are noteworthy (especially for the HGPA, CSPA, ALSAD and SLSAD consensus functions). Last, note that flat consensus is not executable in three of the four diversity scenarios if consensus is obtained by means of MCLA, due to the large size of the mini-ensembles b.

C.2.5 WDBC data set<br />

In this section, we present the results of estimating the execution times of the serial and parallel RHCA implementations on the WDBC data collection. The four diversity scenarios, generated by employing |dfA| = 1, 10, 19 and 28 clustering algorithms to create the cluster ensembles, contain l = 113, 1130, 2147 and 3164 individual partitions, respectively.
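
The reported ensemble sizes are consistent with each clustering algorithm contributing the same number of partitions. The short sketch below reproduces them under that assumption; the per-algorithm figure of 113 is inferred from the |dfA| = 1 scenario rather than stated in this section, and the constant name is hypothetical.

```python
# Assumption: every clustering algorithm contributes the same number of
# partitions, inferred from l = 113 in the scenario with |dfA| = 1.
PARTITIONS_PER_ALGORITHM = 113

# Number of clustering algorithms (|dfA|) in each of the four diversity scenarios.
algorithm_counts = [1, 10, 19, 28]

# Cluster ensemble size l for each scenario.
ensemble_sizes = [PARTITIONS_PER_ALGORITHM * n for n in algorithm_counts]

print(ensemble_sizes)  # [113, 1130, 2147, 3164]
```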

The estimated and real running times corresponding to the serial implementation of RHCA are depicted in figure C.9. As observed in the remaining data collections, SERTRHCA is a fairly accurate estimator of SRTRHCA, which allows the fastest consensus architecture to be predicted with high precision. As already noted in other collections, RHCA becomes a competitive option as the size of the cluster ensemble grows, except when the EAC consensus function is employed. Moreover, in the most diverse scenarios all consensus architectures are highly costly in terms of execution time, so being able to predict which is the fastest can lead to important computational savings.

As regards the fully parallel implementation of RHCA, the estimated and real running times corresponding to the four aforementioned diversity scenarios are presented in figure C.10. Although the estimation of the real execution time is not as accurate as in the serial case, PERTRHCA is a reasonable predictor of the fastest consensus architecture in most cases.
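
An estimator can be inaccurate in absolute terms and still identify the fastest architecture, since only the ordering of the variants matters for the selection. The following sketch illustrates this distinction with hypothetical timings (not taken from figure C.10) and uses the relative error as an assumed accuracy measure.

```python
def relative_error(estimated_time, real_time):
    """Absolute estimation error expressed as a fraction of the real time."""
    return abs(estimated_time - real_time) / real_time


# Hypothetical parallel running times in seconds (illustrative only).
estimated = {"flat": 40.0, "RHCA b=2": 20.0, "RHCA b=5": 10.0}
real = {"flat": 50.0, "RHCA b=2": 32.0, "RHCA b=5": 16.0}

# Per-variant estimation accuracy: every estimate is off by 20% or more.
errors = {name: relative_error(estimated[name], real[name]) for name in real}

# Selection accuracy: do the estimates and the real times agree on the winner?
same_winner = min(estimated, key=estimated.get) == min(real, key=real.get)

print(errors, same_winner)
```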

C.2.6 Balance data set<br />

This section presents the estimated and real execution times of multiple variants of random hierarchical consensus architectures on the Balance data collection, in both their serial and parallel versions. The low cardinality of the dimensional diversity factor of this data set gives rise to relatively small cluster ensembles in the four diversity scenarios, whose sizes are l = 7, 70, 133 and 196 in this case.

Firstly, figure C.11 depicts the estimated and real running times of the serial RHCA implementation in the four diversity scenarios. As already observed in the previous data
