29.04.2013 Views

TESI DOCTORAL - La Salle


C.2. Estimation of the computationally optimal RHCA

richer), which boosts the size of the cluster ensemble.

Firstly, figure C.3 depicts the results of the fully serial RHCA implementation across the four diversity scenarios (estimated running times in the left column, real running times in the right). The first remarkable fact is that SERT_RHCA is a fairly accurate predictor of SRT_RHCA, as can be verified by comparing the pair of subfigures on each row of figure C.3. Again, RHCA becomes more computationally attractive as the size of the cluster ensemble increases (except for the EAC consensus function). Moreover, among the distinct RHCA variants executed in each experiment, the greatest efficiency is achieved by those with 2 or 3 stages.

The estimated and real execution times of the parallel implementation of RHCA are depicted in figure C.4. As already observed in the previous data sets, PERT_RHCA is only a modestly accurate estimator of PRT_RHCA, although it is a fairly good predictor of the most computationally efficient consensus architecture. Notice that, in the most diverse scenario (|df_A| = 28), the least time-consuming RHCA variant is nearly two orders of magnitude faster than flat consensus. Thus, it can be argued that being able to predict which RHCA configuration requires the least computation time constitutes a significant advantage over the traditional one-step approach to consensus clustering.
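
The selection strategy described above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the function names, the variant labels and all timings are hypothetical placeholders, not measured values.

```python
import math


def pick_fastest(estimated_times):
    """Return the label of the variant with the smallest estimated running time."""
    return min(estimated_times, key=estimated_times.get)


def orders_of_magnitude_faster(t_reference, t_candidate):
    """Base-10 logarithm of the speed-up of the candidate over the reference."""
    return math.log10(t_reference / t_candidate)


# Hypothetical estimated running times in seconds (placeholders, not thesis data).
estimated = {"flat": 900.0, "RHCA s=2": 12.0, "RHCA s=3": 10.0, "RHCA s=4": 25.0}
# Hypothetical real running times in seconds (placeholders, not thesis data).
real = {"flat": 1000.0, "RHCA s=2": 14.0, "RHCA s=3": 10.0, "RHCA s=4": 30.0}

# Choose the architecture using the estimates only, then evaluate the choice
# against the real running time of flat consensus.
chosen = pick_fastest(estimated)
gain = orders_of_magnitude_faster(real["flat"], real[chosen])
print(chosen, gain)
```

With these placeholder values the chosen variant is the three-stage one, and its real running time is two orders of magnitude below that of flat consensus.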

C.2.3 Glass data set

This section presents the results of estimating the execution times of the fully serial and parallel implementations of RHCA in the four diversity scenarios for the Glass data set, which give rise to cluster ensembles of sizes l = 29, 290, 551 and 812, respectively.
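
These four sizes are consistent with every clustering algorithm contributing 29 partitions on this data set, so that l = 29 · |df_A|. The short check below makes that arithmetic explicit. Only |df_A| = 28 is stated in this section; the remaining scenario values (1, 10 and 19) are inferred from the ensemble sizes, and all names are illustrative.

```python
# Inferred from the smallest ensemble size reported for the Glass data set.
PARTITIONS_PER_ALGORITHM = 29
# Assumed numbers of clustering algorithms |df_A| in the four diversity scenarios.
ALGORITHMS_PER_SCENARIO = [1, 10, 19, 28]


def ensemble_sizes(per_algorithm, scenarios):
    """Cluster ensemble size l for each diversity scenario."""
    return [per_algorithm * n_algorithms for n_algorithms in scenarios]


glass_sizes = ensemble_sizes(PARTITIONS_PER_ALGORITHM, ALGORITHMS_PER_SCENARIO)
print(glass_sizes)  # [29, 290, 551, 812]
```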

Firstly, figure C.5 depicts both the estimated and real running times of several serial RHCA variants. These results are comparable to those obtained on the previous data collections. That is, except for the EAC consensus function, the RHCA variants with s = 2 and s = 3 stages become the most computationally efficient as the size of the cluster ensemble increases. Moreover, in the most diverse scenario (|df_A| = 28), flat consensus is not executable if the MCLA consensus function is employed as the clustering combiner, whereas hierarchical consensus does provide a means of obtaining a consolidated clustering solution from the same cluster ensemble using this consensus function. Furthermore, notice that the proposed methodology for estimating the running time of serial RHCA yields fairly reliable predictions of its real execution time.

Secondly, the results corresponding to the parallel implementation of RHCA are presented in figure C.6. Again, it can be observed that the estimated running time of the parallel RHCA is only a modestly accurate approximation of the real execution time. However, this lack of accuracy is tolerable inasmuch as i) the location of the minima of PERT_RHCA mostly coincides with that of the minima of PRT_RHCA, which means that the fastest consensus architecture is successfully predicted, and ii) the selection of a computationally suboptimal RHCA variant involves only a light penalty in terms of real execution time.
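
Conditions i) and ii) can be checked numerically as in the following sketch. It assumes that the estimated and real running times of each variant are available as dictionaries keyed by the number of stages s. The function name and the timings are hypothetical placeholders, not results from the experiments.

```python
def selection_report(estimated, real):
    """Compare the variant chosen from the estimates with the truly fastest one.

    Returns (predicted, optimal, penalty), where penalty is the relative excess
    of real running time incurred by trusting the estimates.
    """
    predicted = min(estimated, key=estimated.get)  # minimum of the estimated times
    optimal = min(real, key=real.get)              # minimum of the real times
    penalty = (real[predicted] - real[optimal]) / real[optimal]
    return predicted, optimal, penalty


# Hypothetical running times in seconds, keyed by number of stages s.
estimated = {2: 8.0, 3: 9.5, 4: 15.0}
real = {2: 11.0, 3: 10.0, 4: 18.0}

predicted, optimal, penalty = selection_report(estimated, real)
print(predicted, optimal, round(penalty, 3))
```

In this placeholder case the minima do not coincide (s = 2 is predicted, s = 3 is optimal), yet the misprediction costs only 10% of additional real running time, which is the situation described in ii).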

C.2.4 Ionosphere data set

This section describes the results of the minimum complexity RHCA variant selection based on running time estimation. In the case of the Ionosphere data collection, cluster ensembles of sizes l = 97, 970, 1843 and 2716 correspond to the four diversity scenarios where this
