29.04.2013 Views

TESI DOCTORAL - La Salle


C.4. Computationally optimal RHCA, DHCA and flat consensus comparison

Running time comparison

The miniNG data collection is one of those cases in which the cardinality of the diversity factors used to generate the cluster ensemble, together with the number of objects the collection contains, makes flat consensus non-executable (for all but the EAC consensus function) in the scenarios where the cluster ensemble is relatively large. In this situation, hierarchical consensus architectures make consensus clustering feasible.
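
As a rough illustration of why a hierarchical architecture can remain executable where flat consensus is not, the sketch below counts the consensus processes run at each stage when an ensemble of l clusterings is combined in groups of at most b clusterings. The function name, the parameter names and the grouping rule (ceiling division at every stage) are assumptions made for illustration only; they are not taken from the RHCA or DHCA definitions used in this work.

```python
import math


def consensus_runs_per_stage(ensemble_size, mini_ensemble_size):
    """Count the consensus processes at each stage of a hierarchical architecture.

    Assumption (hypothetical simplification): every stage splits its input
    clusterings into groups of at most `mini_ensemble_size` and replaces each
    group by a single consensus clustering, until one clustering remains.
    """
    if mini_ensemble_size < 2:
        raise ValueError("mini_ensemble_size must be at least 2")
    runs = []
    remaining = ensemble_size
    while remaining > 1:
        groups = math.ceil(remaining / mini_ensemble_size)
        runs.append(groups)      # one consensus process per group
        remaining = groups       # each group yields one clustering
    return runs


if __name__ == "__main__":
    print(consensus_runs_per_stage(1000, 10))
```

Under these assumptions, an ensemble of 1000 clusterings combined in groups of 10 gives `[100, 10, 1]`: no single consensus process ever combines more than 10 clusterings, whereas flat consensus would have to combine all 1000 at once.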

As regards the serial implementation of RHCA and DHCA (figure C.50), the former tends to be faster than the latter, except when the HGPA and MCLA consensus functions are employed. The same relative performance of the two architectures is observed in the parallel implementation, presented in figure C.51.

Consensus quality comparison

The quality of the consensus clustering solutions output by the flat and hierarchical consensus architectures can be analyzed using the φ (NMI) boxplot charts depicted in figure C.52. One remark regarding the performance of the distinct consensus functions: the CSPA, ALSAD and KMSAD based consensus solutions are the best in terms of quality. Finally, the φ (NMI) values of the consensus clusterings output by the two hierarchical consensus architectures (the only ones able to operate across all the diversity scenarios) are very similar in most cases.

C.4.9 Segmentation data set

This section compares flat consensus with the computationally optimal consensus architectures in terms of CPU execution time and of the normalized mutual information between the ground truth and the consensus clustering solution each of them yields. On the Segmentation data collection, the cluster ensemble sizes corresponding to the four diversity scenarios are l = 52, 520, 988 and 1456.
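
For readers who want to reproduce the quality measure, the following is a minimal sketch of normalized mutual information between a ground-truth labeling and a consensus clustering. It assumes the normalization by the geometric mean of the two entropies; the exact definition of φ (NMI) used in this work should be checked against the main text. The function name and the convention of returning 0 when either labeling has a single cluster are illustrative choices.

```python
import math
from collections import Counter


def nmi(labels_a, labels_b):
    """Normalized mutual information between two labelings of the same objects.

    Assumption: normalization by sqrt(H(a) * H(b)); returns 0.0 when either
    labeling has zero entropy (a single cluster), by convention.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same objects")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))

    # Mutual information from the contingency counts.
    mutual_info = sum(
        (n_ab / n) * math.log(n * n_ab / (count_a[a] * count_b[b]))
        for (a, b), n_ab in count_ab.items()
    )
    # Entropy of each labeling.
    entropy_a = -sum((c / n) * math.log(c / n) for c in count_a.values())
    entropy_b = -sum((c / n) * math.log(c / n) for c in count_b.values())

    if entropy_a == 0.0 or entropy_b == 0.0:
        return 0.0
    return mutual_info / math.sqrt(entropy_a * entropy_b)


if __name__ == "__main__":
    ground_truth = [0, 0, 1, 1]
    consensus = [1, 1, 0, 0]   # same partition, different label names
    print(nmi(ground_truth, consensus))
```

The measure is invariant to how clusters are named, so a consensus clustering that reproduces the ground-truth partition scores 1 even if its labels are permuted, and a partition statistically independent of the ground truth scores 0.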

Running time comparison

Figure C.53 presents the execution times of the flat consensus architecture and of the estimated computationally optimal serial random and deterministic hierarchical consensus architectures. In this case, flat consensus is faster than RHCA and DHCA regardless of the cluster ensemble size (in our range of observation), except when the HGPA and MCLA consensus functions are employed; in fact, MCLA-based flat consensus is unfeasible in the two largest diversity scenarios. Moreover, the relative speed of RHCA and DHCA depends on the consensus function employed: RHCA is faster than DHCA if consensus is based on CSPA, EAC, ALSAD or SLSAD, while the opposite behaviour is observed when the HGPA, MCLA or KMSAD consensus functions are used.

Very similar results are obtained when the running times of the fully parallel implementations of RHCA and DHCA are analyzed, as figure C.53 reveals. The main difference with respect to what has just been reported is the expected speed-up of the hierarchical consensus architectures, which makes them faster than flat consensus in the highest diversity scenario.
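
The speed-up of the parallel implementation can be made concrete with a small sketch. It assumes that a serial implementation runs every consensus process one after another, while a fully parallel implementation runs all the processes of a stage simultaneously, so that a stage lasts as long as its slowest process. The function names and the timing figures are hypothetical; they are not measurements from this work.

```python
def serial_time(stage_times):
    """Total time when all consensus processes run one after another."""
    return sum(sum(stage) for stage in stage_times)


def parallel_time(stage_times):
    """Total time when the processes of each stage run simultaneously.

    Assumption: one processor per consensus process, so each stage costs
    the time of its slowest process; stages still run in sequence.
    """
    return sum(max(stage) for stage in stage_times if stage)


if __name__ == "__main__":
    # Hypothetical per-process running times (seconds), grouped by stage.
    stage_times = [[2.0, 3.0, 1.0], [4.0]]
    print(serial_time(stage_times), parallel_time(stage_times))
```

With these illustrative figures the serial total is 10.0 seconds and the parallel total is 7.0 seconds; the more consensus processes a stage contains, the larger the gap becomes, which is consistent with the hierarchical architectures overtaking flat consensus only in the largest ensembles.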
