29.04.2013 Views

TESI DOCTORAL - La Salle


$$\lambda_c = \begin{bmatrix} 1 & 1 & 1 & 3 & 3 & 3 & 1 & 2 & 2 \end{bmatrix} \tag{6.44}$$

Notice that the fuzzy consensus clusterings Λc output by BordaConsensus and CondorcetConsensus (equations (6.40) and (6.43)) differ notably from those obtained by SumConsensus and ProductConsensus (equations (6.32) and (6.36)); see the double and triple ties obtained in the clustering of the third and seventh objects, which are due to the intrinsic differences between the voting strategies applied. Moreover, notice that the two positional voting based consensus functions (BC and CC) yield structurally similar fuzzy consensus clusterings Λc, although their contents differ slightly. However, their hardened versions λc (equations (6.41) and (6.44)) differ to a larger extent, owing to the random way in which ties are broken.
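The effect of random tie-breaking on the hardened labelings can be illustrated with a minimal Python sketch. It assumes that a soft consensus clustering is stored as a matrix with one row per cluster and one column per object; the function name `harden` and the example matrix are hypothetical and are not taken from equations (6.40) to (6.44).

```python
import random


def harden(soft, rng=None):
    """Convert a soft clustering (k rows = clusters, n columns = objects)
    into a list of n crisp, 1-based cluster labels.

    Each object goes to the cluster with the highest association value;
    ties for the maximum are broken uniformly at random.
    """
    if rng is None:
        rng = random.Random()
    k, n = len(soft), len(soft[0])
    labels = []
    for i in range(n):
        column = [soft[j][i] for j in range(k)]
        best = max(column)
        tied = [j + 1 for j, value in enumerate(column) if value == best]
        labels.append(rng.choice(tied))
    return labels


# Hypothetical 3-cluster, 4-object matrix (assumption, for illustration only).
# Object 2 has a double tie, object 4 a triple tie.
soft = [
    [0.7, 0.4, 0.2, 1 / 3],
    [0.2, 0.4, 0.3, 1 / 3],
    [0.1, 0.2, 0.5, 1 / 3],
]

if __name__ == "__main__":
    # Different seeds may resolve the ties differently.
    for seed in (0, 1, 2):
        print(seed, harden(soft, random.Random(seed)))
```

Objects without ties always receive the same label, whereas tied objects may change cluster from one run to the next, which is why two structurally similar soft clusterings can harden into visibly different labelings.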

6.4 Experiments

This section presents several consensus clustering experiments evaluating the consensus functions for soft cluster ensembles proposed in the previous section. These experiments are conducted according to the following design.

– What do we want to measure? We are interested in comparing both the quality of the consensus clustering solutions obtained and the time complexity of the proposed consensus functions.

– How do we measure it? As regards time complexity, all consensus processes follow a flat architecture (i.e. one-step consensus), and we measure the CPU time required for their execution, using the computational resources described in appendix A.6. As regards the quality of the consensus clustering results, although the proposed consensus functions output fuzzy consensus clustering solutions, we compare their hardened versions with the ground truth of each data set in terms of normalized mutual information, φ(NMI). The reason for this is twofold. Firstly, a soft ground truth is not available for these data sets, so fuzzy consensus clusterings cannot be evaluated directly. Secondly, given that the CSPA, HGPA, MCLA and EAC consensus functions output hard consensus clustering solutions, a fair comparison between consensus functions requires converting the soft consensus clustering matrices Λc output by VMA, BC, CC, PC and SC into crisp consensus labelings λc (recall that this simply boils down to assigning each object to the cluster with which it is most strongly associated).

– How are the experiments designed? In each consensus clustering experiment we have applied our four voting-based consensus functions, SumConsensus (SC), ProductConsensus (PC), BordaConsensus (BC) and CondorcetConsensus (CC), together with the fuzzy versions of CSPA, EAC, HGPA and MCLA (see section 6.2) and one of the pioneering soft consensus functions, namely VMA (Voting Merging Algorithm) (Dimitriadou, Weingessel, and Hornik, 2002); see appendix A.5 for a description. Experiments have been conducted on the twelve unimodal data collections employed in this work, which are described in appendix A.2.1. As regards the creation of the soft cluster ensemble components, we have employed the fuzzy c-means and the k-means clustering algorithms. Whereas the former is fuzzy by nature, the latter is not.
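The measurement procedure described in the design above (CPU time of a one-step consensus, followed by an NMI comparison of the hardened labeling against the ground truth) can be sketched in Python as follows. This is an illustrative sketch, not the evaluation code used in this work: it assumes the common geometric-mean normalisation, I(A;B) / sqrt(H(A) H(B)), which may differ from the exact definition of φ(NMI) adopted here; the names `nmi` and `timed` and the example labelings are hypothetical.

```python
import math
import time
from collections import Counter


def nmi(labels_a, labels_b):
    """Normalized mutual information between two crisp labelings.

    Assumed normalisation: I(A;B) / sqrt(H(A) * H(B)).
    Returns 0.0 by convention when either labeling has a single cluster.
    """
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("labelings must be non-empty and of equal length")
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    h_a = -sum(c / n * math.log(c / n) for c in count_a.values())
    h_b = -sum(c / n * math.log(c / n) for c in count_b.values())
    if h_a == 0.0 or h_b == 0.0:
        return 0.0
    mi = sum(
        c / n * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in count_ab.items()
    )
    return mi / math.sqrt(h_a * h_b)


def timed(fn, *args):
    """Run fn(*args) and return (result, CPU seconds consumed)."""
    start = time.process_time()
    result = fn(*args)
    return result, time.process_time() - start


if __name__ == "__main__":
    # Hypothetical ground truth and hardened consensus labeling.
    ground_truth = [1, 1, 1, 2, 2, 2, 3, 3, 3]
    consensus = [2, 2, 2, 3, 3, 1, 1, 1, 1]
    score, cpu_seconds = timed(nmi, ground_truth, consensus)
    print(f"NMI = {score:.3f}, CPU time = {cpu_seconds:.6f} s")
```

NMI is invariant to a relabeling of the clusters, so the hardened consensus labeling does not need to use the same cluster identifiers as the ground truth. `time.process_time` reports CPU time rather than wall-clock time, which matches the quantity measured in these experiments.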
