29.04.2013 Views

TESI DOCTORAL - La Salle


6.3. Voting based consensus functions

The contents of $S_{\Lambda_1,\Lambda_2}$ are interpreted as in the crisp scenario: its $(i,j)$th element is proportional to the similarity between the $i$th cluster of $\Lambda_1$ and the $j$th cluster of $\Lambda_2$. Again, transforming $S_{\Lambda_1,\Lambda_2}$ into a cluster dissimilarity matrix is the final step before solving the weighted bipartite matching problem with the Hungarian method implementation of (Buehren, 2008), which yields the cluster correspondence vector $\pi_{\Lambda_1,\Lambda_2}$ of equation (6.27). Notice that this is exactly the permutation vector of equation (6.19), as the present toy example is the fuzzy version of the former.

$$\pi_{\Lambda_1,\Lambda_2} = [3 \;\; 1 \;\; 2] \tag{6.27}$$
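The matching step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis implementation. The similarity matrix is assumed to be the product $\Lambda_1 \Lambda_2^{T}$, and the dissimilarity is assumed to be the largest similarity minus each entry; neither definition is given in this passage. An exhaustive search over permutations stands in for the Hungarian method (it reaches the same optimum and is affordable for $k = 3$). The matrix `lambda_1` is derived by undoing the permutation on the aligned partition of equation (6.28), and all function names are hypothetical.

```python
from itertools import permutations

# Fuzzy partitions as k x n lists (rows = clusters, columns = objects).
# lambda_1 is NOT quoted from the source: it is derived by undoing the
# permutation [3 1 2] on the aligned partition of equation (6.28).
lambda_1 = [
    [0.025, 0.042, 0.038, 0.006, 0.005, 0.011, 0.976, 0.929, 0.972],
    [0.921, 0.932, 0.905, 0.025, 0.019, 0.030, 0.014, 0.055, 0.017],
    [0.054, 0.026, 0.057, 0.969, 0.976, 0.959, 0.009, 0.016, 0.010],
]
lambda_2 = [
    [0.932, 0.921, 0.019, 0.030, 0.014, 0.025, 0.057, 0.017, 0.055],
    [0.026, 0.054, 0.976, 0.959, 0.976, 0.969, 0.905, 0.010, 0.016],
    [0.042, 0.025, 0.005, 0.011, 0.009, 0.006, 0.038, 0.972, 0.929],
]


def similarity_matrix(part_a, part_b):
    """Assumed similarity: S[i][j] = <row i of part_a, row j of part_b>."""
    return [[sum(a * b for a, b in zip(row_a, row_b)) for row_b in part_b]
            for row_a in part_a]


def correspondence_vector(sim):
    """Return the 1-based vector pi minimising the total dissimilarity.

    Brute force over all k! assignments; a stand-in for the Hungarian
    method, which solves the same problem in polynomial time.
    """
    k = len(sim)
    s_max = max(max(row) for row in sim)
    # Assumed similarity-to-dissimilarity transformation.
    dis = [[s_max - s for s in row] for row in sim]
    best = min(permutations(range(k)),
               key=lambda p: sum(dis[i][p[i]] for i in range(k)))
    return [j + 1 for j in best]


pi = correspondence_vector(similarity_matrix(lambda_1, lambda_2))
print(pi)  # [3, 1, 2]
```

For larger $k$ the exhaustive search becomes impractical; a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` could replace it without changing the rest of the sketch.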

Although the cluster correspondence vector is interpreted identically in the hard and the soft clustering scenarios (i.e. the cluster with identifier ‘1’ in $\Lambda_1$ corresponds to cluster ‘3’ of $\Lambda_2$, and so on), recall that, in the fuzzy case, cluster permutations are equivalent to row order rearrangements.

Consequently, to obtain the cluster permuted version of the fuzzy partition $\Lambda_1$, the transpose of the cluster permutation matrix $P_{\Lambda_1,\Lambda_2}$ associated with the cluster correspondence vector $\pi_{\Lambda_1,\Lambda_2}$ must be multiplied by the fuzzy partition $\Lambda_1$ itself, i.e. $\Lambda_1^{\pi_{\Lambda_1,\Lambda_2}} = P_{\Lambda_1,\Lambda_2}^{T}\,\Lambda_1$. The result is the cluster permuted soft clustering $\Lambda_1^{\pi_{\Lambda_1,\Lambda_2}}$; see equation (6.28) for the pair of cluster aligned fuzzy clustering solutions of our toy example.

$$
\Lambda_1^{\pi_{\Lambda_1,\Lambda_2}} =
\begin{pmatrix}
0.921 & 0.932 & 0.905 & 0.025 & 0.019 & 0.030 & 0.014 & 0.055 & 0.017 \\
0.054 & 0.026 & 0.057 & 0.969 & 0.976 & 0.959 & 0.009 & 0.016 & 0.010 \\
0.025 & 0.042 & 0.038 & 0.006 & 0.005 & 0.011 & 0.976 & 0.929 & 0.972
\end{pmatrix}
$$

$$
\Lambda_2 =
\begin{pmatrix}
0.932 & 0.921 & 0.019 & 0.030 & 0.014 & 0.025 & 0.057 & 0.017 & 0.055 \\
0.026 & 0.054 & 0.976 & 0.959 & 0.976 & 0.969 & 0.905 & 0.010 & 0.016 \\
0.042 & 0.025 & 0.005 & 0.011 & 0.009 & 0.006 & 0.038 & 0.972 & 0.929
\end{pmatrix}
\tag{6.28}
$$
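The row rearrangement described above can be checked numerically with a short Python sketch. It is written under assumptions rather than taken from the thesis: the permutation matrix is assumed to have a 1 in row $i$, column $\pi_i$ (so that cluster $i$ of $\Lambda_1$ moves to row $\pi_i$), the helper names are hypothetical, and the input holds only the columns of objects 1, 4 and 7 of a $\Lambda_1$ obtained by undoing the permutation on equation (6.28).

```python
def permutation_matrix(pi):
    """Assumed convention: P[i][j] = 1 iff cluster i maps to cluster pi[i] (1-based)."""
    k = len(pi)
    return [[1 if pi[i] - 1 == j else 0 for j in range(k)] for i in range(k)]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def align(partition, pi):
    """Cluster permuted partition: transpose(P) times the k x n partition."""
    return matmul(transpose(permutation_matrix(pi)), partition)


# Columns of objects 1, 4 and 7 only; values derived from equation (6.28)
# by undoing the permutation, not quoted from the source.
lambda_1_subset = [
    [0.025, 0.006, 0.976],
    [0.921, 0.025, 0.014],
    [0.054, 0.969, 0.009],
]

aligned = align(lambda_1_subset, [3, 1, 2])
for row in aligned:
    print(row)
```

With this convention, row 1 of the input lands in row 3 of the output, row 2 in row 1 and row 3 in row 2, which agrees with the corresponding columns of the aligned partition in equation (6.28).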

Given a cluster ensemble $E$ containing $l$ soft clustering solutions, the cluster disambiguation process consists of taking one of them as a reference and applying the Hungarian method sequentially to each of the remaining $l-1$ clustering solutions (Topchy et al., 2004). The result is a cluster aligned version of the cluster ensemble, on which voting procedures can be readily applied.
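The disambiguation of a whole ensemble can be sketched as a loop over its members. The following Python fragment is a minimal illustration under assumptions: similarity is again taken to be the dot product between membership rows, an exhaustive search replaces the Hungarian method (maximising total similarity is equivalent to minimising total dissimilarity), and the three small partitions and all function names are hypothetical.

```python
from itertools import permutations


def best_permutation(reference, partition):
    """0-based tuple p: cluster i of `partition` matches cluster p[i] of `reference`."""
    k = len(partition)
    sim = [[sum(a * b for a, b in zip(row_p, row_r)) for row_r in reference]
           for row_p in partition]
    return max(permutations(range(k)),
               key=lambda p: sum(sim[i][p[i]] for i in range(k)))


def align_ensemble(ensemble, ref_index=0):
    """Reorder the rows of every partition to match the reference one."""
    reference = ensemble[ref_index]
    aligned = []
    for idx, partition in enumerate(ensemble):
        if idx == ref_index:
            aligned.append([row[:] for row in partition])
            continue
        p = best_permutation(reference, partition)
        reordered = [None] * len(partition)
        for i, target in enumerate(p):
            reordered[target] = partition[i][:]
        aligned.append(reordered)
    return aligned


def hard_labels(partition):
    """Index of the highest membership row for each object (column)."""
    return [max(range(len(partition)), key=lambda c: partition[c][obj])
            for obj in range(len(partition[0]))]


# Hypothetical ensemble: three 3 x 3 fuzzy partitions (rows = clusters)
# whose cluster identifiers are shuffled with respect to one another.
ensemble = [
    [[0.90, 0.10, 0.10], [0.05, 0.80, 0.10], [0.05, 0.10, 0.80]],
    [[0.10, 0.70, 0.20], [0.10, 0.20, 0.70], [0.80, 0.10, 0.10]],
    [[0.00, 0.10, 0.90], [0.90, 0.10, 0.00], [0.10, 0.80, 0.10]],
]

aligned_ensemble = align_ensemble(ensemble)
print([hard_labels(p) for p in aligned_ensemble])
```

After alignment, every partition assigns its highest membership for a given object to the same row, so the memberships can be combined position by position.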

6.3.2 Voting strategies

Once the correspondence between the $k$ clusters of each of the $l$ soft clustering solutions compiled in the cluster ensemble $E$ has been resolved and the corresponding cluster permutations have been applied, the consensus clustering solution can be derived from $E$, a task we tackle by means of voting procedures. In this section, we describe four voting methods, which give rise to as many consensus functions.

Before proceeding to their description, recall that the scalar elements of a soft cluster ensemble $E$ are considered, from a voting standpoint, as the expression of the degree of preference of each voter (i.e. clusterer) for each candidate (cluster) in the present election (clusterization of an object). The result of the election (i.e. the consolidated clusterization of the object under consideration based upon the decisions of the $l$ clusterers comprised
