29.04.2013 Views

TESI DOCTORAL - La Salle

TESI DOCTORAL - La Salle

TESI DOCTORAL - La Salle

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Chapter 6. Voting based consensus functions for soft cluster ensembles<br />

Input: Soft cluster ensemble E containing l fuzzy clusterings Λi (∀i =1...l)<br />

Output: Condorcet voting matrix CE<br />

Data: k clusters, n objects<br />

Hungarian (E)<br />

for b =1...n do<br />

M = 0k×k for a =1...l do<br />

r = Rank (λab);<br />

for c =1...k do<br />

M (r(c), r(c +1÷ k)) = M (r(c), r(c +1÷ k)) + 1<br />

end<br />

end<br />

for c =1...k do<br />

CE (c, b) =Count M(c, 1 ÷ k) ≥ l<br />

<br />

2<br />

end<br />

end<br />

Algorithm 6.4: Symbolic description of the soft consensus function CondorcetConsensus.<br />

Hungarian and Rank are symbolic representations of the cluster disambiguation and cluster<br />

ordering procedures, respectively, while the vector λab represents the bth column of<br />

the ath cluster ensemble component Λa, r is a clusters ranking vector and 0 k×k represents<br />

a k × k zero matrix.<br />

values contained in λab are directly or inversely proportional to the strength of association<br />

between objects and clusters. In each election, the (i,j)th entry of the square<br />

matrix M (usually referred to as the Condorcet sum matrix) counts the number of<br />

times the ith cluster is preferred over the jth one. The Count procedure is used for<br />

counting the number of elements of the cth row of matrix M that are greater or equal<br />

than l<br />

2 , which means that at least half of the voters preferred one cluster over another.<br />

Resorting again to our toy example, equation (6.42) presents the Condorcet score<br />

matrix resulting from applying CondorcetConsensus on it.<br />

⎛<br />

2 2 2 1 1 1 2 1<br />

⎞<br />

1<br />

CE = ⎝1<br />

1 0 0 0 0 2 2 2⎠<br />

(6.42)<br />

1 1 2 2 2 2 2 0 0<br />

Again, dividing each column of matrix CE by its L1-norm transforms the Condorcet<br />

score matrix into a soft consensus clustering solution Λc, whose(i,j)th entry represents<br />

the probability that the jth object belongs to the ith cluster (see equation (6.43) for<br />

the fuzzy consensus clustering solution obtained by CondorcetConsensus on our toy<br />

example). And assigning each object to the cluster it is most strongly associated to<br />

–breaking ties randomly– yields the crisp consensus clustering of equation (6.44).<br />

⎛<br />

0.500 0.500 0.500 0.333 0.333 0.333 0.333 0.333<br />

⎞<br />

0.333<br />

Λc = ⎝0.250<br />

0.250 0 0 0 0 0.333 0.667 0.667⎠<br />

(6.43)<br />

0.250 0.250 0.500 0.667 0.667 0.667 0.333 0 0<br />

183

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!