TESI DOCTORAL - La Salle

2.2. Related work on consensus functions

dissociation matrix, and the consensus clustering solution is defined as the one minimizing the disagreements with respect to the individual partitions contained in the cluster ensemble. In particular, three consensus functions based on correlation clustering were presented in this work, which are briefly described next. Firstly, the AGGLOMERATIVE consensus function results from applying a standard bottom-up procedure for correlation clustering on cluster ensembles. Resorting to the graph view of the pairwise object distance matrix, the AGGLOMERATIVE algorithm follows an iterative merging process that joins objects into clusters depending on whether their average distance is below a predefined threshold, stopping when further cluster merging does not reduce the number of disagreements of the consensus solution with respect to the cluster ensemble. Secondly, the FURTHEST consensus function can be regarded as the converse of AGGLOMERATIVE, as it consists of a top-down procedure that iteratively separates maximally distant graph vertices into consensus clusters, assigning the remaining objects to the cluster that minimizes the overall number of disagreements. This process is stopped when no disagreement reduction is achieved from additional cluster splitting. Thirdly, the LOCALSEARCH algorithm is derived from the application of a local search correlation clustering heuristic, based on a greedy procedure that, starting from a specific (possibly random) partition of the graph, tries to minimize the number of disagreements resulting from moving objects to different clusters or creating new singleton clusters, stopping when no move can decrease the disagreement rate. Interestingly, the authors point out that, despite its high computational cost, the LOCALSEARCH algorithm can be employed as a post-processing step for refining a previously obtained consensus clustering solution.
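As an illustration of the kind of procedure involved, the following is a minimal sketch of a LOCALSEARCH-style heuristic. It assumes partitions are encoded as lists of integer cluster labels and that disagreements are counted in the usual pairwise (Mirkin-style) fashion; all function and variable names are illustrative, not taken from the cited work.

```python
from itertools import combinations

def disagreements(labels, ensemble):
    """Number of object pairs on which the candidate consensus `labels`
    and the ensemble partitions disagree (co-clustered in one, separated
    in the other)."""
    total = 0
    for i, j in combinations(range(len(labels)), 2):
        together = labels[i] == labels[j]
        for partition in ensemble:
            if together != (partition[i] == partition[j]):
                total += 1
    return total

def local_search(labels, ensemble):
    """Greedy local search: repeatedly try moving each object to another
    existing cluster or to a new singleton cluster, keeping any move that
    reduces the disagreement count, and stop when no move improves it."""
    labels = list(labels)
    best = disagreements(labels, ensemble)
    improved = True
    while improved:
        improved = False
        for i in range(len(labels)):
            original = labels[i]
            # Candidate targets: every existing cluster plus a fresh singleton.
            candidates = set(labels) | {max(labels) + 1}
            for c in candidates:
                if c == original:
                    continue
                labels[i] = c
                cost = disagreements(labels, ensemble)
                if cost < best:
                    best = cost
                    original = c
                    improved = True
            labels[i] = original  # keep the best label found for object i
    return labels, best
```

Starting this procedure from a previously obtained consensus partition, rather than a random one, corresponds to its use as a refinement post-processing step.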

2.2.10 Consensus functions based on search techniques

In (Goder and Filkov, 2008), two consensus functions based on search techniques were introduced. Their rationale consists of building the consensus clustering solution by means of a greedy search process aiming to minimize the cost function; the authors implement such search processes either by means of Simulated Annealing (SA), as in (Filkov and Skiena, 2004), or by successive single object movements that guarantee the largest decrease of the cost function (Best One Element Moves, or BOEM).
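A BOEM-style search can be sketched as follows. This is only an illustrative reading of the idea: at each step, every single-object relabelling is evaluated and the one yielding the largest cost decrease is applied. The cost used here is the pairwise disagreement count, and the candidate set (existing clusters plus a new singleton) is an assumption of this sketch, not a detail taken from the cited papers.

```python
from itertools import combinations

def mirkin_cost(labels, ensemble):
    """Total pairwise co-clustering disagreements between the candidate
    consensus `labels` and the ensemble partitions."""
    return sum(
        (labels[i] == labels[j]) != (p[i] == p[j])
        for i, j in combinations(range(len(labels)), 2)
        for p in ensemble
    )

def boem(labels, ensemble, cost=mirkin_cost):
    """Best One Element Moves: at each iteration, apply the single-object
    move with the largest cost decrease; stop when no move improves."""
    labels = list(labels)
    current = cost(labels, ensemble)
    while True:
        best_move, best_cost = None, current
        for i in range(len(labels)):
            saved = labels[i]
            for c in set(labels) | {max(labels) + 1}:
                if c == saved:
                    continue
                labels[i] = c
                candidate = cost(labels, ensemble)
                if candidate < best_cost:
                    best_move, best_cost = (i, c), candidate
            labels[i] = saved  # undo the trial move
        if best_move is None:
            return labels, current
        i, c = best_move
        labels[i] = c
        current = best_cost
```

An SA variant would instead accept cost-increasing moves with a temperature-dependent probability, which lets the search escape the local minima at which BOEM stops.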

2.2.11 Consensus functions based on cluster ensemble component selection

Recall that the aim of any consensus clustering process is to obtain a single partition from a collection of l clustering solutions. As an alternative means for achieving that goal, cluster ensemble component selection techniques are based on obtaining the consensus clustering solution by selection, not by combination. For instance, the BESTCLUSTERING algorithm (Gionis, Mannila, and Tsaparas, 2007) is not a consensus function proper, as it identifies as the consensus clustering the individual partition that minimizes the number of disagreements with respect to the remaining clusterings in the cluster ensemble.

Following a very similar approach, the Best of K (BOK) consensus function is based on selecting the individual clustering from the cluster ensemble that minimizes the number of pairwise co-clustering disagreements with respect to the other individual partitions in the cluster ensemble (Goder and Filkov, 2008).
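The selection-based idea shared by BESTCLUSTERING and BOK can be sketched in a few lines, assuming partitions are lists of integer cluster labels; the function names are illustrative.

```python
from itertools import combinations

def pairwise_disagreements(a, b):
    """Number of object pairs co-clustered in one partition but not the other."""
    return sum(
        (a[i] == a[j]) != (b[i] == b[j])
        for i, j in combinations(range(len(a)), 2)
    )

def best_of_k(ensemble):
    """Select the ensemble component whose total disagreement with the
    remaining partitions is minimal (a partition's disagreement with
    itself is zero, so it can safely be included in the sum)."""
    return min(
        ensemble,
        key=lambda p: sum(pairwise_disagreements(p, q) for q in ensemble),
    )
```

Note that the returned consensus is always one of the input partitions, which is precisely why such techniques select rather than combine.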

