
TESI DOCTORAL - La Salle


Chapter 3. Hierarchical consensus architectures

yield lower quality consensus clustering solutions when compared to the other consensus functions.

Thus, in some sense, we face a further indeterminacy, this one concerning which consensus function to apply. However, this indeterminacy can be overcome by taking advantage of the capability of creating several consensus clustering solutions by means of multiple consensus functions in computationally optimal time and subsequently applying a supraconsensus function that selects the highest quality consensus clustering solution in a fully unsupervised manner, as proposed in (Strehl and Ghosh, 2002).
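The selection step can be illustrated with a minimal sketch. It assumes the supraconsensus criterion of (Strehl and Ghosh, 2002), which picks the candidate consensus labeling with the highest average normalized mutual information (ANMI) with respect to the labelings in the cluster ensemble; the variant used elsewhere in this work may differ. The function names and the `candidates` and `ensemble` arguments are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log, sqrt


def entropy(labels):
    """Shannon entropy (in nats) of a labeling given as a list of cluster ids."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(a, b):
    """Normalized mutual information, I(a;b) / sqrt(H(a) * H(b))."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    mi = sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )
    norm = sqrt(entropy(a) * entropy(b))
    # A labeling with a single cluster has zero entropy; define NMI as 0 then.
    return mi / norm if norm > 0 else 0.0


def supraconsensus(candidates, ensemble):
    """Return (name, ANMI) of the candidate with the highest average NMI.

    candidates: dict mapping a consensus function name (hypothetical) to
                the consensus labeling it produced.
    ensemble:   list of the individual labelings that were combined.
    """
    def anmi(name):
        return sum(nmi(candidates[name], p) for p in ensemble) / len(ensemble)

    best = max(candidates, key=anmi)
    return best, anmi(best)


if __name__ == "__main__":
    ensemble = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 1, 1]]
    candidates = {"A": [0, 0, 1, 1], "B": [0, 1, 0, 1]}
    print(supraconsensus(candidates, ensemble))
```

No ground truth labels are needed at any point, which is what makes the selection fully unsupervised: the ensemble itself acts as the reference against which each candidate is scored.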

Besides this use, supraconsensus strategies constitute a basic ingredient of the consensus self-refining procedure presented in the next chapter, which aims to improve the quality of consensus clustering solutions as a means of building robust clustering systems upon consensus clustering processes.

3.6 Related publications

Our first approach to hierarchical consensus architectures dealt with deterministic HCA (Sevillano et al., 2007a), although it focused solely on the quality of the consensus clusterings obtained, not on their computational aspects. The details of this publication, presented as a poster at the ECIR 2007 conference held in Rome, are described next.

Authors: Xavier Sevillano, Germán Cobo, Francesc Alías and Joan Claudi Socoró<br />

Title: A Hierarchical Consensus Architecture for Robust Document Clustering<br />

In: Proceedings of the 29th European Conference on Information Retrieval (ECIR 2007)

Publisher: Springer<br />

Series: Lecture Notes in Computer Science<br />

Volume: 4425<br />

Editors: Giambattista Amati, Claudio Carpineto and Giovanni Romano<br />

Pages: 741-744<br />

Year: 2007<br />

Abstract: A major problem encountered by text clustering practitioners is the difficulty of determining a priori which is the optimal text representation and clustering technique for a given clustering problem. As a step towards building robust document partitioning systems, we present a strategy based on a hierarchical consensus clustering architecture that operates on a wide diversity of document representations and partitions. The conducted experiments show that the proposed method is capable of yielding a consensus clustering that is comparable to the best individual clustering available even in the presence of a large number of poor individual labelings, outperforming classic non-hierarchical consensus approaches in terms of performance and computational cost.
