
TESI DOCTORAL - La Salle



Publisher: Springer
Series: Lecture Notes in Computer Science
Volume: 4666

Chapter 4. Self-refining consensus architectures

Editors: Mike E. Davies, Christopher J. James, Samer A. Abdallah and Mark D. Plumbley
Pages: 794-801
Year: 2007

Abstract: Deriving a thematically meaningful partition of an unlabeled document corpus is a challenging task. In this context, the use of document representations based on latent thematic generative models can lead to improved clustering. However, determining a priori the optimal document indexing technique is not straightforward, as it depends on the clustering problem faced and the partitioning strategy adopted. So as to overcome this indeterminacy, we propose deriving a single consensus labeling upon the results of clustering processes executed on several document representations. Experiments conducted on subsets of two standard text corpora evaluate distinct clustering strategies based on latent thematic spaces and highlight the usefulness of consensus clustering to overcome the indeterminacy regarding optimal document indexing.

