TESI DOCTORAL - La Salle


4.5. Related publications

Year: 2006

Abstract: The performance of document clustering systems depends on employing optimal text representations, which are not only difficult to determine beforehand, but also may vary from one clustering problem to another. As a first step towards building robust document clusterers, a strategy based on feature diversity and cluster ensembles is presented in this work. Experiments conducted on a binary clustering problem show that our method is robust to near-optimal model order selection and able to detect constructive interactions between different document representations in the test bed.

A subsequent extension of this work was published in the Journal of the Spanish Society for Natural Language Processing (Procesamiento del Lenguaje Natural) after its presentation at the SEPLN 2006 conference held in Zaragoza.

Authors: Xavier Sevillano, Germán Cobo, Francesc Alías and Joan Claudi Socoró<br />

Title: Robust Document Clustering by Exploiting Feature Diversity in Cluster Ensembles

In: Journal of the Spanish Society for Natural Language Processing (Procesamiento del Lenguaje Natural)

Volume: 37<br />

Pages: 169-176

Year: 2006<br />

Abstract: The performance of document clustering systems is conditioned by the use of optimal text representations, which are not only difficult to determine beforehand, but also may vary from one clustering problem to another. This work presents an approach based on feature diversity and cluster ensembles as a first step towards building document clustering systems that behave robustly across different clustering problems. Experiments conducted on three binary clustering problems of increasing difficulty show that the proposed method is i) robust to near-optimal model order selection, and ii) able to detect constructive interactions between different document representations, thus being capable of yielding consensus clusterings superior to any of the individual clusterings available.

Last, a global analysis of clustering indeterminacies and how they can be overcome via cluster ensembles was given as an oral presentation at the ICA 2007 conference.

Authors: Xavier Sevillano, Germán Cobo, Francesc Alías and Joan Claudi Socoró<br />

Title: Text Clustering on Latent Thematic Spaces: Variants, Strengths and Weaknesses

In: Proceedings of the 7th International Conference on Independent Component Analysis and Signal Separation
