29.04.2013 Views

TESI DOCTORAL - La Salle


Chapter 3

Hierarchical consensus architectures

As outlined in section 1.5, our proposal for building robust multimedia clustering systems relies on creating consensus clustering solutions from cluster ensembles. These ensembles are made up of a large number of individual clusterings, obtained by running multiple clustering algorithms on several unimodal and multimodal representations of the objects contained in the data set.

Indeed, massively crossing clustering algorithms, object representations and data modalities is a simple and parallelizable way of generating highly diverse, heterogeneous cluster ensembles, entrusting the consensus clustering task with obtaining a meaningful combined clustering solution.
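The crossing described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the thesis: the toy data, the two scalar "representations" and the two threshold-based "algorithms" are all hypothetical stand-ins chosen so that the example runs with the standard library alone. What it shows is how the ensemble size grows as the product of the number of modalities, representations and algorithms.

```python
from itertools import product
from statistics import mean, median

# Hypothetical toy data: four objects described in two modalities.
DATA = {
    "text": [[0.1, 0.3], [0.2, 0.1], [0.9, 0.8], [0.7, 1.0]],
    "audio": [[1.0, 4.0], [1.2, 3.8], [3.9, 1.1], [4.1, 0.9]],
}
# A simple multimodal view built by concatenating the unimodal features.
DATA["text+audio"] = [t + a for t, a in zip(DATA["text"], DATA["audio"])]

# Hypothetical object representations: each maps a feature vector to a scalar.
REPRESENTATIONS = {
    "first_feature": lambda v: v[0],
    "feature_sum": lambda v: sum(v),
}


def split_at_mean(values):
    """Toy 2-cluster 'algorithm': label 1 if above the mean, else 0."""
    threshold = mean(values)
    return [int(x > threshold) for x in values]


def split_at_median(values):
    """Toy 2-cluster 'algorithm': label 1 if above the median, else 0."""
    threshold = median(values)
    return [int(x > threshold) for x in values]


ALGORITHMS = {"mean_split": split_at_mean, "median_split": split_at_median}


def build_ensemble(data, representations, algorithms):
    """Run every algorithm on every representation of every modality."""
    ensemble = {}
    crossing = product(data.items(), representations.items(), algorithms.items())
    for (modality, vectors), (rep_name, rep), (alg_name, alg) in crossing:
        # Each combination is independent, so this loop could run in parallel.
        ensemble[(modality, rep_name, alg_name)] = alg([rep(v) for v in vectors])
    return ensemble


ensemble = build_ensemble(DATA, REPRESENTATIONS, ALGORITHMS)
print(len(ensemble))  # 3 modalities * 2 representations * 2 algorithms
```

Because every combination is computed independently of the others, the loop body is the natural unit to distribute across workers, which is the sense in which the crossing is parallelizable.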

Given the unsupervised nature of the clustering problem, we consider this a sensible way to obtain clustering solutions that are robust to clustering indeterminacies, since restricting ourselves to a handful of clustering algorithms or object representations can unintentionally limit the quality and diversity of the cluster ensemble components.

However, while this strategy allows the creation of rich cluster ensembles, it also introduces several drawbacks that affect the consensus clustering task:

– the large number of individual clustering solutions contained in the cluster ensemble, resulting from the aforementioned combination of clustering algorithms, object representations and data modalities, often leads to a notable increase in the computational cost of executing the consensus function, which can even become prohibitive.

– this same fact affects the diversity and quality of the cluster ensemble components and, while moderate diversity has been found to be beneficial for consensus clustering (Hadjitodorov, Kuncheva, and Todorova, 2006; Fern and Lin, 2008), the presence of poor-quality clustering solutions in the cluster ensemble may degrade the quality of the consensus clustering solution.
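Both drawbacks can be illustrated with a small sketch. It assumes a co-association (pairwise co-occurrence) matrix as the basis of the consensus function, which is one common choice and not necessarily the one used in this thesis; the labelings below are invented for illustration. Building the matrix takes on the order of l·n² label comparisons for l clusterings of n objects, and a single poor-quality clustering blurs the pairwise evidence that the others agree on.

```python
def coassociation(ensemble):
    """Fraction of clusterings that place each pair of objects together.

    Performs len(ensemble) * n * n label comparisons for n objects.
    """
    n = len(ensemble[0])
    counts = [[0] * n for _ in range(n)]
    for labels in ensemble:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    size = len(ensemble)
    return [[c / size for c in row] for row in counts]


# Hypothetical ensemble: three clusterings that agree up to label renaming...
good = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 1, 1]]
# ...and one poor-quality clustering that contradicts them.
poor = [[0, 1, 0, 1]]

clean = coassociation(good)
noisy = coassociation(good + poor)

print(clean[0][1], noisy[0][1])  # evidence that objects 0 and 1 belong together
print(clean[0][2], noisy[0][2])  # evidence that objects 0 and 2 belong together
```

With the three consistent clusterings alone, the matrix separates the two groups perfectly; adding the contradictory one weakens the within-group evidence and introduces spurious between-group evidence, while also adding another n² comparisons to the cost.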

In view of these considerations, in this thesis we introduce the concept of self-refining hierarchical consensus architectures (SHCA), defined as a generic means of counteracting:

