29.04.2013 Views

TESI DOCTORAL - La Salle



cluster ensemble (a selection process that is given the name of supraconsensus (Strehl and Ghosh, 2002)).

The analysis of the quality (measured in terms of normalized mutual information with respect to the ground truth) of the set of self-refined consensus clusterings obtained in each experiment reveals that the proposed self-refining procedure is notably successful, as this quality is higher than that of the reference clustering in 83% (for consensus based self-refining) or 56% (for selection based self-refining) of the experiments conducted. Furthermore, we have also observed that producing multiple self-refined consensus clusterings is highly beneficial, as the highest quality self-refined clustering is obtained for very disparate values of p depending on the experiment (from p=2% to p=90%), so a suboptimal value of p could easily be selected if a single one were chosen. As far as the quality gains induced by the self-refining procedure are concerned, relative φ (NMI) increases (with respect to the non-refined consensus clustering) higher than 10%, and quite often much higher, are obtained in the vast majority of the experiments conducted.
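
The quality measure and the relative gain discussed above can be illustrated with a minimal sketch. It assumes that NMI is normalized by the geometric mean of the two entropies, as in Strehl and Ghosh (2002); the function names and the toy labelings are hypothetical and are not taken from the experiments.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same objects."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (n_ij / n) * math.log(n * n_ij / (count_a[i] * count_b[j]))
        for (i, j), n_ij in joint.items()
    )


def nmi(a, b):
    """NMI with geometric-mean normalization (assumed normalization)."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 or h_b == 0:
        return 0.0  # convention chosen here for single-cluster labelings
    return mutual_information(a, b) / math.sqrt(h_a * h_b)


def relative_gain(refined_score, baseline_score):
    """Relative percentage increase of a score over a non-zero baseline."""
    return 100.0 * (refined_score - baseline_score) / baseline_score


# Toy data (hypothetical): six objects, two true categories.
ground_truth = [0, 0, 0, 1, 1, 1]
non_refined = [0, 0, 1, 1, 1, 1]   # one object misplaced
self_refined = [1, 1, 1, 0, 0, 0]  # same partition, permuted labels

phi_before = nmi(ground_truth, non_refined)
phi_after = nmi(ground_truth, self_refined)
gain = relative_gain(phi_after, phi_before)
```

NMI is invariant to label permutations, so `self_refined` scores as a perfect match even though its cluster identifiers differ from those of `ground_truth`.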

A further advantage of the self-refining procedure is its ability to even out the quality of the consensus clustering solutions created by distinct consensus architectures, reducing the variance between their φ (NMI) scores by a factor of 20. This makes it easier to decide which consensus architecture is the most appropriate for a given consensus clustering problem solely on computational grounds.

However, the good performance of the proposed self-refining procedure is somewhat tarnished by the limited accuracy of the supraconsensus selection process, which manages to select the highest quality self-refined consensus clustering in fewer than half of the experiments conducted. This causes an average 14% relative φ (NMI) reduction between the top quality consensus clustering and the one selected by supraconsensus.

For this reason, the main research activities in this area should be directed, in our opinion, towards the derivation of accurate supraconsensus selection techniques capable of choosing, in a fully blind manner and as precisely as possible, the highest quality consensus clustering among a given set of candidates.
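
As an illustration of what a blind selection looks like, the following sketch picks, among several candidate consensus clusterings, the one with the highest average NMI with respect to the cluster ensemble components, without using any ground truth. This criterion follows our reading of the supra-consensus function of Strehl and Ghosh (2002) and is an assumption here; the function names, the candidate labels and the toy data are hypothetical.

```python
import math
from collections import Counter


def nmi(a, b):
    """NMI between two labelings, normalized by sqrt(H(a) * H(b))."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    h_a = -sum((c / n) * math.log(c / n) for c in count_a.values())
    h_b = -sum((c / n) * math.log(c / n) for c in count_b.values())
    if h_a == 0 or h_b == 0:
        return 0.0  # convention chosen here for single-cluster labelings
    mi = sum(
        (n_ij / n) * math.log(n * n_ij / (count_a[i] * count_b[j]))
        for (i, j), n_ij in joint.items()
    )
    return mi / math.sqrt(h_a * h_b)


def average_nmi(candidate, ensemble):
    """Average NMI between one candidate and every ensemble component."""
    return sum(nmi(candidate, component) for component in ensemble) / len(ensemble)


def supraconsensus(candidates, ensemble):
    """Return the key of the candidate with the highest average NMI.

    `candidates` maps a hypothetical identifier to a labeling; no ground
    truth is involved, so the selection is blind.
    """
    return max(candidates, key=lambda name: average_nmi(candidates[name], ensemble))


# Toy cluster ensemble (hypothetical): three partitions of six objects.
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]

# Toy candidate consensus clusterings (hypothetical identifiers).
candidates = {
    "candidate_a": [0, 0, 0, 1, 1, 1],
    "candidate_b": [0, 1, 0, 1, 0, 1],
}

selected = supraconsensus(candidates, ensemble)
```

Because the criterion only measures agreement with the ensemble, the selected candidate need not be the one closest to the ground truth, which is consistent with the limited selection accuracy reported above.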

Last, we were pleased to note that counteracting the expected quality decrease suffered by consensus clusterings created upon large cluster ensembles has also drawn the interest of other authors. Interestingly, this issue has also been tackled in (Fern and Lin, 2008) in a fashion very similar to our selection based self-refining procedure, which can be interpreted as a sign of the soundness of our proposals.

7.3 Multimedia clustering based on cluster ensembles

Undoubtedly, ‘going multimedia’ is a beneficial trend, as it provides a richer view of information. However, it poses a challenge when multimodal data is to be processed by means of unsupervised learning techniques (e.g. clustering), as the existence of multiple modalities increases the uncertainty about the best way to represent, classify or describe the data. In this sense, intuition suggests that constructive interactions between the distinct modalities exist, which should lead to a better explanation of the data. However, it is not clear whether this modality fusion should be conducted at the feature level (early fusion) or at the decision level (late fusion). Indeed, our experiments have demonstrated that early fusion is not always advantageous as regards the quality of the
