29.04.2013 Views

TESI DOCTORAL - La Salle


B.2. Clustering indeterminacies in multimodal data sets

[Figure B.15, panels (a) to (o): histograms on the InternetAds data set. Horizontal axis: φ (NMI), from 0 to 1. Vertical axis: clustering count, from 0 to 200. Panels (a) to (e): multimodal Baseline, PCA, ICA, NMF and RP. Panels (f) to (j): object modality (M1) Baseline, PCA, ICA, NMF and RP. Panels (k) to (o): collateral modality (M2) Baseline, PCA, ICA, NMF and RP.]

Figure B.15: Histograms of the φ (NMI) values on the InternetAds data set obtained on the following data representations: baseline, PCA, ICA, NMF and RP, in their multimodal, object and collateral variants.
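As an illustration of how histograms of this kind can be produced, the following minimal sketch computes the normalized mutual information between a clustering and a reference labeling, and then bins a collection of such values over the [0, 1] range. It assumes the geometric-mean normalization, I(A;B) / sqrt(H(A) H(B)); the exact definition of φ (NMI) used in this work is the one given in the main text and may differ. The function names `nmi` and `histogram`, and the default of 20 bins, are hypothetical choices made for the example.

```python
import math
from collections import Counter


def nmi(labels_a, labels_b):
    """NMI with geometric-mean normalization: I(A;B) / sqrt(H(A) * H(B)).

    Assumption: this normalization is chosen for illustration only.
    """
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("label sequences must be non-empty and of equal length")

    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))

    # Mutual information from the joint and marginal cluster counts.
    mutual_info = sum(
        (n_ab / n) * math.log(n * n_ab / (count_a[a] * count_b[b]))
        for (a, b), n_ab in count_ab.items()
    )
    entropy_a = -sum((c / n) * math.log(c / n) for c in count_a.values())
    entropy_b = -sum((c / n) * math.log(c / n) for c in count_b.values())

    # Convention adopted here: a single-cluster partition scores 0.
    if entropy_a == 0.0 or entropy_b == 0.0:
        return 0.0
    return mutual_info / math.sqrt(entropy_a * entropy_b)


def histogram(values, n_bins=20):
    """Count values in n_bins equal-width bins spanning [0, 1]."""
    counts = [0] * n_bins
    for v in values:
        v = min(max(v, 0.0), 1.0)  # guard against floating-point overshoot
        idx = min(int(v * n_bins), n_bins - 1)  # v == 1.0 goes in the last bin
        counts[idx] += 1
    return counts
```

Each panel of the figure would then correspond to calling `histogram` on the NMI values of all the clusterings obtained on one data representation, each compared against the ground-truth labels.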

B.2.5 Summary

Following the approach employed in section B.1, table B.4 presents the φ (NMI) values attained by the top clustering solution achieved by the best representative of each one of the five families of clustering algorithms employed in this work (i.e. agglo, bagglo, direct, graph, rb and rbr), indicating the type of representation employed (baseline, PCA, ICA, NMF or RP) and the modality (multimodal, MM; mode #1, M1; or mode #2, M2). The idea is to present a condensed view of the influence of the data representation and clustering algorithm selection indeterminacies.
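The selection behind such a summary table can be sketched as follows: among all the clustering runs, keep the one with the highest φ (NMI) for each algorithm family, together with its representation and modality, and sort the families by that score. The `Run` record, the `best_per_family` function and the values in the usage example are hypothetical placeholders, not results from table B.4.

```python
from typing import NamedTuple


class Run(NamedTuple):
    """One clustering run (hypothetical record layout)."""
    family: str          # e.g. "rbr", "graph"
    representation: str  # baseline, PCA, ICA, NMF or RP
    modality: str        # MM, M1 or M2
    nmi: float           # phi (NMI) with respect to the ground truth


def best_per_family(runs):
    """Return the top-scoring run of each family, best family first."""
    best = {}
    for run in runs:
        current = best.get(run.family)
        if current is None or run.nmi > current.nmi:
            best[run.family] = run
    return sorted(best.values(), key=lambda r: r.nmi, reverse=True)


# Placeholder values for illustration only; they are not taken from the thesis.
example_runs = [
    Run("rbr", "PCA", "MM", 0.30),
    Run("rbr", "NMF", "M1", 0.45),
    Run("graph", "baseline", "M2", 0.60),
    Run("graph", "RP", "MM", 0.20),
]
summary = best_per_family(example_runs)
```

With these placeholder runs, `summary` lists the graph run on the baseline M2 representation first, followed by the rbr run on the NMF M1 representation.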

Notice the distinct ordering of the five families of clustering algorithms in every data set. A clear indicator of the algorithm selection indeterminacy is the fact that the rbr type of algorithms yields the top clustering solution in three of the four data sets, while offering the poorest performance in the InternetAds collection.

The indeterminacy regarding the use of multimodal or unimodal data representations also becomes evident, as in two of the data sets (Corel and IsoLetters) the multimodal
