29.04.2013 Views

TESI DOCTORAL - La Salle


B.2. Clustering indeterminacies in multimodal data sets

[Figure B.13: Histograms of the φ (NMI) values on the CAL500 data set obtained on the following data representations. Panels: (a) Baseline (multimodal), (b) PCA (multimodal), (c) ICA (multimodal), (d) RP (multimodal); (e) Baseline (audio), (f) PCA (audio), (g) ICA (audio), (h) RP (audio); (i) Baseline (text), (j) PCA (text), (k) ICA (text), (l) RP (text). Horizontal axis: φ (NMI), from 0 to 1; vertical axis: clustering count, from 0 to 80.]

which are far better than those obtained on the text modality (always below φ (NMI) = 0.3).
Last but not least, the multimodal object representation seems to benefit slightly from the
early fusion of the audio and textual features, as better clustering results
are obtained in this case, although by a very small margin.
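As an illustration of the early fusion mentioned above, the following minimal sketch assumes that fusion amounts to concatenating, object by object, the feature vectors of the two modalities before clustering. The function name `early_fusion` and the toy vectors are hypothetical and are not taken from the CAL500 experiments; any feature scaling applied before concatenation is left out.

```python
def early_fusion(audio_features, text_features):
    """Concatenate per-object feature vectors from two modalities.

    Assumption: row i of both inputs describes the same object, and
    early fusion is plain concatenation of the two vectors.
    """
    if len(audio_features) != len(text_features):
        raise ValueError("both modalities must describe the same objects")
    return [list(a) + list(t) for a, t in zip(audio_features, text_features)]


# Toy, made-up feature vectors for three objects (not CAL500 data).
audio = [[0.1, 0.9], [0.4, 0.2], [0.8, 0.5]]
text = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Each fused vector has as many attributes as both modalities together.
fused = early_fusion(audio, text)
```

The fused vectors can then be passed to any clustering algorithm in place of the single-modality representations.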

B.2.3 InternetAds data set

The clustering results corresponding to the InternetAds data collection are summarized in
figure B.15. Many poor clustering results are obtained on this data set, as revealed by the
high peaks located in the leftmost regions of the histograms. The distinct data representations
and modalities present a rather erratic behaviour, as discussed next.

If the two modalities are compared, the best clustering results are generally obtained
using the collateral information of the Internet advertisements (which are the objects
in this data set). However, the multimodal composition of the objects tends to yield
clusterings of superior (although still poor) quality, except for the PCA representation.
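The φ (NMI) histograms referred to in this section can be reproduced in spirit with the minimal sketch below. It assumes that φ (NMI) is the mutual information between a clustering and the reference labels, normalised by the geometric mean of their entropies; the excerpt does not state which normalisation is used, so this choice is an assumption. The helper names and the toy labelings are hypothetical.

```python
from collections import Counter
from math import log, sqrt


def entropy(labels):
    """Shannon entropy (in nats) of a labeling."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(labels_a, labels_b):
    """Mutual information normalised by sqrt(H(a) * H(b)) (assumed variant)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same objects")
    n = len(labels_a)
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mi = sum(
        (c / n) * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    h_a, h_b = entropy(labels_a), entropy(labels_b)
    if h_a == 0 or h_b == 0:
        return 0.0  # a single-cluster labeling carries no information
    return mi / sqrt(h_a * h_b)


def histogram(values, bins=10):
    """Count values falling into equal-width bins over [0, 1]."""
    counts = [0] * bins
    for v in values:
        index = max(0, min(int(v * bins), bins - 1))  # clamp rounding noise
        counts[index] += 1
    return counts


# Toy reference labels and three made-up clusterings (not InternetAds data).
truth = [0, 0, 1, 1, 2, 2]
clusterings = [[0, 0, 1, 1, 2, 2], [0, 1, 0, 1, 0, 1], [1, 1, 0, 0, 2, 2]]
scores = [nmi(truth, c) for c in clusterings]
counts = histogram(scores, bins=10)
```

In such a histogram, many poor clusterings show up as high counts in the leftmost bins, which is the pattern described for this data set.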
