29.04.2013 Views

TESI DOCTORAL - La Salle


B.1. Clustering indeterminacies in unimodal data sets

B.1.7 Balance data set

The approximately even distributions of the φ (NMI) histograms corresponding to the four object representations employed in the Balance data set (with the exception of the peak around φ (NMI) = 0.04 in figure B.7(c)) suggest that a randomly selected clustering configuration is roughly as likely to be good as bad in this data collection.

[Figure B.7 panels: (a) Baseline, (b) PCA, (c) NMF, (d) RP. Each panel is a histogram for the Balance data set, with φ (NMI) on the horizontal axis (0 to 1) and clustering count on the vertical axis (0 to 10).]

Figure B.7: Histograms of the φ (NMI) values obtained on each data representation in the Balance data set.
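As a rough illustration of how histograms of this kind can be built, the following minimal sketch computes a normalized mutual information score between a clustering and a reference labeling, and then bins a list of such scores over [0, 1]. The function names, the square-root normalization, the bin count and the toy labels are all assumptions made for this example; they are not taken from the thesis.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(labels_a, labels_b):
    """Mutual information normalized by sqrt(H(a) * H(b)).

    The square-root normalization is an assumption of this sketch;
    other normalizations exist and give different values.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mi = 0.0
    for (a, b), c in joint.items():
        mi += (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
    h_a, h_b = entropy(labels_a), entropy(labels_b)
    if h_a == 0.0 or h_b == 0.0:
        return 0.0  # convention chosen here for a single-cluster partition
    return mi / math.sqrt(h_a * h_b)


def histogram(values, bins=20):
    """Count how many values fall in each of `bins` equal-width bins on [0, 1]."""
    counts = [0] * bins
    for v in values:
        # Clamp so that v == 1.0 (or tiny rounding overshoot) stays in range.
        idx = max(0, min(int(v * bins), bins - 1))
        counts[idx] += 1
    return counts


# Hypothetical toy data: one reference labeling and three candidate clusterings.
reference = [0, 0, 0, 1, 1, 1]
clusterings = [
    [1, 1, 1, 0, 0, 0],  # same partition, labels permuted
    [0, 0, 1, 1, 1, 1],  # one object misplaced
    [0, 1, 0, 1, 0, 1],  # largely unrelated to the reference
]
scores = [nmi(reference, c) for c in clusterings]
print(histogram(scores, bins=10))
```

In this sketch each clustering contributes one score, so a flat histogram corresponds to good and bad configurations being about equally frequent, while a peak marks a score range where many configurations accumulate.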

B.1.8 MFeat data set

In this data set, the objects were represented with six distinct feature types, each with a single dimensionality. Therefore, the φ (NMI) scatter observed in each of figures B.8(a) to B.8(f) is due solely to the algorithm selection indeterminacy.

Notice that all these histograms show a fairly high density of clustering solutions around φ (NMI) = 0.5. Nevertheless, notably better clustering results (φ (NMI) ≈ 0.8) can be obtained using the KAR and PIX object representations (see figures B.8(c) and B.8(e)), which reveals the data representation indeterminacy effect.

B.1.9 miniNG data set

The wide spread of the φ (NMI) values observed in figure B.9(a), from φ (NMI) = 0.06 to φ (NMI) = 0.64, is clear evidence of how the selection of a particular clustering algorithm affects the quality of the clustering results.

Moreover, notice that the clustering solutions obtained on the RP representation yield φ (NMI) values below 0.3, whereas the best results obtained on the remaining representations reach and even surpass φ (NMI) = 0.5; that is, distinct object representations can significantly alter the results of a clustering process.

B.1.10 Segmentation data set

As regards the effect of applying distinct clustering algorithms to the same object representation, figure B.10(a) shows that, despite the accumulation of clustering solutions around φ (NMI) = 0.35, a maximum quality of φ (NMI) = 0.65 can be obtained on the baseline representation of the objects.
