29.04.2013 Views

TESI DOCTORAL - La Salle


C.4. Computationally optimal RHCA, DHCA and flat consensus comparison

C.4.7 MFeat data set

This section describes the performance of the minimum complexity serial and parallel variants of RHCA and DHCA on the MFeat data collection. They are compared to classic flat consensus in terms of the time required for their execution and the quality of the consensus clustering solutions they yield. The experiments are conducted in four diversity scenarios, which correspond to cluster ensembles of sizes l = 6, 60, 114 and 168.

Running time comparison

Figure C.47 presents the running times of flat consensus and of the serial implementations of RHCA and DHCA. Except when the HGPA and MCLA consensus functions are employed, flat consensus is faster than any of its hierarchical counterparts regardless of the size of the cluster ensemble (i.e. it is faster in all four diversity scenarios).

When the parallel implementation of the HCA is considered (see Figure C.48), the observed behaviour is very similar to the one just reported. That is, flat consensus is the most computationally efficient consensus architecture, except when consensus functions based on hypergraph partitioning are employed. This is because, on the MFeat data collection, the low cardinality of the diversity factors gives rise to relatively small cluster ensembles, which makes flat consensus a competitive alternative to hierarchical consensus architectures.

Consensus quality comparison

Figure C.49 presents the quality of the consensus clustering solutions yielded by the flat and hierarchical consensus architectures, in the form of φ (NMI) boxplot diagrams. An inter-consensus function analysis reveals that EAC, HGPA and SLSAD yield, in general terms, the lowest quality results. CSPA, ALSAD and KMSAD stand out as the best performing consensus functions, as the quality of the consensus clustering solutions they yield is comparable to that of the cluster ensemble components that best reveal the true cluster structure of the data set (i.e. those attaining the highest φ (NMI) values). Meanwhile, an intra-consensus function study shows that the three consensus architectures yield consensus solutions of very similar quality when based on CSPA and ALSAD, whereas larger differences between RHCA, DHCA and flat consensus are observed in other cases, such as when consensus clustering is conducted by means of the EAC, HGPA, MCLA or SLSAD consensus functions.
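As an illustration of the quality measure used in this comparison, the following minimal sketch computes the normalized mutual information between two labelings of the same objects, for instance a consensus clustering and the true cluster structure. It assumes the geometric-mean normalization, sqrt(H(a) H(b)), which is not stated in this section; the function name `nmi` and the toy labelings are hypothetical.

```python
import math
from collections import Counter


def nmi(labels_a, labels_b):
    """Normalized mutual information between two labelings.

    Assumption: the mutual information is normalized by the geometric
    mean of the two entropies, sqrt(H(a) * H(b)).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same objects")
    n = len(labels_a)
    if n == 0:
        raise ValueError("labelings must not be empty")

    count_a = Counter(labels_a)                    # cluster sizes in a
    count_b = Counter(labels_b)                    # cluster sizes in b
    count_ab = Counter(zip(labels_a, labels_b))    # contingency table

    # Mutual information from the contingency table.
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        mi += (n_ab / n) * math.log(n * n_ab / (count_a[a] * count_b[b]))

    # Entropy of each labeling.
    h_a = -sum((c / n) * math.log(c / n) for c in count_a.values())
    h_b = -sum((c / n) * math.log(c / n) for c in count_b.values())

    # A labeling with a single cluster carries no information.
    if h_a == 0.0 or h_b == 0.0:
        return 0.0
    return mi / math.sqrt(h_a * h_b)


# Toy example: the same partition under different label names.
truth = [0, 0, 1, 1]
consensus = [1, 1, 0, 0]
score = nmi(truth, consensus)
```

Because the measure depends only on how objects are grouped, relabeling the clusters does not change the score, which is why it can be used to compare a consensus clustering against a reference partition.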

C.4.8 miniNG data set

In this section, we present the running times and the quality evaluation (by means of φ (NMI) values) of the consensus clustering processes implemented by the serial and parallel versions of RHCA and DHCA and by flat consensus on the miniNG data collection. The cluster ensemble sizes corresponding to the four diversity scenarios in which our experiments are conducted are l = 73, 730, 1387 and 2044.
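The ensemble sizes quoted for the MFeat and miniNG collections follow the same arithmetic pattern: in both cases the four sizes are the smallest one multiplied by a common set of factors. The short sketch below only checks that arithmetic. The helper name `scale_factors` is hypothetical, and what the factors represent is not stated in this section.

```python
def scale_factors(sizes):
    """Return each ensemble size divided by the smallest (first) one.

    Raises ValueError if a size is not an exact multiple of the first.
    """
    base = sizes[0]
    if any(size % base != 0 for size in sizes):
        raise ValueError("sizes are not integer multiples of the first one")
    return [size // base for size in sizes]


# Ensemble sizes quoted in the text for the four diversity scenarios.
mfeat_sizes = [6, 60, 114, 168]
mining_sizes = [73, 730, 1387, 2044]

mfeat_factors = scale_factors(mfeat_sizes)
mining_factors = scale_factors(mining_sizes)
same_pattern = mfeat_factors == mining_factors
```

Under this reading, the two collections differ only in the size of the smallest ensemble (6 versus 73), which is consistent with the remark that the MFeat ensembles are comparatively small.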
