29.04.2013 Views

TESI DOCTORAL - La Salle


Chapter 6. Voting based consensus functions for soft cluster ensembles

CPU time                     BC       CC       PC       SC
CSPA   faster than ...      45.4%    72.7%    18.2%    18.2%
       equivalent to ...    18.2%     0%      18.2%    18.2%
       slower than ...      36.4%    27.3%    63.6%    63.6%
EAC    faster than ...       9.1%    27.3%     9.1%     9.1%
       equivalent to ...    27.3%     9.1%     0%       0%
       slower than ...      63.6%    63.6%    90.9%    90.9%
HGPA   faster than ...      91.7%    91.7%    33.3%    33.3%
       equivalent to ...     8.3%     8.3%    41.7%    41.7%
       slower than ...       0%       0%      25%      25%
MCLA   faster than ...      66.7%    83.3%    16.7%    16.7%
       equivalent to ...    25%      16.7%    41.7%    41.7%
       slower than ...       8.3%     0%      41.7%    41.7%
VMA    faster than ...     100%     100%      91.7%    91.7%
       equivalent to ...     0%       0%       8.3%     8.3%
       slower than ...       0%       0%       0%       0%

Table 6.4: Percentage of experiments in which the state-of-the-art consensus functions (CSPA, EAC, HGPA, MCLA and VMA) are executed (statistically significantly) faster/equivalent/slower than the four proposed consensus functions (BC, CC, PC and SC).

confidence values contained in the soft cluster ensemble are difficult to scale correctly (van Erp, Vuurpijl, and Schomaker, 2002).

6.5 Discussion

The main motivation for the proposals put forward in this chapter is that most of the literature on cluster ensembles focuses on applying consensus clustering processes to hard cluster ensembles. In our opinion, however, soft consensus clustering is an alternative worth considering, inasmuch as crisp clustering is in fact a simplification of fuzzy clustering, and one that may lead to the loss of valuable information.

The initial source of inspiration for the soft consensus functions just presented was metasearch (aka information fusion) systems, whose main purpose is to obtain improved search results by combining the ranked lists of documents returned by multiple search engines in response to a given query. Although the resemblance between metasearch and consensus clustering was already reported in (Gionis, Mannila, and Tsaparas, 2007), direct inspiration came from the works of Aslam and Montague (Aslam and Montague, 2001; Montague and Aslam, 2002), where metasearch algorithms based on positional voting were devised. This type of voting technique lends itself to that setting, since search engines return ranked lists of documents. From that point on, the analogy between object-to-cluster association scores in a soft cluster ensemble and voters' preferences for candidates became the key to deriving consensus functions based on positional and confidence voting methods.
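The analogy between association scores and voters' preferences can be illustrated with a minimal Python sketch. It is not the implementation of the consensus functions proposed in this chapter: the function names `borda_consensus` and `sum_consensus` are hypothetical, the Borda count and the sum rule are used only as generic examples of positional and confidence voting, and the sketch assumes that cluster j denotes the same cluster in every ensemble component (that is, the clusterings have already been aligned).

```python
def borda_consensus(ensemble):
    """Positional voting (Borda count) over a soft cluster ensemble.

    ensemble: list of membership matrices, one per clustering (voter).
    Each matrix is a list of rows (one per object); each row holds k
    association scores, where a higher score means a stronger association.
    Assumption: cluster j has the same meaning in every matrix.
    """
    n_objects = len(ensemble[0])
    k = len(ensemble[0][0])
    labels = []
    for i in range(n_objects):
        points = [0] * k
        for matrix in ensemble:
            scores = matrix[i]
            # Rank the clusters (candidates) from least to most preferred;
            # the cluster at ascending position p earns p Borda points.
            # Ties are broken by cluster index (sorted is stable).
            order = sorted(range(k), key=lambda j: scores[j])
            for p, j in enumerate(order):
                points[j] += p
        # The cluster with the most points wins (first one on a tie).
        labels.append(max(range(k), key=lambda j: points[j]))
    return labels


def sum_consensus(ensemble):
    """Confidence voting (sum rule): add the raw scores across voters."""
    n_objects = len(ensemble[0])
    k = len(ensemble[0][0])
    labels = []
    for i in range(n_objects):
        totals = [sum(matrix[i][j] for matrix in ensemble) for j in range(k)]
        labels.append(max(range(k), key=lambda j: totals[j]))
    return labels


if __name__ == "__main__":
    # One object, two clusters, three clusterings (illustrative values).
    ensemble = [
        [[0.95, 0.05]],  # voter 1 strongly prefers cluster 0
        [[0.45, 0.55]],  # voter 2 mildly prefers cluster 1
        [[0.45, 0.55]],  # voter 3 mildly prefers cluster 1
    ]
    print(borda_consensus(ensemble))  # [1]: two of three rankings favour cluster 1
    print(sum_consensus(ensemble))    # [0]: the summed scores favour cluster 0
```

The toy ensemble shows how the two families can disagree: positional voting only uses the order of the scores, whereas confidence voting uses their magnitudes, which is why the scaling of the scores matters for the latter.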

Nevertheless, the application of voting methods for combining clustering solutions is not new. For instance, unweighted voting strategies (van Erp, Vuurpijl, and Schomaker, 2002) such as plurality and majority voting have been applied for deriving consensus clustering
