29.04.2013 Views

TESI DOCTORAL - La Salle



Chapter 2. Cluster ensembles and consensus clustering

[Figure 2.2 placeholder: the hard cluster ensemble E, made up of the label vectors λ1 = (1 1 1 3 3 3 2 2 2), λ2 = (2 2 2 1 1 1 3 3 3) and λ3 = (2 2 2 3 3 3 1 1 1), is fed to the consensus function F, which outputs the consensus label vector λc = (1 1 1 2 2 2 3 3 3).]

Figure 2.2: Schematic representation of how a consensus labeling λc is obtained by applying a consensus function F to a hard cluster ensemble E containing l = 3 individual label vectors, with n = 9 and k = 3.

\[
E = \begin{pmatrix}
0.921 & 0.932 & 0.905 & 0.006 & 0.005 & 0.011 & 0.009 & 0.016 & 0.010 \\
0.054 & 0.042 & 0.057 & 0.025 & 0.019 & 0.030 & 0.976 & 0.929 & 0.972 \\
0.025 & 0.026 & 0.038 & 0.969 & 0.976 & 0.959 & 0.014 & 0.055 & 0.017 \\
0.025 & 0.026 & 0.038 & 0.969 & 0.976 & 0.959 & 0.014 & 0.055 & 0.017 \\
0.920 & 0.932 & 0.905 & 0.006 & 0.005 & 0.011 & 0.009 & 0.016 & 0.010 \\
0.054 & 0.042 & 0.057 & 0.025 & 0.019 & 0.030 & 0.976 & 0.929 & 0.972 \\
0.054 & 0.042 & 0.057 & 0.025 & 0.019 & 0.030 & 0.976 & 0.929 & 0.972 \\
0.920 & 0.932 & 0.905 & 0.006 & 0.005 & 0.011 & 0.009 & 0.016 & 0.010 \\
0.025 & 0.026 & 0.038 & 0.969 & 0.976 & 0.959 & 0.014 & 0.055 & 0.017
\end{pmatrix}
\tag{2.4}
\]

Notice that, as observed in section 1.2.1 regarding crisp and fuzzy clustering solutions, a soft cluster ensemble can be converted to a hard cluster ensemble by assigning each object to the cluster with which it is most strongly associated. Doing so converts the soft ensemble in equation (2.4) into the hard cluster ensemble in equation (2.2). Moreover, notice that the l = 3 components that make up both cluster ensembles are identical up to a renaming of the clusters, given the symbolic nature of cluster labels.
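The hardening step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the thesis: it assumes that the rows of the matrix in equation (2.4) are stacked in l = 3 blocks of k = 3 rows (one block per clustering, one row per cluster), and the names `E_soft`, `harden` and `canonical` are hypothetical helpers introduced here.

```python
import numpy as np

# Soft cluster ensemble of equation (2.4): l = 3 clusterings, k = 3 clusters,
# n = 9 objects. Assumption: rows are stacked in blocks of k rows per clustering.
E_soft = np.array([
    [0.921, 0.932, 0.905, 0.006, 0.005, 0.011, 0.009, 0.016, 0.010],
    [0.054, 0.042, 0.057, 0.025, 0.019, 0.030, 0.976, 0.929, 0.972],
    [0.025, 0.026, 0.038, 0.969, 0.976, 0.959, 0.014, 0.055, 0.017],
    [0.025, 0.026, 0.038, 0.969, 0.976, 0.959, 0.014, 0.055, 0.017],
    [0.920, 0.932, 0.905, 0.006, 0.005, 0.011, 0.009, 0.016, 0.010],
    [0.054, 0.042, 0.057, 0.025, 0.019, 0.030, 0.976, 0.929, 0.972],
    [0.054, 0.042, 0.057, 0.025, 0.019, 0.030, 0.976, 0.929, 0.972],
    [0.920, 0.932, 0.905, 0.006, 0.005, 0.011, 0.009, 0.016, 0.010],
    [0.025, 0.026, 0.038, 0.969, 0.976, 0.959, 0.014, 0.055, 0.017],
])


def harden(E, k):
    """Assign each object to its most strongly associated cluster."""
    l = E.shape[0] // k
    blocks = E.reshape(l, k, -1)      # shape (l, k, n)
    return blocks.argmax(axis=1) + 1  # shape (l, n), labels start at 1


def canonical(labels):
    """Rename clusters in order of first appearance (labels are symbolic)."""
    mapping = {}
    return [mapping.setdefault(x, len(mapping) + 1) for x in labels]


E_hard = harden(E_soft, k=3)
print(E_hard)
# [[1 1 1 3 3 3 2 2 2]
#  [2 2 2 1 1 1 3 3 3]
#  [2 2 2 3 3 3 1 1 1]]
print([canonical(row) for row in E_hard])
# three copies of [1, 1, 1, 2, 2, 2, 3, 3, 3]
```

After renaming the clusters in order of first appearance, the three hardened label vectors coincide, which is what "identical up to the symbolic nature of the labels" amounts to in this toy example.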

As for consensus clustering, it is defined as the process of obtaining a consolidated clustering solution by applying a consensus function F to a cluster ensemble E (Strehl and Ghosh, 2002). In other words, consensus clustering can be regarded as the problem of combining several clustering solutions without accessing the features that represent the clustered objects. Figure 2.2 depicts a schematic representation of a consensus clustering process conducted on the hard cluster ensemble resulting from our toy example. In this case, the result of the consensus clustering process is a consensus label vector λc which represents the same partition as the individual label vectors that compose the cluster ensemble. In a real context, however, a higher degree of diversity among the clustering solutions in the cluster ensemble can be expected (and is, in fact, desirable), and consensus clustering algorithms exploit this diversity to consolidate richer consensus clustering solutions (Pinto et al., 2007).
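To make the idea of a consensus function concrete, the following Python sketch builds one from a co-association matrix: two objects end up in the same consensus cluster when more than half of the clusterings put them together. This is a swapped-in, deliberately simple technique chosen for illustration; it is not claimed to be the function F used in the source. The function name `coassociation_consensus` and the `threshold` parameter are assumptions, and the input is the hard ensemble of the toy example as shown in figure 2.2. Note that the function only reads cluster labels, never object features.

```python
import numpy as np

# Hard cluster ensemble of the toy example (l = 3 label vectors, n = 9 objects).
E_hard = np.array([
    [1, 1, 1, 3, 3, 3, 2, 2, 2],
    [2, 2, 2, 1, 1, 1, 3, 3, 3],
    [2, 2, 2, 3, 3, 3, 1, 1, 1],
])


def coassociation_consensus(E, threshold=0.5):
    """Illustrative consensus function based on a co-association matrix."""
    E = np.asarray(E)
    n = E.shape[1]
    # C[i, j] = fraction of clusterings that place objects i and j together.
    C = (E[:, :, None] == E[:, None, :]).mean(axis=0)
    labels = np.zeros(n, dtype=int)  # 0 means "not yet assigned"
    next_label = 0
    for seed in range(n):
        if labels[seed]:
            continue
        # Start a new consensus cluster and grow it through strong links.
        next_label += 1
        labels[seed] = next_label
        stack = [seed]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero((C[i] > threshold) & (labels == 0)):
                labels[j] = next_label
                stack.append(j)
    return labels


lambda_c = coassociation_consensus(E_hard)
print(lambda_c)  # [1 1 1 2 2 2 3 3 3]
```

Because the three label vectors of the toy example define the same partition, every pair of objects is co-clustered either always or never, so any threshold between 0 and 1 would return the same consensus label vector here. With a more diverse ensemble the threshold would matter.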

Clearly, the design of the consensus function F is a central issue in consensus clustering. Most works in the consensus clustering literature focus on combining the outcomes of hard clustering processes (as in the example depicted in figure 2.2), although some consensus functions can be applied interchangeably to hard or soft cluster ensembles, possibly after some minor modifications (Strehl and Ghosh, 2002; Fern and Brodley, 2004; Lange and Buhmann, 2005). However, little effort has been devoted to the design of specific consensus functions for soft cluster ensembles that generate
