
Chapter 4. Self-refining consensus architectures

[Figure omitted. Panel (a): decreasingly ordered φ(NMI) values (wrt ground truth) vs. clustering index; panel (b): φ(ANMI) values (wrt the toy cluster ensemble) vs. clustering index.]

Figure 4.2: Decreasingly ordered φ(NMI) (wrt ground truth) values of the 300 clusterings included in the toy cluster ensemble (left), and their corresponding φ(ANMI) values (wrt the toy cluster ensemble) (right).

Consensus       Consensus function
architecture    CSPA   EAC    HGPA   MCLA   ALSAD   KMSAD   SLSAD
flat            8.8    14.5   17.8   12.0   6.7     8.9     12.1
RHCA            6.3    26.6   24.7   27.3   8.6     8.4     16.9
DHCA            15.9   27.0   22.9   19.6   9.4     11.0    8.7

Table 4.10: Relative percentage φ(NMI) losses due to suboptimal self-refined consensus clustering solution selection by supraconsensus, averaged across the twelve data collections.

ground truth. Notice how the monotonic decreasing behaviour of φ(NMI) in figure 4.2(a) is not strictly observed in φ(ANMI) (see figure 4.2(b), where a fifth-order fitting red dashed curve is overlaid for comparison). In fact, the clustering attaining the maximum φ(ANMI) is the one with the fiftieth largest φ(NMI). Thus, in practice, φ(ANMI) seems to constitute a means for identifying good clustering solutions, but not the best one. For this reason, requiring φ(ANMI)(E, λc) to single out the self-refined consensus clustering solution of highest quality appears to be too restrictive a constraint, which leads to the slightly disappointing results presented in table 4.9.
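As a toy illustration of the selection mechanism just described, the following sketch computes φ(NMI) between two labelings, φ(ANMI) of a candidate against a cluster ensemble, and then picks the candidate maximizing φ(ANMI). This is a minimal sketch for illustration only: the function names are hypothetical, and the geometric-mean normalization of NMI (as in Strehl and Ghosh's formulation) is an assumption, not code taken from the thesis.

```python
import math
from collections import Counter

def nmi(a, b):
    """phi(NMI) between two labelings of the same objects,
    normalized by the geometric mean of the two entropies
    (assumed normalization; Strehl & Ghosh style)."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    hx = -sum((c / n) * math.log(c / n) for c in pa.values())
    hy = -sum((c / n) * math.log(c / n) for c in pb.values())
    mi = sum((c / n) * math.log((c / n) / ((pa[x] / n) * (pb[y] / n)))
             for (x, y), c in pab.items())
    denom = math.sqrt(hx * hy)
    return mi / denom if denom > 0 else 1.0

def anmi(ensemble, lam):
    """phi(ANMI): average NMI of candidate labeling `lam`
    against every clustering in the ensemble E."""
    return sum(nmi(e, lam) for e in ensemble) / len(ensemble)

def supraconsensus(ensemble, candidates):
    """Select the candidate consensus clustering maximizing phi(ANMI)."""
    return max(candidates, key=lambda lam: anmi(ensemble, lam))
```

As the toy example in figure 4.2 shows, the candidate with the largest φ(ANMI) is typically a good, but not necessarily the best, consensus clustering in terms of φ(NMI) wrt the ground truth.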

In order to evaluate the influence of the apparent lack of precision of the supraconsensus function, we have measured the relative percentage φ(NMI) loss derived from a suboptimal consensus solution selection, using the top-quality consensus clustering solution as a reference (i.e. the one that an ideal supraconsensus function would select). The results, presented in table 4.10, show that the modest selection accuracy of the supraconsensus function leads to an average relative φ(NMI) loss of 14.9%.
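The loss figure reported in table 4.10 can be expressed as a one-line computation. The helper name and the numeric values below are illustrative assumptions, not data from the thesis:

```python
def relative_nmi_loss(best_nmi, selected_nmi):
    """Relative percentage phi(NMI) loss of the supraconsensus-selected
    solution wrt the best available consensus solution
    (hypothetical helper, for illustration only)."""
    return 100.0 * (best_nmi - selected_nmi) / best_nmi

# Illustrative values (not from the thesis): if the best consensus
# solution scores phi(NMI) = 0.80 and the selected one 0.72,
# the relative loss is 10%.
loss = relative_nmi_loss(0.80, 0.72)
```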

To conclude, it can be asserted that, while the proposed consensus self-refining procedure introduces notable gains in the quality of consensus clustering solutions, there is still room for taking full advantage of it, as the entirely unsupervised selection of the highest-quality consensus solution is not yet a fully solved problem.

