29.04.2013 Views

TESI DOCTORAL - La Salle

List of Figures

1.1 Evolution of the total number of websites across all Internet domains, from November 1995 to February 2009 . . . 2

1.2 Schematic diagram of the steps involved in a knowledge discovery process . . . 4

1.3 Taxonomy of data mining methods . . . 6

1.4 Toy example of a hierarchical clustering dendrogram . . . 10

1.5 Illustration of the data representation indeterminacy on the Wine and miniNG data sets clustered by the rbr-corr-e1 algorithm . . . 21

1.6 Block diagram of the robust multimodal clustering system based on self-refining hierarchical consensus architectures . . . 26

2.1 Scatterplot of an artificially generated two-dimensional toy data set containing n = 9 objects . . . 28

2.2 Schematic representation of a consensus clustering process on a hard cluster ensemble . . . 29

3.1 Flat vs hierarchical construction of a consensus clustering solution on a hard cluster ensemble . . . 47

3.2 Three examples of topologies of random hierarchical consensus architectures . . . 52

3.3 Evolution of RHCA parameters as a function of the mini-ensembles size b . . . 54

3.4 Estimated and real running times of the serial and parallel RHCA implementations on the Zoo data collection in the |dfA| = 1 diversity scenario . . . 58

3.5 Estimated and real running times of the serial and parallel RHCA implementations on the Zoo data collection in the |dfA| = 10 diversity scenario . . . 59

3.6 Estimated and real running times of the serial and parallel RHCA implementations on the Zoo data collection in the |dfA| = 19 diversity scenario . . . 61

3.7 Estimated and real running times of the serial and parallel RHCA implementations on the Zoo data collection in the |dfA| = 28 diversity scenario . . . 62

3.8 Evolution of the accuracy of RHCA running time estimation as a function of the number of consensus processes . . . 65

3.9 An example of a deterministic hierarchical consensus architecture . . . 71

3.10 Estimated and real running times of the serial and parallel dHCA implementations on the Zoo data collection in the |dfA| = 1 diversity scenario . . . 77
