
1.2. Clustering in knowledge discovery and data mining

weighted graph whose edges reflect the similarity between each pair of objects. This makes graph based clustering conceptually similar to hierarchical clustering (Jain and Dubes, 1988), a good example of which is the Chameleon hierarchical clustering algorithm, which is based on the k-nearest neighbour graph (Karypis, Han, and Kumar, 1999). Other (non-hierarchical) graph based clustering algorithms are i) Zahn's algorithm (Zahn, 1971), which creates clusters by discarding the longest edges of the minimal spanning tree, ii) CLICK, based on computing the minimum weight cut to form clusters (Sharan and Shamir, 2000), and iii) MajorClust, which is based on the weighted partial connectivity of the graph, a measure whose maximization implicitly determines the optimum number of clusters to be found (Stein and Niggemann, 1999). Lastly, the more recent family of spectral clustering algorithms can be included in the context of graph based clustering. These algorithms are often reported to outperform traditional clustering techniques (von Luxburg, 2006). In addition, spectral clustering is simple to implement and can be solved efficiently by standard linear algebra methods, as, in short, it boils down to computing the eigenvectors of the Laplacian matrix of the graph. Several variants of spectral clustering algorithms have been proposed, differing in the way the similarity graph and the Laplacian matrix are computed (e.g. (Shi and Malik, 2000; Ng, Jordan, and Weiss, 2002)).
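The "eigenvectors of the Laplacian" recipe can be sketched in a few lines. The following is a minimal illustration of unnormalized spectral clustering (Gaussian similarity graph, unnormalized Laplacian L = D − W), not any specific variant cited above; the function name, the sigma parameter, the farthest-point initialization and the plain k-means step are choices made for this sketch.

```python
import numpy as np

def spectral_clustering(X, k, sigma=1.0, n_iter=50):
    """Sketch of unnormalized spectral clustering on an n x d data matrix."""
    # Gaussian similarity graph W from pairwise squared distances.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(1)) - W
    # The eigenvectors of the k smallest eigenvalues embed the cluster structure.
    _, vecs = np.linalg.eigh(L)          # eigh returns ascending eigenvalues
    U = vecs[:, :k]                      # rows = spectral embeddings of the objects
    # Farthest-point initialization, then plain Lloyd k-means on the embedding.
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([((U - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(U[int(d.argmax())])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(0)
    return labels
```

For two well-separated groups the cross-group similarities are nearly zero, so the graph is almost disconnected and the two smallest eigenvectors are approximately piecewise constant on the groups, which is what makes the final k-means step trivial.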

4. Clustering based on combinatorial search: this approach considers clustering as a combinatorial optimization problem that can be solved by applying search techniques for finding the global (or approximately global) optimum clustering solution. In this context, two paradigms have been followed in the design of clustering algorithms: stochastic optimization methods and deterministic search techniques. Among the former, some of the most popular approaches are based on evolutionary computation (e.g. hard or soft clustering based on genetic algorithms (Hall, Özyurt, and Bezdek, 1999; Tseng and Yang, 2001)), simulated annealing (e.g. (Selim and Al-Sultan, 1991)), Tabu search (Al-Sultan, 1995) and hybrid solutions (Chu and Roddick, 2000; Scott, Clark, and Pham, 2001), whereas deterministic annealing is the most typical deterministic search technique applied to clustering (Hofmann and Buhmann, 1997).
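As an illustration of the stochastic branch, the sketch below applies simulated annealing to one common clustering objective, minimizing the within-cluster sum of squared errors. It shows the generic Metropolis acceptance rule on random label reassignments, not the particular algorithm of Selim and Al-Sultan (1991); the cost function, cooling schedule and parameter values are assumptions of this sketch.

```python
import math
import random

def sse(X, labels, k):
    """Within-cluster sum of squared errors (the cost being minimized)."""
    total = 0.0
    for j in range(k):
        pts = [x for x, l in zip(X, labels) if l == j]
        if not pts:
            continue
        dim = len(pts[0])
        c = [sum(p[d] for p in pts) / len(pts) for d in range(dim)]
        total += sum(sum((p[d] - c[d]) ** 2 for d in range(dim)) for p in pts)
    return total

def anneal_clustering(X, k, t0=1.0, cooling=0.995, steps=3000, seed=0):
    """Simulated annealing over label assignments with a Metropolis rule."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in X]   # random initial clustering
    cost = sse(X, labels, k)
    t = t0
    for _ in range(steps):
        # Candidate move: reassign one randomly chosen object.
        i = rng.randrange(len(X))
        old = labels[i]
        labels[i] = rng.randrange(k)
        new_cost = sse(X, labels, k)
        # Accept downhill moves always, uphill moves with probability e^{-delta/T}.
        if new_cost <= cost or rng.random() < math.exp(-(new_cost - cost) / t):
            cost = new_cost
        else:
            labels[i] = old                  # reject: undo the move
        t *= cooling                         # geometric cooling schedule
    return labels, cost
```

The occasional acceptance of uphill moves at high temperature is what lets the search escape the local optima that a purely greedy reassignment scheme would get trapped in.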

5. Clustering based on neural networks: the well-known learning and modelling abilities of neural networks have been exploited to solve clustering problems. The two most successful neural network paradigms applied to clustering are i) competitive learning, where Self-Organizing Maps (Kohonen, 1990) and Generalized Learning Vector Quantization (Karayiannis et al., 1996) play a salient role, and ii) adaptive resonance theory (Carpenter and Grossberg, 1987), which encompasses a whole family of neural network architectures that can be used for hierarchical (Wunsch et al., 1993) and soft (Carpenter, Grossberg, and Rosen, 1991) clustering.
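The competitive-learning principle can be illustrated with a minimal winner-take-all prototype update (online vector quantization). This is a bare sketch of the idea, not Kohonen's full SOM (it omits the topological neighbourhood) nor GLVQ; the learning-rate schedule and parameter names are assumptions of this sketch.

```python
import random

def competitive_learning(X, n_units, epochs=50, lr0=0.5, seed=0):
    """Winner-take-all competitive learning: for each presented input, only
    the nearest prototype (the 'winner') moves toward it. This is SOM training
    with the neighbourhood radius shrunk to zero."""
    rng = random.Random(seed)
    # Initialize prototypes on randomly chosen inputs.
    protos = [list(X[i]) for i in rng.sample(range(len(X)), n_units)]
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)    # linearly decaying learning rate
        for x in rng.sample(X, len(X)):    # shuffled presentation order
            # Winner = prototype closest to the input.
            w = min(range(n_units),
                    key=lambda j: sum((a - b) ** 2
                                      for a, b in zip(protos[j], x)))
            # Move only the winner a fraction lr of the way toward the input.
            protos[w] = [p + lr * (a - p) for p, a in zip(protos[w], x)]
    return protos
```

After training, each prototype settles near the mean of the inputs it wins, so the prototypes play the role of cluster centres and each object is assigned to its nearest prototype.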

6. Kernel based clustering: the rationale of kernel-based learning algorithms is to simplify the task of separating the objects in the data set by nonlinearly transforming them into a higher-dimensional feature space. Through the design of an inner-product kernel, the time-consuming and sometimes even infeasible process of explicitly describing the nonlinear mapping and computing the corresponding data points in the transformed space can be avoided (Xu and Wunsch II, 2005). A recent example of this approach is Support Vector Clustering (SVC) (Ben-Hur et al., 2001), which employs
