Multilevel Graph Clustering with Density-Based Quality Measures
2.4 Fundamental Clustering Strategies

… such situations. For example, this can be achieved by using random selection criteria or by adding a little randomness to the selector.

One often used method is simulated annealing [67, 68, 39, 38]. Typically a clustering similar to the current clustering is constructed, e.g. by randomly moving a vertex to another cluster. The main idea is to direct the search by not accepting every generated clustering. Better clusterings are always accepted, but worse clusterings are accepted only with a probability that depends on the modularity decrease ∆Q. Often the Metropolis criterion exp(∆Q/T) is applied. The parameter T controls the temperature of the algorithm. In the beginning the temperature is high and many modularity-decreasing moves are accepted, which enables a widespread search. Later the temperature is lowered, increasing the portion of modularity-increasing moves. With zero temperature only better clusterings are accepted, and thus the algorithm finishes in a local optimum.

Duch and Arenas [23] proposed a recursive bisection algorithm. The two clusters are computed by applying extremal optimization to an initial random bisection. Extremal optimization can be interpreted as a biased simulated annealing: the moved vertices are not chosen randomly but selected based on their current contribution to the modularity. This selection criterion is called vertex fitness. The idea is that moving low-fitness vertices also improves the clustering quality.

2.4.2 Constructing Components

This subsection presents various heuristics to construct clusterings. They use the selection criteria presented above to direct the construction. Dissection is one of the oldest strategies and is often used in the analysis of social networks; it observes how graphs and clusters fall apart. More details are presented in the next paragraph. The second paragraph describes agglomeration methods, which observe how large clusters can be built from smaller clusters. The last paragraph presents refinement methods that try to improve given clusterings by moving vertices between clusters.

Dissection
Dissection algorithms repeatedly remove vertices or edges from the graph and observe the emerging connectivity components. Because clusterings of the vertex set are sought, it is more common to remove edges. Removing an edge can split a cluster into at most two parts, so the produced hierarchical clustering is a binary tree.

Girvan [32] proposed a dissection algorithm which removes edges that lie between clusters and are least central to any cluster. These are identified by counting how many shortest paths between arbitrary vertex pairs pass through each edge. Fewer edges are expected between clusters, and thus more paths should pass through them. After each removal all betweenness values are recalculated. Later Girvan and Newman [59] proposed the random-walk betweenness as an improved selector. In the same paper, modularity was introduced as a measure for the optimal number of clusters.

Agglomeration
Agglomeration methods grow clusters by merging smaller clusters. The process begins with each vertex placed in a separate cluster. In each step a pair of clusters is selected and merged. Greedy methods only merge pairs which increase the modularity.
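The move-and-accept scheme described above for simulated annealing can be illustrated with a small sketch. This is only an illustration of the general Metropolis acceptance rule, not a reimplementation of any of the cited algorithms [67, 68, 39, 38]; the helpers modularity and move_random_vertex, as well as the geometric cooling schedule, are assumptions made for the example.

```python
import math
import random

def anneal(clustering, modularity, move_random_vertex,
           t_start=1.0, t_end=1e-3, cooling=0.99):
    """Metropolis-style search over clusterings (illustrative sketch).

    clustering         -- initial clustering (any representation)
    modularity(c)      -- assumed helper returning the quality Q of c
    move_random_vertex -- assumed helper returning a copy of c with one
                          vertex moved to another cluster
    """
    current, q_current = clustering, modularity(clustering)
    best, q_best = current, q_current
    t = t_start
    while t > t_end:
        candidate = move_random_vertex(current)
        delta_q = modularity(candidate) - q_current   # modularity change
        # Better clusterings are always accepted; worse ones only with
        # probability exp(delta_q / t), the Metropolis criterion.
        if delta_q >= 0 or random.random() < math.exp(delta_q / t):
            current, q_current = candidate, q_current + delta_q
            if q_current > q_best:
                best, q_best = current, q_current
        t *= cooling   # lower the temperature; fewer worsening moves pass
    return best
```

As the temperature drops, the acceptance probability for modularity-decreasing moves shrinks towards zero, so the search gradually narrows to a local optimum, matching the behaviour described in the text.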
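The edge-betweenness dissection attributed to Girvan [32] can be sketched as follows. The sketch uses the networkx library purely for illustration and assumes an undirected input graph; it shows the general scheme of repeated removal and recomputation, not the authors' implementation.

```python
import networkx as nx

def betweenness_dissection(graph):
    """Repeatedly remove the edge with the highest shortest-path
    betweenness and record the connected components after each split.
    Returns the sequence of clusterings from coarse to fine."""
    g = graph.copy()
    hierarchy = [list(nx.connected_components(g))]
    while g.number_of_edges() > 0:
        # Betweenness values are recalculated after every removal.
        betweenness = nx.edge_betweenness_centrality(g)
        u, v = max(betweenness, key=betweenness.get)   # most "between" edge
        g.remove_edge(u, v)
        components = list(nx.connected_components(g))
        if len(components) > len(hierarchy[-1]):       # a cluster fell apart
            hierarchy.append(components)
    return hierarchy
```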
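A deliberately naive sketch of the greedy agglomeration step, again assuming a modularity(clustering) helper; practical implementations track the merge gains ∆Q incrementally instead of recomputing the modularity for every candidate pair, but the logic of "merge the pair with the largest improvement" is the same.

```python
def greedy_agglomeration(vertices, modularity):
    """Start with singleton clusters and repeatedly merge the pair of
    clusters whose union increases modularity the most.  Stops when no
    merge improves the quality.  Clusterings are lists of frozensets."""
    clustering = [frozenset([v]) for v in vertices]
    q = modularity(clustering)
    while len(clustering) > 1:
        best_gain, best_merge = 0.0, None
        for i in range(len(clustering)):
            for j in range(i + 1, len(clustering)):
                merged = (clustering[:i] + clustering[i + 1:j]
                          + clustering[j + 1:]
                          + [clustering[i] | clustering[j]])
                gain = modularity(merged) - q
                if gain > best_gain:
                    best_gain, best_merge = gain, merged
        if best_merge is None:       # no modularity-increasing merge left
            break
        clustering, q = best_merge, q + best_gain
    return clustering
```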
