12.07.2015 Views

Multilevel Graph Clustering with Density-Based Quality Measures


1.2 Objectives and Outline

The main objective of this work is the implementation and evaluation of effective and efficient clustering algorithms for the modularity quality measure of Newman [60]. Optimizing the modularity is an NP-complete problem [13]. It is therefore unlikely that a fast, polynomial-time algorithm exists that finds a clustering of optimal modularity in every graph. Instead, relatively good clusterings have to suffice.

The major design goal for the clustering algorithm is effectiveness: the developed algorithm shall find clusterings of very good modularity compared to alternative methods. Efficiency is the second concern. Complicated and expensive algorithms should be replaced by simpler alternatives when they are not able to produce significantly better clusterings.

This work focuses on multi-level refinement heuristics. In the past these were successfully applied to similar problems such as minimum cut partitioning [41, 43, 78]. However, modularity clustering differs significantly from these problems. For example, the optimization aim is more complex because the number of clusters is not known in advance but has to be discovered by the algorithm. To date no detailed study exists on the adaptation of multi-level refinement to modularity clustering. In this context, a comparison of effectiveness and efficiency with other clustering algorithms is also necessary.

Chapter 2 introduces the mathematical formulation of the modularity clustering aim and discusses related concepts. This analysis provides the basis for the later development of strategies specific to the modularity. The chapter concludes in Section 2.4 with a study of existing clustering algorithms, from which fundamental strategies and components are derived. The developed multi-level algorithm will combine several particularly useful components to benefit from their advantages.

Based on the insights into the mathematical structure and the fundamental clustering strategies, Chapter 3 presents the developed family of algorithms. The algorithms are based on multi-level refinement methods adapted to the special needs of modularity clustering. They operate in two phases. First, a hierarchy of coarsened graphs is produced. Then this hierarchy is used in the refinement of an initial clustering.

The coarsening phase involves the selection of pairs of clusters to be merged. A sub-goal of this work is the exploration of alternative selection criteria in order to improve the effectiveness and the overall clustering results. This includes criteria derived from random walks and spectral methods, as presented in Section 3.3. In addition, the implementation should incorporate appropriate data structures and algorithms to support these merge selectors efficiently.

The second major component of the algorithm is the set of cluster refinement heuristics. These try to improve an initial clustering by exploring similar clusterings and selecting one of them. Here such neighbor clusterings are generated by moving single vertices. The objective of Section 3.4 is the exploration of heuristics for the search for local optima, i.e. clusterings without any neighbor clustering of better modularity. Efficiency is an important concern here because many traditional heuristics, like Kernighan-Lin refinement [45, 24], cannot be implemented efficiently for the modularity measure.
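The idea of refining a clustering by moving single vertices can be illustrated with a minimal Python sketch. It is not the algorithm developed in this work. It assumes the standard Newman-Girvan form of modularity for an undirected, unweighted graph without self-loops (the measure is only defined formally in Chapter 2), and it recomputes the modularity from scratch for every candidate move, which is simple but inefficient. The function names `modularity` and `refine_by_vertex_moves` and the example graph are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def modularity(edges, cluster_of):
    """Newman-Girvan modularity, assuming an undirected, unweighted
    graph without self-loops (an assumption of this sketch)."""
    m = len(edges)
    intra = defaultdict(int)   # edges with both end vertices in the cluster
    degree = defaultdict(int)  # sum of vertex degrees per cluster
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def refine_by_vertex_moves(edges, cluster_of):
    """Greedy local search: keep moving single vertices to adjacent
    clusters as long as a move strictly improves the modularity."""
    cluster_of = dict(cluster_of)  # do not modify the caller's clustering
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    best = modularity(edges, cluster_of)
    improved = True
    while improved:
        improved = False
        for v in sorted(neighbors):
            home = cluster_of[v]
            # Candidate targets are the clusters of adjacent vertices.
            targets = {cluster_of[w] for w in neighbors[v]} - {home}
            for target in sorted(targets):
                cluster_of[v] = target
                q = modularity(edges, cluster_of)
                if q > best + 1e-12:
                    best, home, improved = q, target, True
            cluster_of[v] = home  # keep the best cluster found for v
    return cluster_of, best


# Hypothetical example: two triangles joined by the bridge (2, 3);
# vertex 2 starts in the wrong cluster.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
start = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 1}
refined, q = refine_by_vertex_moves(edges, start)
```

On this small example the search moves vertex 2 back to its triangle and stops in a local optimum with modularity 5/14 (about 0.36), up from about 0.12 for the starting clustering. Recomputing the measure for every candidate move costs time linear in the number of edges, which is why an efficient implementation would evaluate the modularity change of a move incrementally instead.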
