Multilevel Graph Clustering with Density-Based Quality Measures


3 The Multi-Level Refinement Algorithm

Data: graph, clustering, selector
Result: clustering
current ← clustering;
// outer loop
repeat
    start ← current;
    peak ← current;
    mark all vertices as unmoved;
    // inner loop
    while unmoved vertices exist do
        v ← selector: find best ranked, unmoved vertex;
        j ← selector: find best target cluster for v;
        mark v as moved;
        if modularity is decreasing ∧ current better than peak then
            peak ← current;
        move v to cluster j and update selector;
    if peak better than current then
        current ← peak;
until current not better than start;
clustering ← current;

Figure 3.12: Refinement Method: basic Kernighan-Lin

checked at modularity-decreasing moves in the inner loop. In order not to revert modularity-decreasing moves directly, processed vertices are marked and ignored in later inner iterations. If the inner loop found an improved clustering, it is contained in either the current or the peak clustering. This is checked after the inner loop, and the better of the two is returned to the outer loop. Altogether, the outer loop restarts the refinement on the best intermediate clustering until no further improvement is found.

Only the Kernighan-Lin algorithm with maxmod vertex ranking moves fully into local optima. Its inner loop starts like greedy refinement and performs modularity-increasing moves until it reaches a local optimum; moves that were suppressed because their vertices were marked are performed in the second outer iteration. Unfortunately, the other vertex rankings do not share this property. When they propose vertices without modularity-increasing moves before a local optimum is reached, these vertices are moved anyway. Thus it becomes unlikely that a local optimum is reached.

Similar to greedy refinement with maxmod vertex ranking, the time required for a complete run of the inner loop is O(|V| (|E| + |V| max |C|)).
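The control flow of Figure 3.12 can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this work: the selector is replaced by a hypothetical brute-force `best_move` that recomputes the modularity for every candidate move, the quality measure is assumed to be the standard Newman modularity of an unweighted, undirected graph, and all function names are chosen for illustration only.

```python
from collections import defaultdict

EPS = 1e-12  # tolerance so floating-point noise never counts as an improvement


def modularity(edges, cluster_of):
    """Newman modularity of an unweighted, undirected graph (assumed measure)."""
    m = len(edges)
    internal = defaultdict(int)  # edges with both ends inside a cluster
    degree = defaultdict(int)    # sum of vertex degrees per cluster
    for u, v in edges:
        degree[cluster_of[u]] += 1
        degree[cluster_of[v]] += 1
        if cluster_of[u] == cluster_of[v]:
            internal[cluster_of[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def best_move(edges, cluster_of, unmoved):
    """Hypothetical selector stand-in: best (modularity, vertex, target) by brute force."""
    fresh = max(cluster_of.values()) + 1  # integer id of a new, empty cluster
    targets = sorted(set(cluster_of.values())) + [fresh]
    best = None
    for v in sorted(unmoved):
        for j in targets:
            if j == cluster_of[v]:
                continue
            q = modularity(edges, {**cluster_of, v: j})
            if best is None or q > best[0] + EPS:
                best = (q, v, j)
    return best


def kernighan_lin_refine(edges, clustering):
    """Outer and inner loop as in Figure 3.12; cluster ids must be integers."""
    current = dict(clustering)
    while True:  # outer loop
        start, peak = dict(current), dict(current)
        unmoved = set(current)
        while unmoved:  # inner loop: every vertex is moved exactly once
            q_after, v, j = best_move(edges, current, unmoved)
            unmoved.remove(v)
            q_now = modularity(edges, current)
            # Remember the clustering just before a modularity-decreasing move.
            if q_after < q_now - EPS and q_now > modularity(edges, peak) + EPS:
                peak = dict(current)
            current[v] = j
        if modularity(edges, peak) > modularity(edges, current) + EPS:
            current = peak
        if not modularity(edges, current) > modularity(edges, start) + EPS:
            return current


# Toy input (made up for illustration): two triangles joined by the edge 2-3,
# with vertex 3 initially placed in the wrong cluster.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
initial = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1}
refined = kernighan_lin_refine(edges, initial)
print(modularity(edges, initial), modularity(edges, refined))
```

Because the inner loop always runs until every vertex has been moved once, even this toy run performs many modularity-decreasing moves after the best clustering has been passed; only the peak bookkeeping recovers it.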
In contrast to the complete greedy refinement, however, this algorithm does not abort in local optima. This makes the basic algorithm very expensive, which is also confirmed by empirical experience.

Creation of Clusters. The creation of clusters is an aspect specific to modularity clustering because the optimal number of clusters is not known. Initial observations with the basic algorithm revealed an excessive creation of new clusters in the inner loop. For example, the scatter plot in Figure 3.13a shows the relation between modularity and the number of clusters at the end of inner loops. In addition, the positions of peak clusterings are shown by blue circles. Moreover, Figure 3.13b shows the beginning of multi-level Kernighan-Lin refinement on the same graph. The black line plots the modularity and the red line the number of clusters. Again, blue circles mark cluster counts at peak clusterings. Both graphics underline that all peak clusterings are found with a small number of clusters, while the algorithm tends to create a huge number of clusters.

Figure 3.13: Kernighan-Lin Refinement Creating Clusters on Graph Epa main. (a) Modularity and Cluster Count (axes: Clusters, Modularity). (b) Dynamics of Kernighan-Lin Refinement (axes: Move, Modularity, Clusters). Peak clusterings are marked in both panels.

This behavior can be explained as follows: after all vertices that fit well into another cluster have been moved, the algorithm continues with vertices that are very well connected to their current cluster. Moving such a vertex into a new cluster often decreases the modularity less than moving it into another existing cluster. Thus a new cluster is created for that vertex.

The excessive creation of clusters causes two major problems. Firstly, the search for better clusterings is quickly directed to low-modularity clusterings and explores only a few clusterings with cluster counts near the optimum. This wastes time in uninteresting parts of the clustering space. Secondly, and more importantly, the time complexity of the whole algorithm is coupled to the number of clusters. Thus high cluster counts significantly impair the performance.

Effective Search Depth. The moves performed by the algorithm can be interpreted as a depth-first search into the space of possible clusterings, with the vertex and target ranking determining the direction.
The complete execution of the inner loop is very time consuming. However, already after a small number of moves the search is far away from optimal clusterings and cannot return because of the marked vertices.

In order to safely abort the inner loop earlier, the number of moves between peak clusterings is analyzed. For this purpose let the effective search depth be the max-
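The explanation given earlier in this section for the creation of clusters (isolating a well-connected vertex costs less modularity than moving it into an unrelated existing cluster) can be checked on a small example. The sketch below assumes the standard Newman modularity of an unweighted, undirected graph; the toy graph, the cluster labels, and the helper `delta_after_move` are made up for illustration and are not taken from the benchmark collection.

```python
from collections import defaultdict


def modularity(edges, cluster_of):
    """Newman modularity of an unweighted, undirected graph (assumed measure)."""
    m = len(edges)
    internal = defaultdict(int)  # edges with both ends inside a cluster
    degree = defaultdict(int)    # sum of vertex degrees per cluster
    for u, v in edges:
        degree[cluster_of[u]] += 1
        degree[cluster_of[v]] += 1
        if cluster_of[u] == cluster_of[v]:
            internal[cluster_of[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Toy graph (illustrative assumption): two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
base = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}


def delta_after_move(vertex, target):
    """Modularity change when `vertex` is moved from its cluster to `target`."""
    trial = dict(base)
    trial[vertex] = target
    return modularity(edges, trial) - modularity(edges, base)


# Vertex 0 is connected only to its own cluster A.
delta_existing = delta_after_move(0, "B")    # move into the other existing cluster
delta_new = delta_after_move(0, "new")       # move into a newly created cluster
print(delta_existing, delta_new)
```

Both moves lower the modularity, but the move into the new cluster lowers it less. A ranking that must move vertex 0 somewhere therefore prefers to create a cluster, which matches the growth in cluster count described for Figure 3.13.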
