Multilevel Graph Clustering with Density-Based Quality Measures


3 The Multi-Level Refinement Algorithm

Substantial differences to the classic k-way partition refinement¹ exist: The number of clusters is not predetermined and no balance constraints on their size are given. Instead, determining the right number of clusters has to be achieved by the algorithm. Clusters may be created and emptied as necessary. The modularity clustering quality is more complex than the minimum edge cut because the volume introduces global dependencies. Due to the multi-level approach it suffices to move single vertices instead of constructing vertex groups as in the original Kernighan-Lin algorithm.

The algorithms are dissected into the components listed below. The next paragraphs discuss these components in more detail and comment on their dependencies. The section concludes with an overview of how basic refinement algorithms fit into this design space.

Vertex Ranking: Which vertices should be selected for movement?
Target Ranking: To where should selected vertices be moved?
Search Mode: Which vertices and targets are evaluated before selecting one of them?
Ranking Updates: Is the vertex ranking updated after each move or fixed?
On Modularity Decrease: What to do with selected moves not improving modularity?

Vertex and Target Ranking. The decision which vertices are moved to which cluster is split into two aspects. The vertex ranking directs the selection of vertices and the target ranking selects the cluster. The vertex ranking is used as a predictor similar to the merge selector in the coarsening phase and should direct the refinement into good, modularity-increasing directions.

Given a vertex $v$, its current cluster $C(v)$, and a target cluster $j \in C \cup \{\emptyset\}$, the change of modularity $\Delta_{v,j} Q(C) = Q(C') - Q(C)$ is calculated as shown in the equations below. The cluster $\emptyset$ is used for new, empty clusters. By definition, edge weights and volumes involving this cluster are zero.
The maxmod vertex ranking sorts the vertices by their best increase of modularity, $\mathrm{maxmod}(v) = \max_j \Delta_{v,j} Q(C)$. The single implemented target ranking is the maxmod ranking, which selects the neighbor cluster with maximal increase of modularity, $\arg\max_j \Delta_{v,j} Q(C)$.

\[ \Delta_{v,j} Q(C) = \frac{f(v, C_j - v) - f(v, C[v] - v)}{f(V)} - \frac{\mathrm{Vol}(v, C_j - v) - \mathrm{Vol}(v, C[v] - v)}{\mathrm{Vol}(V)} \tag{3.15} \]

\[ \Delta_{v,\emptyset} Q(C) = -\frac{f(v, C[v] - v)}{f(V)} + \frac{\mathrm{Vol}(v, C[v] - v)}{\mathrm{Vol}(V)} \tag{3.16} \]

The first term depends just on the adjacent vertices, as shown in Figure 3.9. To compute the edge weights $f(v, C_j - v)$, all incident edges of $v$ are visited and their weights are summed in an array grouped by the cluster of the end-vertex. Thus computing the maxmod ranking costs linear time in the number of incident edges and the number of clusters. On average this is $O(|E|/|V| + |C|)$.

¹ k equal-sized clusters, minimum edge cut

[Figure 3.9, Section 3.4 Cluster Refinement: Dependencies when Moving a Vertex. Moving $v$ from $C(v)$ to $j$ involves the current cluster through $f(v, C[v] - v)$ and the target cluster through $f(v, C_j - v)$; other clusters do not change.]

As visible, the change of modularity depends through the volume term on the global clustering structure and changes with nearly each vertex move: The volume is computed according to the volume model with $\mathrm{Vol}(v, C_j - v) = (w(v) - 1)\, w(C_j)$. The latter value is the sum over all vertex weights in the cluster. To evaluate the volume in constant time, this value is stored per cluster and iteratively updated each time a vertex moves out of or into the cluster. This global dependency is the key point which makes modularity refinement more expensive than the standard min-cut refinement.

Instead of ranking vertices by the expensive-to-compute modularity increase, good candidate vertices may also be identified by their current contribution to the modularity, $f(v, C[v]) - \rho(V)\,\mathrm{Vol}(v, C[v])$. Vertices with a low or negative contribution have a good chance of not belonging to their current cluster. Moving a vertex does not modify the contribution of its self-edge. Therefore omitting self-edges leaves $\mathrm{mod}(v) = f(v, C[v] - v) - \rho(V)\,\mathrm{Vol}(v, C[v] - v)$. This computation can be done in constant time by storing $f(v, C[v] - v)$ for each vertex and iteratively updating it when adjacent vertices are moved.

Based on the contribution to modularity, several vertex rankings were derived. It can be assumed that the placement of high-degree vertices has a higher influence on the modularity. To suppress this effect, the modularity contribution of a vertex can be divided by the vertex degree. Alternatively, to stay consistent with the density model, it can be scaled by the inter-volume between the vertex and its cluster.
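As an illustration of Equations (3.15) and (3.16), the following Python sketch evaluates the modularity change of a single move and selects the maxmod target. It is not the thesis implementation. The function names `weights_to_clusters`, `delta_q`, and `maxmod_target` are hypothetical, and several assumptions are made for the example: the degree multiplicity volume model with Vol(v, C - v) = deg(v) deg(C - v), f(V) taken as the total edge weight, Vol(V) = deg(V)^2 / 2 (a normalization under which the result agrees with the change of Newman's modularity), and a graph without self-edges. The per-cluster degree sums are assumed to be stored and kept up to date by the caller.

```python
from collections import defaultdict

def weights_to_clusters(v, adj, cluster_of):
    """Sum the weights of v's incident edges, grouped by the end-vertex's cluster."""
    sums = defaultdict(float)
    for u, w in adj[v].items():
        if u != v:  # self-edges are unaffected by moving v
            sums[cluster_of[u]] += w
    return sums

def delta_q(v, target, f_to, cluster_of, deg, cluster_deg, f_total, vol_total):
    """Modularity change for moving v to `target`; None stands for a new, empty cluster."""
    cur = cluster_of[v]
    if target == cur:
        return 0.0
    f_cur = f_to.get(cur, 0.0)
    vol_cur = deg[v] * (cluster_deg[cur] - deg[v])  # assumed volume model
    f_tgt = 0.0 if target is None else f_to.get(target, 0.0)
    vol_tgt = 0.0 if target is None else deg[v] * cluster_deg[target]
    return (f_tgt - f_cur) / f_total - (vol_tgt - vol_cur) / vol_total

def maxmod_target(v, adj, cluster_of, deg, cluster_deg, f_total, vol_total):
    """Return (best gain, best target) over neighbor clusters and the empty cluster."""
    f_to = weights_to_clusters(v, adj, cluster_of)
    candidates = [j for j in f_to if j != cluster_of[v]] + [None]
    gains = [(delta_q(v, j, f_to, cluster_of, deg, cluster_deg, f_total, vol_total), j)
             for j in candidates]
    return max(gains, key=lambda g: g[0])

# Example: two triangles joined by the edge 2-3; vertex 2 starts in the wrong cluster.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = defaultdict(dict)
for a, b in edges:
    adj[a][b] = adj[b][a] = 1.0
cluster_of = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 1}
deg = {v: sum(adj[v].values()) for v in adj}
cluster_deg = defaultdict(float)
for v, c in cluster_of.items():
    cluster_deg[c] += deg[v]
f_total = sum(deg.values()) / 2         # total edge weight (assumed f(V))
vol_total = sum(deg.values()) ** 2 / 2  # assumed Vol(V)
gain, target = maxmod_target(2, adj, cluster_of, deg, cluster_deg, f_total, vol_total)
print(gain, target)  # the best target for vertex 2 is cluster 0
```

The edge weights are gathered once per vertex and reused for every candidate cluster, which mirrors the cost argument above: one pass over the incident edges plus one evaluation per candidate.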
Since the vertex rankings based on the modularity contribution are structurally and semantically similar to the vertex fitness used in extremal optimization [23], they are called fitness measures. In contrast to other vertex rankings, lower values are preferred here. In summary these are:

mod-fitness: The modularity contribution $\mathrm{mod}(v) = f(v, C[v] - v) - \rho(V)\,\mathrm{Vol}(v, C[v] - v)$.
eo-fitness: The vertex fitness used in extremal optimization, $\mathrm{eof}(v) = \mathrm{mod}(v)/\deg(v)$.
density-fitness: The density of connections to the current cluster, $\rho(v, C[v] - v) = f(v, C[v] - v)/\mathrm{Vol}(v, C[v] - v)$. This is ranking-equivalent to $\mathrm{mod}(v)/[\deg(v)\deg(C[v] - v)]$ in the degree multiplicity volume model.
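The three fitness measures can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the thesis code: the names `fitness` and `rank_by_fitness` are hypothetical, the degree multiplicity volume model Vol(v, C - v) = deg(v) deg(C - v) is assumed, the global density is assumed to be rho(V) = f(V)/Vol(V) with f(V) the total edge weight and Vol(V) = deg(V)^2 / 2, and the graph has no self-edges. For brevity, f(v, C[v] - v) is recomputed from the adjacency lists instead of being stored per vertex and updated incrementally.

```python
def fitness(v, adj, cluster_of, deg, cluster_deg, rho):
    """Return the three fitness values of v with respect to its current cluster."""
    cur = cluster_of[v]
    # f(v, C[v] - v): weight of edges from v to other vertices of its own cluster
    f_cur = sum(w for u, w in adj[v].items() if u != v and cluster_of[u] == cur)
    vol_cur = deg[v] * (cluster_deg[cur] - deg[v])  # assumed volume model
    mod = f_cur - rho * vol_cur
    return {
        "mod": mod,
        "eo": mod / deg[v] if deg[v] else 0.0,
        "density": f_cur / vol_cur if vol_cur else 0.0,
    }

def rank_by_fitness(adj, cluster_of, deg, cluster_deg, rho, measure):
    """Sort vertices ascending by the chosen measure; lower values are preferred."""
    scores = {v: fitness(v, adj, cluster_of, deg, cluster_deg, rho)[measure]
              for v in adj}
    return sorted(scores, key=scores.get)

# Example: two triangles joined by the edge 2-3; vertex 2 starts in the wrong cluster.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = {v: {} for v in range(6)}
for a, b in edges:
    adj[a][b] = adj[b][a] = 1.0
cluster_of = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 1}
deg = {v: sum(adj[v].values()) for v in adj}
cluster_deg = {}
for v, c in cluster_of.items():
    cluster_deg[c] = cluster_deg.get(c, 0.0) + deg[v]
f_total = sum(deg.values()) / 2         # assumed f(V)
vol_total = sum(deg.values()) ** 2 / 2  # assumed Vol(V)
rho = f_total / vol_total               # assumed global density rho(V)

for measure in ("mod", "eo", "density"):
    print(measure, rank_by_fitness(adj, cluster_of, deg, cluster_deg, rho, measure))
```

In this small example all three measures place vertex 2 first, which matches the intuition that a vertex with a low or negative contribution probably does not belong to its current cluster.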