Multilevel Graph Clustering with Density-Based Quality Measures


3 The Multi-Level Refinement Algorithm

[Figure 3.5: Merging two Vertices and their Edges. Recoverable labels: merge edges, move edge, new self-edge.]

selector informs the heap about the new neighbor and selection quality and the vertex is moved up and down through the heap into its correct position. Empirical tests suggested that the runtime is already acceptable without the heap. Thus it is disabled by default to reduce possible sources of error and regression.

A temporary coarse graph is maintained to store edge weights and adjacency information for the merge selector. In this graph each cluster is represented by a vertex. Merging two clusters equals merging their vertices and incident edges. This is known as edge contraction and is the most expensive operation of each merge step because the incident edges of both vertices have to be processed. Call the two vertices involved the removed vertex and the target vertex. For example, in Figure 3.5 vertex b is removed and vertex a is the target. Edges to a common neighbor are merged by simply dropping the superfluous edge of the removed vertex. Edges to a vertex not adjacent to the target vertex have to be moved to it.

The merge data structures proposed by Wakita and Tsurumi [76] were implemented. The outgoing edges of each vertex are stored in a doubly linked list sorted by the end-vertex index. Both lists can then be merged in one pass without searching for the matching out-edges of the target vertex. Some special cases arise, however. Updating self-edges is difficult because their position is not predictable; thus they are always stored as the first out-edge. Unfortunately the original paper leaves open how to correct the position of moved edges in the list of their end-vertices. The lists should be sorted by the end-vertex index, but moving an edge changes the index from the removed to the target vertex. The simplest fix is to reinsert such edges into the list, which costs linear time in the size of the list.
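The one-pass merge of two sorted edge lists and the reinsertion fix can be illustrated with a minimal Python sketch. It is not the thesis implementation: it uses plain sorted Python lists instead of doubly linked lists, keeps self-edge weights in a separate dictionary instead of storing them as the first out-edge, and assumes that edges to a common neighbor are combined by adding their weights. The names merge_sorted_edges, contract, adj and loops are hypothetical.

```python
from bisect import bisect_left


def merge_sorted_edges(target_edges, removed_edges):
    """Merge two (end_vertex, weight) lists sorted by end_vertex in one pass.

    Assumption of this sketch: edges to a common neighbor are combined
    by summing their weights.
    """
    merged = []
    i = j = 0
    while i < len(target_edges) and j < len(removed_edges):
        (u, wu), (v, wv) = target_edges[i], removed_edges[j]
        if u == v:          # common neighbor: keep a single edge
            merged.append((u, wu + wv))
            i += 1
            j += 1
        elif u < v:         # edge only at the target vertex
            merged.append((u, wu))
            i += 1
        else:               # edge only at the removed vertex: move it
            merged.append((v, wv))
            j += 1
    merged.extend(target_edges[i:])
    merged.extend(removed_edges[j:])
    return merged


def contract(adj, loops, target, removed):
    """Contract vertex `removed` into `target` (undirected, weighted).

    adj[v]   : list of (neighbor, weight), sorted by neighbor, no self-edges
    loops[v] : self-edge weight of v (stored separately in this sketch)
    """
    t_edges = [(n, w) for n, w in adj[target] if n != removed]
    r_edges = [(n, w) for n, w in adj[removed] if n != target]
    between = sum(w for n, w in adj[target] if n == removed)

    # The edge between both vertices becomes part of the new self-edge.
    loops[target] += loops.pop(removed) + between
    adj[target] = merge_sorted_edges(t_edges, r_edges)
    del adj[removed]

    # Correct the lists of the end-vertices: the entry for `removed`
    # is deleted and reinserted (or merged) at the position of `target`.
    for n, w in r_edges:
        edges = adj[n]
        k = bisect_left(edges, (removed,))
        del edges[k]
        k = bisect_left(edges, (target,))
        if k < len(edges) and edges[k][0] == target:
            edges[k] = (target, edges[k][1] + w)
        else:
            edges.insert(k, (target, w))   # linear-time list insertion
```

Calling contract(adj, loops, 0, 1) on a small graph merges vertex 1 into vertex 0. The list deletion and insertion in the last loop correspond to the linear-time reinsertion described above.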
Therefore the worst-case runtime O(|V|^2) of this coarsening method is reached when all |V| edges of the removed vertex have to be moved to the target vertex without merging edges.

The overall worst-case time complexity for a complete merge step is O(|V|^2): Selecting the best pair costs O(|V|) with a linear search or O(log |V|) using the heap. The two clusters are merged in constant time using union-find. Combining the edge lists requires quadratic time O(|V|^2) in the worst case. Updating the best-neighbor information also requires a linear search at each adjacent vertex in the worst case. This adds quadratic time O(|V|^2). In addition, updating the best pair heap may cost up to O(|V| log |V|). However, in sparse graphs the number of incident edges is often much smaller than the total number of vertices, and in dense graphs nearly no edges require movement because they point to a common neighbor. Hence the average merge time should be nearly linear instead of quadratic.

Related Work  Clauset et al. use search trees to store the incident edges [15]. Each edge of the removed vertex is removed from the tree of its end-vertex. Its entry is searched in the list of the target vertex. In case target and end-vertex are not already adjacent the edge is inserted into both trees. Each vertex can have at most |V| adjacent vertices, thus merging two vertices with this method would cost O(|V| log |V|) time in the worst case.

Greedy grouping is structurally very similar to the minimum spanning tree problem with the selection quality as edge weights. But most selection qualities change their values dynamically depending on merge steps. Still, some alternative techniques that avoid directly contracting edges and instead defer some operations to later steps might be applicable, cf. [27].

3.2.2 Greedy Matching

Data: graph, selector, match fraction, reduction factor
Result: clustering
// collect possible merges
foreach pair of adjacent vertices (a, b) do
    if merging a, b would improve quality then
        sq ← selector quality(a, b);
        add (a, b, sq) to pair list;
        num pairs ← num pairs + 1;
sort pair list by decreasing sq;
good ← match fraction ∗ num pairs + 1;
merge count ← reduction factor ∗ num vertices(graph) + 1;
foreach pair (a, b) in pair list do
    if a and b not marked then
        mark a and b;
        merge clusters of a and b;
        merge count ← merge count − 1;
        good ← good − 1;
    if merge count = 0 ∨ good = 0 then break;
Figure 3.6: Coarsening Method: Greedy Matching

The greedy matching method operates by first constructing a matching, i.e. a list of independent vertex pairs, and the next coarsening level could be directly generated from these pairs. The pseudo-code is shown in Figure 3.6. In the first phase all pairs of adjacent vertices are collected and sorted by their selection quality. Pairs not improving modularity are filtered out in advance.
This list is processed beginning with the best-ranked pairs. Merged vertices are marked and pairs with
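The procedure of Figure 3.6 can be sketched in Python as follows. This is an illustrative reading of the pseudo-code, not the thesis implementation: the function name greedy_matching and its parameters are hypothetical, the selection quality and the improvement test are passed in as callables, products are truncated to integers (the pseudo-code does not specify rounding), and the function returns the matched pairs instead of merging clusters directly.

```python
def greedy_matching(edges, num_vertices, selector_quality, improves_quality,
                    match_fraction, reduction_factor):
    """Return a list of independent vertex pairs chosen greedily.

    edges            : iterable of adjacent vertex pairs (a, b)
    selector_quality : callable (a, b) -> number, higher is better
    improves_quality : callable (a, b) -> bool, filters useless merges
    """
    # Phase 1: collect possible merges and rank them by selection quality.
    pair_list = [(selector_quality(a, b), a, b)
                 for a, b in edges if improves_quality(a, b)]
    pair_list.sort(key=lambda p: p[0], reverse=True)

    # Limits taken from the pseudo-code; int() truncation is an assumption.
    good = int(match_fraction * len(pair_list)) + 1
    merge_count = int(reduction_factor * num_vertices) + 1

    # Phase 2: accept a pair only if neither vertex is already matched.
    marked = set()
    matching = []
    for _, a, b in pair_list:
        if a not in marked and b not in marked:
            marked.update((a, b))
            matching.append((a, b))
            merge_count -= 1
            good -= 1
        if merge_count == 0 or good == 0:
            break
    return matching
```

On a path graph with per-edge qualities, for example, the best-ranked edge is matched first and every later pair that shares a vertex with it is skipped. Returning the pairs keeps the sketch independent of any particular cluster data structure; a caller could then merge each pair, for instance with union-find.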
