Multilevel Graph Clustering with Density-Based Quality Measures


4 Evaluation

| parameter | component | description + values | default |
|---|---|---|---|
| coarsening method | coarsening | method to merge cluster pairs; greedy grouping, greedy matching | greedy grouping |
| reduction factor | coarsening | number of clusters to merge in each coarsening level; 5%–50% | 10% |
| match fraction | coarsening | number of best ranked pairs to consider in matching; 50%–100% | 50% |
| merge selector | coarsening | ranking of cluster pairs; modularity increase, weight density, RW distance, RW reachability, spectral length, spectral length difference, spectral angle | weight density |
| RW steps | selector | length of random walks | 2 |
| RW iterations | selector | iterative applications of reachability | 3 |
| spectral ratio | selector | number of eigenvectors to use, cut off value for λ_j/λ_1 | 20% |
| spectral max ev | selector | maximal number of eigenvectors | 30 |
| refinement method | refinement | method to move vertices; complete greedy, sorted greedy, Kernighan-Lin | sorted greedy |
| vertex selector | refinement | ranking of vertices in sorted greedy refinement; mod-fitness, eo-fitness, density fitness | density fitness |
| search depth | refinement | when to abort inner-loop search in Kernighan-Lin, multiplied by log10 |V| | 25 |

Table 4.1: The Configuration Space
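As a rough illustration, the defaults of Table 4.1 could be collected in a single configuration object. The sketch below is not taken from the thesis implementation: the class name `Config`, its field names, and the two helper methods are hypothetical. The helpers assume one plausible reading of the table: the search depth is multiplied by log10 |V| and rounded up, and an eigenvector is kept when its eigenvalue exceeds the spectral ratio times the largest eigenvalue, up to the maximal number of eigenvectors.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical container for the default values listed in Table 4.1."""

    # coarsening phase
    coarsening_method: str = "greedy grouping"
    reduction_factor: float = 0.10   # share of clusters merged per level
    match_fraction: float = 0.50     # only meaningful for greedy matching
    merge_selector: str = "weight density"
    # parameters of configurable merge selectors
    rw_steps: int = 2
    rw_iterations: int = 3
    spectral_ratio: float = 0.20     # cut-off for lambda_j / lambda_1
    spectral_max_ev: int = 30
    # refinement phase
    refinement_method: str = "sorted greedy"
    vertex_selector: str = "density fitness"
    search_depth: float = 25.0       # factor applied to log10 |V|

    def max_kl_moves(self, num_vertices: int) -> int:
        """Move limit for the Kernighan-Lin inner loop (rounding up is an assumption)."""
        return math.ceil(self.search_depth * math.log10(num_vertices))

    def num_eigenvectors(self, eigenvalues) -> int:
        """Count positive eigenvalues above the cut-off, capped at spectral_max_ev."""
        positive = sorted((ev for ev in eigenvalues if ev > 0), reverse=True)
        if not positive:
            return 0
        cutoff = self.spectral_ratio * positive[0]
        kept = [ev for ev in positive if ev > cutoff]
        return min(len(kept), self.spectral_max_ev)
```

A variant of the defaults could then be written as `dataclasses.replace(Config(), coarsening_method="greedy matching")`, which leaves every other parameter at its default.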
4.1 Methods and Data

of merge selectors. The parameters are grouped into a coarsening and a refinement phase, depending on what they directly influence. Some merge selectors are further configurable; their parameters are placed in the component called selector. Note that not all parameters are meaningful in all combinations. For example, the match fraction is used only by greedy matching. The last column contains the default values used throughout the evaluation where nothing different is stated.

The parameters controlling the spectral merge selectors and the search depth are fixed, and their influence will not be evaluated further, for the following reasons. The vertex vectors used in the spectral merge selectors are always approximations. Using more eigenvectors should improve the accuracy but requires more memory and time to compute them. Thus the maximal number of eigenvectors is limited to 30. In graphs with fewer than 30 clusters, correspondingly fewer eigenvectors contain meaningful information. Assuming that the positive eigenvalues quickly decay to zero, only the few eigenvectors with eigenvalues larger than 20% of the largest eigenvalue are used. The search depth is only used by the Kernighan-Lin refinement. Previous experiments on refinement algorithms in Section 3.4.3 showed that it can be restricted to 20 log10 |V| moves without missing better clusterings. For safety, the factor 25 will be used.

4.1.2 Effectiveness

The main tool used in this evaluation is the comparison of mean modularities. For this purpose, clusterings produced by specific algorithm configurations on a set of graphs are collected. The mean modularity of a configuration over the graphs is calculated using the arithmetic mean.

Of course, graphs with high modularity most influence the absolute value of the mean modularity. But the modularity measure is normalized to the range from zero to one. Thus all modularity values will be in the same range regardless of differing graph properties.
In addition, the absolute values are unimportant here, as algorithms and configurations are compared. The differences between mean modularities are most influenced by graphs with high modularity variations between the algorithms.

In many places the results gained with and without refinement will also be compared. Without refinement it becomes visible how much the parameters influence the raw graph coarsening. On the other hand, results with refinement indicate how well coarsening with the observed parameters supports the refinement algorithms later. Kernighan-Lin refinement is expected to be more resistant against difficulties as it can escape from local optima. Hence only sorted greedy refinement with the density-fitness vertex selector is used for these comparisons instead. This reference configuration will be abbreviated as sgrd in most places.

As a useful side effect, the modularity improvements with refinement over no refinement provide a significance scale. Fluctuations of the modularity much smaller than this scale are probably not of much interest. Similarly, improvements of different coarsening configurations which are easily compensated by greedy refinement are not important either.

For many questions of the evaluation, tables containing mean modularities will be produced. However, it is difficult to compare these raw values. Therefore, where possible, the mean modularity results will be plotted.
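The comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the evaluation code of the thesis: the configuration labels, the graph names, and all modularity values are made up, and the threshold `fraction` is a hypothetical stand-in for "much smaller than this scale", which the text does not quantify.

```python
from statistics import fmean

# Made-up modularity values for illustration only; these are NOT results
# from the evaluation. results[configuration][graph] = modularity.
results = {
    "grouping-none": {"graph_a": 0.40, "graph_b": 0.60, "graph_c": 0.80},
    "grouping-sgrd": {"graph_a": 0.44, "graph_b": 0.63, "graph_c": 0.82},
    "matching-none": {"graph_a": 0.39, "graph_b": 0.60, "graph_c": 0.80},
    "matching-sgrd": {"graph_a": 0.44, "graph_b": 0.63, "graph_c": 0.82},
}


def mean_modularity(per_graph):
    """Arithmetic mean of the modularities of one configuration over all graphs."""
    return fmean(per_graph.values())


def significance_scale(results, coarsening):
    """Mean improvement of sorted greedy refinement (sgrd) over no refinement."""
    refined = mean_modularity(results[f"{coarsening}-sgrd"])
    unrefined = mean_modularity(results[f"{coarsening}-none"])
    return refined - unrefined


def is_noteworthy(difference, scale, fraction=0.5):
    """Flag differences that are not much smaller than the scale.

    The value of `fraction` is an assumption chosen for this sketch.
    """
    return abs(difference) >= fraction * scale


scale = significance_scale(results, "grouping")
difference = (mean_modularity(results["grouping-none"])
              - mean_modularity(results["matching-none"]))
print(f"scale={scale:.4f} difference={difference:.4f} "
      f"noteworthy={is_noteworthy(difference, scale)}")
```

In this made-up data the two coarsening methods differ without refinement but reach the same mean modularity with sgrd, which is the situation the text describes as an improvement that greedy refinement easily compensates.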
