Multilevel Graph Clustering with Density-Based Quality Measures


3 The Multi-Level Refinement Algorithm

3.3 Merge Selectors

[Figure 3.8: Spectral vertex vectors and two cluster vectors. Shown are the vertex vectors $\tilde{x}_u$ and $\tilde{x}_v$ and the cluster vectors $\tilde{X}_1$ and $\tilde{X}_2 = \tilde{x}_u + \tilde{x}_v$.]

Theoretical Background  The modularity can also be described as a sum over all vertex pairs, using the Kronecker delta $\delta(C(u), C(v))$ as a filter with $\delta(C(u), C(v)) = 1$ only for pairs of the same cluster and zero elsewhere. This representation of the modularity is:

$$Q(C) = \sum_{u,v} \left[ \frac{f(u,v)}{f(V)} - \frac{\mathrm{Vol}(u,v)}{\mathrm{Vol}(V)} \right] \delta(C(u), C(v)) \tag{3.10}$$

The inner part of the sum is replaced by the modularity matrix $M_{u,v} = f(u,v)/f(V) - \mathrm{Vol}(u,v)/\mathrm{Vol}(V)$. For each cluster $i \in C$ the delta is separately replaced by a binary vector $s_i$ over all vertices with $[s_i]_v = \delta(C(v), i)$. The contribution of the cluster $i \in C$ thus is $s_i^T M s_i$. The spectral decomposition of the modularity matrix is $M = U D U^T$, using the matrix of eigenvectors $U = (u_1 | \dots | u_n)$ and the diagonal matrix of eigenvalues $D = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$. This is inserted into the modularity formula and the calculation is separated for each eigenvalue:

$$Q = \sum_{i \in C} s_i^T M s_i = \sum_{i \in C} (U^T s_i)^T D (U^T s_i) = \sum_{i \in C} \sum_{j=1}^{n} \lambda_j (u_j^T s_i)^2 \tag{3.11}$$

The previous step exploits that $u_j^T s_i = \sum_{v \in C_i} U_{v,j}$ is a scalar. As the next step, the eigenvalues are moved into the squared terms of the sum by using their square roots $\sqrt{\lambda_j}$. Unfortunately this is difficult with negative eigenvalues. Thus the eigenvalues and the inner sum are split up into positive and negative eigenvalues. Without loss of generality $\lambda_1 \geq \dots \geq \lambda_p > 0 \geq \lambda_{p+1} \geq \dots \geq \lambda_n$ and thus:

$$Q = \sum_{i \in C} \left( \sum_{j=1}^{p} \left( \sqrt{\lambda_j}\, u_j^T s_i \right)^2 - \sum_{j=p+1}^{n} \left( \sqrt{-\lambda_j}\, u_j^T s_i \right)^2 \right) = \sum_{i \in C} \left( \|X_i\|_2^2 - \|Y_i\|_2^2 \right) \tag{3.12}$$

The last step replaced the sums by squared Euclidean norms of the cluster vectors $X_i$ and $Y_i$. This is achieved by moving the eigenvalues further down into the eigenvectors, forming vertex vectors $x_v = (\dots, \sqrt{\lambda_j}\, U_{v,j}, \dots)^T$ with $j = 1, \dots, p$.
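The equivalence of equations (3.10) to (3.12) can be checked numerically. The following is a minimal sketch, not the thesis implementation. It assumes that $f(u,v)$ is the edge weight and $\mathrm{Vol}(u,v) = \deg(u)\deg(v)$ (Newman's null model); the function names and the small example graph are hypothetical.

```python
import numpy as np

def modularity_matrix(A):
    """M[u, v] = f(u, v)/f(V) - Vol(u, v)/Vol(V).

    Assumption: f is the edge weight and Vol(u, v) = deg(u) * deg(v).
    """
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    total = A.sum()  # f(V): sum of f(u, v) over all ordered vertex pairs
    return A / total - np.outer(deg, deg) / total**2

def modularity_direct(M, labels):
    """Equation (3.10): sum M[u, v] over all pairs in the same cluster."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]  # Kronecker delta as a mask
    return float(M[same].sum())

def modularity_spectral(M, labels):
    """Equations (3.11) and (3.12): per-cluster sums over the eigenvalues,
    split into positive and non-positive eigenvalues."""
    labels = np.asarray(labels)
    lam, U = np.linalg.eigh(M)  # M is symmetric
    pos = lam > 0
    q = 0.0
    for c in np.unique(labels):
        s = (labels == c).astype(float)  # indicator vector s_i
        proj = U.T @ s                   # the scalars u_j^T s_i
        q += np.sum(lam[pos] * proj[pos] ** 2)     # ||X_i||^2
        q -= np.sum(-lam[~pos] * proj[~pos] ** 2)  # ||Y_i||^2
    return float(q)

# Hypothetical example: two triangles joined by a single edge.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0
labels = [0, 0, 0, 1, 1, 1]

M = modularity_matrix(A)
q_direct = modularity_direct(M, labels)
q_spectral = modularity_spectral(M, labels)
print(q_direct, q_spectral)
```

Under these assumptions both routes give $Q = 5/14$ for the example, up to floating-point error.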
The vertex vectors $y_v$ are analogously constructed from all negative eigenvalues and their eigenvectors. The former scalar product with $s_i$ is transformed into $X_i = \sum_{v \in C_i} x_v$ and $Y_i = \sum_{v \in C_i} y_v$ respectively.

Therefore the cluster vectors $X_i$ are formed by adding the vectors $x_v$ of all vertices contained in the cluster. It becomes visible that the modularity is maximized by grouping vertices together with maximal $X_i$ and minimal $Y_i$ lengths. In the vertex vectors the eigenvectors are weighted with the square root of their eigenvalues. Thus only eigenvectors with a large eigenvalue have a significant influence. Moreover it is assumed that just using the positive eigenvalues suffices for the maximization. Thus all negative and weak positive eigenvalues are dropped from now on. This is controlled by the two parameters in the list below. The resulting approximate vertex vectors are denoted by $\tilde{x}_v$. An example of the relation between vertex and cluster vectors is shown in Figure 3.8.

spectral ratio: Multiplied with the largest eigenvalue, this defines the lower limit for accepted eigenvalues.
spectral max ev: The maximal number of eigenvalues and eigenvectors to compute.

Another important property besides the length of the vertex and cluster vectors is their direction, because in the optimum clustering the cluster vectors have to be orthogonal. This follows from the increase of vector length (and modularity) when merging two adjacent clusters whose vectors are separated by an angle below 90 degrees. Thus in $m$ dimensions at most $m + 1$ clusters are representable. Hence in practice it is necessary to compute as many eigenvectors as expected clusters.

Spectral Length and Length Difference  From the spectral analysis of the previous section it follows that the modularity is improved by maximizing the length of each cluster vector $\tilde{X}_i = \sum_{v \in C_i} \tilde{x}_v$. The first idea was to select cluster pairs $a, b$ with the longest vector sum $\|X_a + X_b\|_2$. Hence this selection quality is called spectral length. On closer inspection this selection quality is related to the modularity increase selector, as was already pointed out by Newman in [57]. The modularity increase in spectral decomposition is the contribution of the new cluster $\|X_a + X_b\|_2^2 - \|Y_a + Y_b\|_2^2$ minus the previous contribution of both clusters $\|X_a\|_2^2 + \|X_b\|_2^2 - \|Y_a\|_2^2 - \|Y_b\|_2^2$.
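The construction of the vertex vectors, their truncation, and the modularity increase of a merge can be sketched as follows. This is an illustration under assumptions, not the thesis code: the parameter names `spectral_ratio` and `spectral_max_ev` mirror the two parameters of the text, but the exact threshold rule (keep $\lambda_j \geq$ ratio $\cdot\, \lambda_1$), the function names, and the example graph with $\mathrm{Vol}(u,v) = \deg(u)\deg(v)$ are hypothetical.

```python
import numpy as np

def vertex_vectors(M, spectral_ratio=0.0, spectral_max_ev=None):
    """Return (X, Y); row v of X is x_v, row v of Y is y_v.

    With the defaults X uses all positive eigenvalues (exact x_v).
    Otherwise weak eigenvalues are dropped, giving the approximate x~_v.
    """
    lam, U = np.linalg.eigh(M)          # symmetric M, ascending order
    order = np.argsort(lam)[::-1]       # sort eigenvalues descending
    lam, U = lam[order], U[:, order]
    pos = lam > 0
    keep = pos & (lam >= spectral_ratio * lam[0])  # assumed threshold rule
    if spectral_max_ev is not None:
        keep[spectral_max_ev:] = False  # cap the number of eigenpairs
    X = U[:, keep] * np.sqrt(lam[keep])   # column j scaled by sqrt(lambda_j)
    Y = U[:, ~pos] * np.sqrt(-lam[~pos])  # analogous for lambda_j <= 0
    return X, Y

def cluster_vector(V, members):
    """Sum of the vertex vectors of all vertices in the cluster."""
    return V[list(members)].sum(axis=0)

def spectral_length(X, a, b):
    """Length of the summed cluster vectors ||X_a + X_b||_2."""
    return float(np.linalg.norm(cluster_vector(X, a) + cluster_vector(X, b)))

def modularity_increase(X, Y, a, b):
    """Contribution of the merged cluster minus both old contributions."""
    Xa, Xb = cluster_vector(X, a), cluster_vector(X, b)
    Ya, Yb = cluster_vector(Y, a), cluster_vector(Y, b)
    new = np.sum((Xa + Xb) ** 2) - np.sum((Ya + Yb) ** 2)
    old = np.sum(Xa**2) + np.sum(Xb**2) - np.sum(Ya**2) - np.sum(Yb**2)
    return float(new - old)

# Hypothetical example: two triangles joined by a single edge.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0
deg, total = A.sum(axis=1), A.sum()
M = A / total - np.outer(deg, deg) / total**2

X, Y = vertex_vectors(M)                                   # exact vectors
X_approx, _ = vertex_vectors(M, spectral_ratio=0.5, spectral_max_ev=2)

a, b = [0, 1], [2]
dq = modularity_increase(X, Y, a, b)
length = spectral_length(X_approx, a, b)
print(dq, length)
```

With the exact vectors, the computed increase agrees with the direct value $2\, s_a^T M s_b$, which is a convenient consistency check for such a sketch.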
Using just the positive eigenvalues results in the spectral length difference selector:

$$2 \tilde{X}_a^T \tilde{X}_b = \|\tilde{X}_a + \tilde{X}_b\|_2^2 - \left( \|\tilde{X}_a\|_2^2 + \|\tilde{X}_b\|_2^2 \right) \tag{3.13}$$

Because both selection qualities are related to the modularity increase selector, they also inherit its disadvantages in the growth behavior. For a cluster with a long vector, an adjacent long vector is ranked higher than all short vectors, even when it points in a nearly orthogonal direction. Since the vector length roughly correlates with the vertex degree, high-degree vertices will be grouped together early, leading to merge errors. Measuring the change in vector length instead reduces this domination but is nevertheless similar to the modularity increase selector. It is not directly obvious why dropping negative and weak eigenvalues should provide a better insight into the selection problem.

Spectral Angle  Instead of the misleading vector length, the direction of the vectors can be used. The spectral angle selector prefers pairs of adjacent clusters where the cluster vectors have the same direction, i.e. small angles. The angle is computed using the cosine:
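A small sketch contrasting the length difference of equation (3.13) with an angle-based ranking. It assumes the standard cosine $\tilde{X}_a^T \tilde{X}_b / (\|\tilde{X}_a\|_2 \|\tilde{X}_b\|_2)$, which may differ in detail from the selector defined in the thesis. The function names and the three example vectors are hypothetical.

```python
import numpy as np

def length_difference(Xa, Xb):
    """Equation (3.13): 2 * Xa^T Xb."""
    return 2.0 * float(np.dot(Xa, Xb))

def spectral_angle_cosine(Xa, Xb):
    """Cosine of the angle between two cluster vectors.

    Assumption: the standard cosine; zero vectors are given cosine 0.
    """
    denom = np.linalg.norm(Xa) * np.linalg.norm(Xb)
    return float(np.dot(Xa, Xb) / denom) if denom > 0 else 0.0

Xa = np.array([10.0, 0.0])  # long cluster vector
Xb = np.array([2.0, 10.0])  # long neighbour, nearly orthogonal to Xa
Xc = np.array([1.0, 0.1])   # short neighbour, nearly parallel to Xa

# Both sides of equation (3.13) for the pair (a, b).
lhs = length_difference(Xa, Xb)
rhs = float(np.sum((Xa + Xb) ** 2) - (np.sum(Xa**2) + np.sum(Xb**2)))

# The length difference prefers the long vector Xb,
# the cosine prefers the well-aligned vector Xc.
print(length_difference(Xa, Xb), length_difference(Xa, Xc))
print(spectral_angle_cosine(Xa, Xb), spectral_angle_cosine(Xa, Xc))
```

In this example the length difference ranks the pair (a, b) first (40 versus 20), while the cosine ranks (a, c) first, which illustrates the domination by long vectors described above.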
