12.07.2015 Views

Multilevel Graph Clustering with Density-Based Quality Measures


3.4 Cluster Refinement

[Figure 3.9: Dependencies when Moving a Vertex. Moving v from C(v) to j. Current cluster: f(v, C[v] − v). Target cluster: f(v, C_j − v). Other clusters: no change.]

... computing the maxmod ranking takes time linear in the number of incident edges and the number of clusters. On average this is O(|E|/|V| + |C|).

The change in modularity depends, through the volume term, on the global clustering structure and changes with nearly every vertex move. The volume is computed according to the volume model, with Vol(v, C_j − v) = (w(v) − 1) w(C_j). The latter value, w(C_j), is the sum of all vertex weights in the cluster. To evaluate the volume in constant time, this value is stored per cluster and updated incrementally each time a vertex moves into or out of the cluster. This global dependency is the key point that makes modularity refinement more expensive than standard min-cut refinement.

Instead of ranking vertices by the modularity increase, which is expensive to compute, good candidate vertices may also be identified by their current contribution to the modularity, f(v, C[v]) − ρ(V) Vol(v, C[v]). Vertices with a low or negative contribution have a good chance of not belonging to their current cluster. Moving a vertex does not modify the contribution of its self-edge. Omitting self-edges therefore leaves mod(v) = f(v, C[v] − v) − ρ(V) Vol(v, C[v] − v). This value can be computed in constant time by storing f(v, C[v] − v) for each vertex and updating it incrementally whenever an adjacent vertex is moved.

Based on the contribution to modularity, several vertex rankings were derived. It can be assumed that the placement of high-degree vertices has a higher influence on the modularity. To suppress this effect, the modularity contribution can be divided by the vertex degree. Alternatively, to stay consistent with the density model, it can be scaled by the inter-volume between the vertex and its cluster.
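The incremental bookkeeping described above (a stored weight sum per cluster and a stored f(v, C[v] − v) per vertex) might be sketched as follows. This is a minimal illustration, not the implementation used in this work. The class name `RefinementState`, the helper `two_triangles`, and the example graph are hypothetical. The sketch also assumes a product-form volume Vol(v, C − v) = w(v) · (w(C) − w(v)) with w(v) taken as the weighted degree and ρ(V) = 1/w(V); this is a simplification that may differ from the volume model used in the text.

```python
from collections import defaultdict


class RefinementState:
    """Bookkeeping for constant-time evaluation of mod(v) (illustrative only)."""

    def __init__(self, adj, cluster_of):
        # adj[v] maps each neighbour u != v to the edge weight f(v, u);
        # self-edges are omitted, as in the text.
        self.adj = adj
        self.cluster_of = dict(cluster_of)
        # Assumption: the vertex weight w(v) is the weighted degree.
        self.w = {v: sum(nbrs.values()) for v, nbrs in adj.items()}
        # Assumption: rho(V) = f(V, V) / w(V)^2, which equals 1 / w(V) here.
        self.rho = 1.0 / sum(self.w.values())
        # w(C): sum of vertex weights, stored per cluster.
        self.cluster_w = defaultdict(float)
        # f(v, C[v] - v): edge weight from v into its own cluster, per vertex.
        self.intra = {}
        for v, c in self.cluster_of.items():
            self.cluster_w[c] += self.w[v]
            self.intra[v] = sum(wt for u, wt in adj[v].items()
                                if self.cluster_of[u] == c)

    def move(self, v, target):
        """Move v to cluster `target`, touching only v and its neighbours."""
        source = self.cluster_of[v]
        if source == target:
            return
        self.cluster_w[source] -= self.w[v]
        self.cluster_w[target] += self.w[v]
        new_intra = 0.0
        for u, wt in self.adj[v].items():
            if self.cluster_of[u] == source:
                self.intra[u] -= wt   # u no longer shares a cluster with v
            elif self.cluster_of[u] == target:
                self.intra[u] += wt   # u now shares a cluster with v
                new_intra += wt
        self.intra[v] = new_intra
        self.cluster_of[v] = target

    def mod(self, v):
        """mod(v) = f(v, C[v] - v) - rho(V) * Vol(v, C[v] - v), in O(1)."""
        rest = self.cluster_w[self.cluster_of[v]] - self.w[v]
        return self.intra[v] - self.rho * self.w[v] * rest


def two_triangles():
    """Hypothetical example: two unit-weight triangles joined by edge 2-3."""
    edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
    adj = defaultdict(dict)
    for a, b in edges:
        adj[a][b] = 1.0
        adj[b][a] = 1.0
    return dict(adj)


state = RefinementState(two_triangles(), {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1})
before = state.mod(2)   # contribution of vertex 2 in its original cluster
state.move(2, 1)        # misplace vertex 2 into the other triangle
after = state.mod(2)    # contribution after the move
```

In this sketch a move costs time proportional to the degree of the moved vertex, while each later call to `mod` needs only the stored values.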
Since the vertex rankings derived from the modularity contribution are structurally and semantically similar to the vertex fitness used in extremal optimization [23], they are called fitness measures. Unlike with other vertex rankings, lower values are preferred here. In summary, these are:

mod-fitness: The modularity contribution, mod(v) = f(v, C[v] − v) − ρ(V) Vol(v, C[v] − v).

eo-fitness: The vertex fitness used in extremal optimization, eof(v) = mod(v) / deg(v).

density-fitness: The density of connections to the current cluster, ρ(v, C[v] − v) = f(v, C[v] − v) / Vol(v, C[v] − v). This is ranking-equivalent to mod(v) / [deg(v) deg(C[v] − v)] in the degree multiplicity volume model.
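The three fitness measures listed above can be compared on a small example. The following sketch is illustrative only: the table `STATS`, the constant `RHO`, and all function names are hypothetical, and the volume is assumed to follow the degree multiplicity model, Vol(v, C[v] − v) = deg(v) · deg(C[v] − v), with ρ(V) = 1/deg(V). Each vertex is assumed to have a positive degree and at least one other vertex in its cluster, so no division by zero occurs.

```python
# Hypothetical per-vertex inputs: (f(v, C[v] - v), deg(v), deg(C[v] - v)).
# They describe two unit-weight triangles {0, 1, 2} and {3, 4, 5} joined by
# the edge 2-3, with vertex 2 misplaced into the cluster of 3, 4 and 5.
STATS = {
    0: (1.0, 2.0, 2.0),
    1: (1.0, 2.0, 2.0),
    2: (1.0, 3.0, 7.0),
    3: (3.0, 3.0, 7.0),
    4: (2.0, 2.0, 8.0),
    5: (2.0, 2.0, 8.0),
}
RHO = 1.0 / 14.0  # assumed rho(V) = 1 / deg(V); the total degree is 14


def volume(v):
    # Assumption: degree multiplicity model.
    _, deg_v, deg_rest = STATS[v]
    return deg_v * deg_rest


def mod_fitness(v):
    # mod(v) = f(v, C[v] - v) - rho(V) * Vol(v, C[v] - v)
    return STATS[v][0] - RHO * volume(v)


def eo_fitness(v):
    # eof(v) = mod(v) / deg(v)
    return mod_fitness(v) / STATS[v][1]


def density_fitness(v):
    # rho(v, C[v] - v) = f(v, C[v] - v) / Vol(v, C[v] - v)
    return STATS[v][0] / volume(v)


def rank(measure):
    """Vertices sorted by ascending fitness: lower values are preferred."""
    return sorted(STATS, key=measure)


candidates = {
    "mod-fitness": rank(mod_fitness),
    "eo-fitness": rank(eo_fitness),
    "density-fitness": rank(density_fitness),
}
```

Under these assumptions, dividing mod(v) by the volume gives the density minus the constant ρ(V), which is why the density ranking and the ranking by mod(v) / [deg(v) deg(C[v] − v)] coincide.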
