12.07.2015 Views

Multilevel Graph Clustering with Density-Based Quality Measures


3 The Multi-Level Refinement Algorithm

[Figure 3.14: Effective Search Depth of Kernighan-Lin Refinement. (a) Detail of the dynamics on graph Epa main; axes: Move vs. Modularity and Clusters. (b) Observed effective search depth; axes: Vertices (logarithmic) vs. Observed Search Depth.]

...imal number of moves from a peak clustering to a clustering with equal or better modularity that may occur in real-world graphs. Of course, this depth depends on the number of vertices.

A typical situation with about 1500 vertices is displayed in Figure 3.14a. The modularity of the clusterings is shown by the black line. Blue vertical lines mark the moves where peak clusterings were found. At the beginning of the inner loop all 12 moves that directly increase modularity were executed. From this first peak clustering a series of 62 modularity-decreasing and modularity-increasing moves was necessary to find better clusterings and the second peak. In this example this represents an effective search depth of around 60 moves.

In order to find a simple lower bound for the effective search depth, the basic algorithm was applied with multi-level refinement to a selection of 17 real-world graphs. At each peak clustering the observed search depth was recorded together with the number of vertices at the current coarsening level. The scatter plot in Figure 3.14b visualizes the dependency between vertex count and search depth. A logarithmic scale is used for the vertex count. At the extreme values the depth grows nearly linearly with the logarithm of the vertex count. A lower bound is therefore obtained by taking the maximal quotient of observed search depth and $\log_{10} |V|$. This yielded a factor of about 20. Thus it is moderately safe to abort the inner loop after around $20 \log_{10} |V|$ moves with a modularity below that of the last peak clustering. This bound is also shown by the dashed line.

Based on these observations the basic algorithm is improved by restricting the search depth. The parameter search depth controls the accepted search depth. The inner loop is terminated after $\log_{10} |V|$ times search depth moves with a modularity lower than at the last peak clustering or at the beginning of the inner loop. Factors around 20 should be used based on the measurements. Due to this early termination it is unnecessary to also restrict the creation of clusters.
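The termination rule and the estimate of the factor can be illustrated with a small sketch. This is not the thesis implementation: the function names (`move_limit`, `replay_inner_loop`, `estimate_factor`) are hypothetical, the inner loop is reduced to replaying a given sequence of modularity values (one per move), and the handling of moves with modularity exactly equal to the last peak (they neither count toward the limit nor reset it) is an assumption made here.

```python
import math


def move_limit(num_vertices, search_depth=20.0):
    """Allowed number of moves below the last peak: search_depth * log10(|V|)."""
    return search_depth * math.log10(num_vertices)


def replay_inner_loop(start_modularity, move_modularities, num_vertices,
                      search_depth=20.0):
    """Replay modularity values (one per move) with early termination.

    Returns (best_move, executed): the move index of the last peak
    (0 = state at the beginning of the inner loop) and the number of
    moves executed before the loop stopped.
    """
    limit = move_limit(num_vertices, search_depth)
    best = start_modularity
    best_move = 0
    below = 0      # moves with modularity below the last peak
    executed = 0
    for q in move_modularities:
        executed += 1
        if q > best:
            # New peak clustering: remember it and reset the counter.
            best, best_move, below = q, executed, 0
        elif q < best:
            below += 1
            if below >= limit:
                break  # search depth exhausted, abort the inner loop
        # Assumption: q == best neither counts nor resets.
    return best_move, executed


def estimate_factor(observations):
    """Maximal quotient depth / log10(|V|) over (vertex_count, depth) pairs."""
    return max(depth / math.log10(n) for n, depth in observations if n > 1)
```

With 1500 vertices and a factor of 20, `move_limit(1500)` gives about 63.5 moves. That lies just above the 62 moves between the first and second peak in the example of Figure 3.14a, so the second peak would still be reached.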
