12.07.2015 Views

Multilevel Graph Clustering with Density-Based Quality Measures


3.4 Cluster Refinement

[Figure 3.13: Kernighan-Lin Refinement Creating Clusters on Graph Epa main. Panels: (a) Modularity and Cluster Count; (b) Dynamics of Kernighan-Lin Refinement. Axis labels in the original plots: Modularity, Clusters, Move; peak clusterings are marked.]

modularity and the number of clusters at the end of inner loops. In addition, the position of peak clusterings is shown by blue circles. Moreover, Figure 3.13b shows the beginning of multi-level Kernighan-Lin refinement in the same graph. The black line plots the modularity and the red line the number of clusters. Again, blue circles mark cluster counts at peak clusterings. Both graphics underline that all peak clusterings are found with a small number of clusters, while the algorithm tends to create a huge number of clusters.

This behavior can be explained as follows: After all vertices fitting well into another cluster have been moved, the algorithm continues with vertices that are very well connected to their current cluster. Often, moving such a vertex into a new cluster decreases the modularity less than moving it into another existing cluster. Thus a cluster is created for that vertex.

The excessive creation of clusters brings two big problems. Firstly, the search for better clusterings is quickly directed to low-modularity clusterings and explores only a few clusterings with cluster counts near the optimum. This wastes time in uninteresting parts of the clustering space. More importantly, the time complexity of the whole algorithm is coupled to the number of clusters. Thus high cluster counts significantly impair the performance.

Effective Search Depth. The moves performed by the algorithm can be interpreted as a depth-first search into the space of possible clusterings, with the vertex and target ranking determining the direction. The complete execution of the inner loop is very time consuming. However, already after a small number of moves the search is far away from optimal clusterings and cannot return because of the marked vertices.

In order to safely abort the inner loop earlier, the number of moves between peak clusterings is analyzed. For this purpose let the effective search depth be the max-
