12.07.2015 Views

Multilevel Graph Clustering with Density-Based Quality Measures


4.1 Methods and Data

of merge selectors. The parameters are grouped into a coarsening and a refinement phase depending on what they directly influence. Some merge selectors are further configurable; their parameters are placed in the component called selector. Note that not all parameters are meaningful in all combinations. For example, the match fraction is used only by greedy matching. The last column contains the default values used throughout the evaluation where nothing different is stated.

The parameters controlling spectral merge selectors and the search depth are fixed, and their influence will not be evaluated further, for the following reasons. The vertex vectors used in spectral merge selectors are always approximations. Using more eigenvectors should improve the accuracy, but requires more memory and time to compute them. Thus the maximal number of eigenvectors is limited to 30. In graphs with fewer than 30 clusters, fewer eigenvectors contain meaningful information as well. Assuming that the positive eigenvalues quickly decay to zero, only the few eigenvectors with eigenvalues larger than 20% of the largest eigenvalue are used. The search depth is only used by the Kernighan-Lin refinement. Previous experiments on refinement algorithms in Section 3.4.3 showed that it can be restricted to 20 log_10 |V| moves without missing better clusterings. For safety the factor 25 will be used.

4.1.2 Effectiveness

The main tool used in this evaluation is the comparison of mean modularities. For this purpose, clusterings produced by specific algorithm configurations on a set of graphs are collected. The mean modularity of a configuration over the graphs is calculated using the arithmetic mean.

Of course, graphs with high modularity influence the absolute value of the mean modularity most. But the modularity measure is normalized to the range from zero to one.
Thus all modularity values will be in the same range regardless of differing graph properties. In addition, the absolute values are unimportant here, as algorithms and configurations are compared with each other. The differences between mean modularities are influenced most by graphs whose modularity varies strongly between the algorithms.

In many places the results gained with and without refinement will also be compared. Without refinement it becomes visible how much the parameters influence the raw graph coarsening. Results with refinement, on the other hand, indicate how well coarsening with the observed parameters supports the refinement algorithms later. Kernighan-Lin refinement is expected to be more resistant against difficulties, as it can escape from local optima. Hence only sorted greedy refinement with the density-fitness vertex selector is used for these comparisons instead. This reference configuration will be abbreviated as sgrd in most places.

As a useful side effect, the modularity improvements with refinement over no refinement provide a significance scale. Fluctuations of the modularity much smaller than this scale are probably not of much interest. Similarly, improvements of different coarsening configurations that are easily compensated by greedy refinement are not important either.

For many questions of the evaluation, tables containing mean modularities will be produced. However, it is difficult to compare these raw values. Therefore, where possible, the mean modularity results will be plotted and visualized graphically.
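
The two fixed parameters from Section 4.1 can be illustrated with a minimal Python sketch. It is not the implementation evaluated here: the function names `select_eigenvalues` and `search_depth` are hypothetical, and rounding the search depth up to an integer is an assumption, since the text only gives the bound 25 log_10 |V|.

```python
import math


def select_eigenvalues(eigenvalues, max_count=30, rel_threshold=0.2):
    """Return the indices of the eigenvalues to keep, largest first.

    Keeps positive eigenvalues larger than rel_threshold times the
    largest eigenvalue, limited to max_count entries (30 and 20% are
    the values stated in the text).
    """
    positive = [(val, idx) for idx, val in enumerate(eigenvalues) if val > 0]
    if not positive:
        return []
    positive.sort(reverse=True)  # largest eigenvalue first
    largest = positive[0][0]
    kept = [idx for val, idx in positive if val > rel_threshold * largest]
    return kept[:max_count]


def search_depth(num_vertices, factor=25):
    """Move limit for Kernighan-Lin refinement: factor * log10(|V|).

    Rounding up is an assumption made for this sketch.
    """
    return math.ceil(factor * math.log10(num_vertices))
```

For example, `select_eigenvalues([10.0, 0.5, 3.0, -1.0, 1.0])` keeps only the entries at indices 0 and 2, because the remaining values are negative or do not exceed 20% of the largest eigenvalue.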
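
The comparison of mean modularities and the significance scale from Section 4.1.2 can be sketched in the same way. The function names and the `min_ratio` cut-off are assumptions made for illustration; the text only says that fluctuations "much smaller" than the scale are of little interest and does not fix a threshold.

```python
from statistics import mean


def mean_modularity(modularities):
    """Arithmetic mean of the modularities of one configuration."""
    return mean(modularities)


def significance_scale(without_refinement, with_refinement):
    """Mean modularity gained by refinement over no refinement.

    Both arguments hold one modularity value per graph for the same
    coarsening configuration.
    """
    return mean_modularity(with_refinement) - mean_modularity(without_refinement)


def is_noteworthy(difference, scale, min_ratio=0.1):
    """Flag differences that are not much smaller than the scale.

    min_ratio is a hypothetical cut-off chosen for this sketch.
    """
    return abs(difference) >= min_ratio * scale
```

With hypothetical per-graph modularities `[0.40, 0.60, 0.80]` without refinement and `[0.45, 0.65, 0.85]` with sorted greedy refinement, the scale is about 0.05. Under the assumed cut-off, a difference of 0.02 between two coarsening configurations would count as noteworthy, while a difference of 0.001 would not.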
