Multilevel Graph Clustering with Density-Based Quality Measures

5 Results and Future Work

5.3.2 Study of Merge Selectors

During the course of this work a measure for the prediction quality of merge selectors was developed. It counts the percentage of the vertex pairs with high selection quality that actually connect two vertices of the same final cluster. For this purpose a reference clustering is necessary. A sketch of this measure is given at the end of this section.

However, it turned out that this prediction quality alone does not say much about the general performance of a merge selector. In practice it is also important to study when and where coarsening errors, i.e. merging vertices of different clusters, occur. For example, the spectral angle merge selector might perform well on big, fine-grained graphs but fail on the coarser coarsening levels. On the fine levels all structural information is stretched non-locally, while the coarsening aggregates this information and may produce good local information later on. In that case the weight density selector should be used on the coarser levels instead.

5.3.3 Linear Programming

It is possible to formulate the clustering problem as a linear or quadratic program [13, 2]. Instead of classic rounding techniques, the computed distances could be used as merge selection quality. This would enable multi-level refinement of the rounding results.

The fractional linear program is solvable in polynomial time but requires |V|² space for the distance matrix and a cubic number of transitivity constraints. It therefore becomes impractical already for medium-sized graphs unless the constraints are replaced by a more compact implicit representation. However, for the presented multi-level refinement, approximate distances between adjacent vertices would suffice. Perhaps such approximations could be computed faster using other representations of the optimization objective. For example, in [6] an embedding into higher-dimensional unit spheres under the squared Euclidean norm was used for similar quality measures.

5.3.4 Multi-Pass Clustering and Randomization

A meta-strategy similar to evolutionary search [72] is multi-pass clustering. Because the refinement corrects coarsening errors, the computed clustering contains valuable information. In a second pass this information can be fed back into the coarsening by ignoring all vertex pairs that cross previous cluster boundaries. This effectively produces a corrected coarsening hierarchy and allows further improvements by the refinement heuristics. Coarsening and refinement are repeated until the clustering does not improve any further. The multi-pass search may be widened by applying some kind of randomization during the coarsening.

5.3.5 High-Level Refinement Search

In the domain of cluster refinement several improvements might be possible. For example, restarting the Kernighan-Lin search on intermediate clusterings allows the search depth to be increased. In this context a slight randomization would help to break out of local optima.
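The following is a minimal sketch of the prediction-quality measure from Section 5.3.2, assuming clusterings are given as vertex-to-cluster-id maps and that "high selection quality" means the top fraction of scored pairs; the function name and the top_fraction cutoff are illustrative assumptions, not the thesis implementation:

    def prediction_quality(scored_pairs, reference, top_fraction=0.1):
        # scored_pairs: list of (u, v, quality) produced by a merge selector
        # reference:    dict mapping each vertex to its reference cluster id
        # top_fraction: share of pairs treated as "high selection quality"
        #               (an assumption; the text does not fix this cutoff)
        ranked = sorted(scored_pairs, key=lambda p: p[2], reverse=True)
        top = ranked[:max(1, int(len(ranked) * top_fraction))]
        hits = sum(1 for u, v, _ in top if reference[u] == reference[v])
        return hits / len(top)

    # Example: the single top-scored pair (a, b) lies inside reference cluster 0.
    pairs = [("a", "b", 0.9), ("b", "c", 0.2), ("c", "d", 0.8)]
    ref = {"a": 0, "b": 0, "c": 1, "d": 1}
    print(prediction_quality(pairs, ref, top_fraction=0.5))  # prints 1.0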

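To make the size argument in Section 5.3.3 concrete, a typical fractional relaxation in the spirit of [13, 2] can be sketched as follows; the exact objective in those references may differ, so this is only an illustrative instance with one pseudo-distance variable x_{uv} per vertex pair (0 meaning same cluster, 1 meaning different clusters):

    \begin{align*}
      \min \quad & \sum_{\{u,v\} \in E} w(u,v)\, x_{uv} \; + \sum_{\{u,v\} \notin E} \bigl(1 - x_{uv}\bigr) \\
      \text{s.t.} \quad & x_{uw} \le x_{uv} + x_{vw} \quad \text{for all } u, v, w \in V \\
      & 0 \le x_{uv} \le 1 \quad \text{for all } u, v \in V
    \end{align*}

The |V|² variables and the Θ(|V|³) transitivity (triangle inequality) constraints are exactly what makes the program impractical for medium-sized graphs; at the same time, the fractional values 1 − x_{uv} could serve directly as merge selection qualities after solving.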
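The multi-pass strategy of Section 5.3.4 can be summarized as a high-level loop; coarsen and refine stand in for the actual coarsening and refinement components, and the interfaces shown here are assumptions made for illustration:

    def multi_pass_clustering(graph, coarsen, refine, max_passes=10):
        forbidden = set()   # vertex pairs crossing earlier cluster boundaries
        best_clustering, best_quality = None, float("-inf")
        for _ in range(max_passes):
            # coarsening ignores all merges across previous boundaries,
            # which yields a corrected coarsening hierarchy
            hierarchy = coarsen(graph, forbidden_pairs=forbidden)
            clustering, quality = refine(hierarchy)
            if quality <= best_quality:
                break       # the clustering does not improve any further
            best_clustering, best_quality = clustering, quality
            forbidden = {(u, v) for (u, v) in graph.edges()
                         if clustering[u] != clustering[v]}
        return best_clustering

Randomizing tie-breaking inside coarsen would widen this search, as suggested above.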
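Finally, the restart idea from Section 5.3.5 could take roughly the following shape; kl_refine is a placeholder for a Kernighan-Lin style refinement, and its rng parameter is a hypothetical hook for the slight randomization mentioned above:

    import random

    def restarted_refinement(clustering, kl_refine, quality, restarts=5, seed=0):
        rng = random.Random(seed)
        best = clustering
        for _ in range(restarts):
            # restarting from the intermediate result effectively deepens
            # the search beyond a single Kernighan-Lin pass; the random
            # source perturbs move ordering so that restarts can diverge
            candidate = kl_refine(best, rng=rng)
            if quality(candidate) > quality(best):
                best = candidate
        return best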