12.07.2015 Views

Multilevel Graph Clustering with Density-Based Quality Measures


3 The Multi-Level Refinement Algorithm

[Figure 3.5: Merging two Vertices and their Edges. Labels in the figure: merge edges, move edge, new self-edge.]

selector informs the heap about the new neighbor and selection quality, and the vertex is moved up and down through the heap into its correct position. Empirical tests suggested that the runtime is already acceptable without the heap. Thus it is disabled by default to reduce possible sources of error and regression.

A temporary coarse graph is maintained to store edge weights and adjacency information for the merge selector. In this graph each cluster is represented by a vertex. Merging two clusters equals merging their vertices and incident edges. This is known as edge contraction and is the most expensive operation of each merge step, because the incident edges of both vertices have to be processed. Call the two vertices involved the removed vertex and the target vertex. For example, in Figure 3.5 the vertex b is to be removed and vertex a is the target. Edges to a common neighbor are merged by simply dropping the superfluous edge of the removed vertex. Edges to a vertex not adjacent to the target vertex have to be moved to it.

The merge data structures proposed by Wakita and Tsurumi [76] were implemented. The outgoing edges of each vertex are stored in a doubly linked list sorted by the end-vertex index. Both lists can then be merged in one pass without searching for the matching out-edges of the target vertex. Some special cases arise, however. Updating self-edges is difficult because their position is not predictable. Thus they are always stored as the first out-edge. Unfortunately, the original paper leaves open how to correct the position of moved edges in the lists of their end-vertices. These lists should be sorted by the end-vertex index, but moving an edge changes the index from the removed vertex to the target vertex. The simplest fix is to reinsert such edges into the list, which costs linear time in the size of the list.
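The one-pass merge of two sorted edge lists can be illustrated with a minimal sketch. It is not the implementation described here: it assumes Python lists of (end_vertex, weight) tuples instead of doubly linked lists, a separate self_w table instead of a self-edge pinned to the first list position, and an undirected graph stored symmetrically, where an edge between the two merged vertices is added once to the new self-edge weight. The names merge_vertices, out and self_w are hypothetical.

```python
import bisect


def _index_of(edges, end):
    """Linear search for the entry pointing to `end` (linear in the list size)."""
    for k, (e, _) in enumerate(edges):
        if e == end:
            return k
    raise KeyError(end)


def merge_vertices(out, self_w, a, b):
    """Contract the removed vertex b into the target vertex a.

    out[v]    : list of (end_vertex, weight), sorted by end_vertex, no self-edges
    self_w[v] : weight of the self-edge of v (assumed convention, see lead-in)
    """
    ea, eb = out[a], out[b]
    merged = []
    new_self = self_w[a] + self_w[b]
    i = j = 0
    while i < len(ea) or j < len(eb):
        if j == len(eb) or (i < len(ea) and ea[i][0] < eb[j][0]):
            # Edge that only the target vertex has.
            end, w = ea[i]
            i += 1
            if end == b:
                new_self += w          # edge a-b becomes a self-edge of a
            else:
                merged.append((end, w))
        elif i == len(ea) or eb[j][0] < ea[i][0]:
            # Edge that only the removed vertex has: move it to the target.
            end, w = eb[j]
            j += 1
            if end == a:
                continue               # mirror of edge a-b, already counted
            merged.append((end, w))
            nbr = out[end]
            del nbr[_index_of(nbr, b)]
            bisect.insort(nbr, (a, w))  # reinsert under the new end index
        else:
            # Common neighbor: sum the weights and drop the superfluous edge.
            end = ea[i][0]
            w = ea[i][1] + eb[j][1]
            i += 1
            j += 1
            merged.append((end, w))
            nbr = out[end]
            del nbr[_index_of(nbr, b)]
            nbr[_index_of(nbr, a)] = (a, w)
    out[a] = merged
    self_w[a] = new_self
    del out[b]
    del self_w[b]
```

In this sketch the two lists of a and b are consumed in a single pass, while every moved edge triggers a linear search and reinsertion in the list of its end-vertex.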
Therefore the worst-case runtime O(|V|^2) of this coarsening method is reached when all |V| edges of the removed vertex have to be moved to the target vertex without merging any edges.

The overall worst-case time complexity of a complete merge step is O(|V|^2): Selecting the best pair costs O(|V|) with a linear search or O(log |V|) using the heap. The two clusters are merged in constant time using union-find. Combining the edge lists requires quadratic time O(|V|^2) in the worst case. Updating the best-neighbor information also requires a linear search at each adjacent vertex in the worst case, which adds quadratic time O(|V|^2). In addition, updating the best-pair heap may cost up to O(|V| log |V|). However, in sparse graphs the number of incident edges is often much smaller than the total number of vertices. And in dense graphs
