12.07.2015 Views

Multilevel Graph Clustering with Density-Based Quality Measures


2 Graph Clustering

This chapter introduces graph clustering with a focus on Newman’s modularity as the optimization aim. It provides the foundations needed to develop and compare clustering algorithms. For this purpose, a mathematical analysis of the modularity and related models is as necessary as the study of existing algorithms.

The chapter is organized as follows. First, graphs and clusterings are formally defined together with helpful mathematical notation. The second section introduces the modularity measure in the form used throughout this work. To substantiate its relationship to other quality measures, the third section presents an axiomatic derivation from the density concept. The chapter concludes with a review and summary of fundamental clustering strategies.

2.1 Basic Definitions

This section formally introduces graphs, weights, and clusterings. The first subsection discusses different types of graphs. Undirected graphs are used in the derivation of quality measures, while symmetric, directed graphs are used as the internal representation in the implementation. The second subsection, on attributed graphs, defines vertex and edge properties and basic computations on edge weights. The final subsection formally introduces clusterings and quality measures.

2.1.1 Graphs

An undirected graph G = (V, E) is a set of vertices V combined with a set of edges E. Two vertices u, v ∈ V are said to be adjacent if they are connected by an edge e ∈ E. In that case, u and v are called end-vertices of the edge e, and the edge is incident to both vertices. Each edge connects at most two different vertices and is called a self-edge if it connects a single vertex to itself.

Let E(X, Y) ⊆ E be the subset of edges connecting vertices in X ⊆ V with vertices in Y ⊆ V in an undirected graph.
The edges incident to vertex v are E(v) := E({v}, V), and two vertices u, v are adjacent if E(u, v) is non-empty. Conversely, let V(e) ⊆ V be the set of end-vertices incident to edge e ∈ E. For example, the neighborhood of a vertex is the set of all adjacent vertices including itself and is expressed by N(v) := V(E(v)) ∪ {v}. The subgraph induced by vertices X ⊆ V is G[X] := (X, E(X, X)). A graph is bipartite if the vertices can be dissected into two disjoint sets A, B and all edges lie between both sets, i.e. E(A, B) = E.

In directed graphs all edges are oriented and connect a start-vertex with an end-vertex. A directed graph is symmetric if for each edge an edge in the opposing direction also exists. In that case the graph topology is equal to that of an undirected graph. In symmetric, directed graphs for each vertex an enumeration of its incident edges is
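
The set notation of this subsection maps fairly directly onto code. The following is a minimal Python sketch, assuming a hypothetical toy graph in which each undirected edge is stored as a frozenset of its end-vertices (so a self-edge has a single element). All function names are illustrative and are not taken from the implementation described in this work.

```python
from itertools import chain

# Hypothetical toy graph (an assumption for illustration only):
# a path 1-2-3-4 plus a self-edge at vertex 4.
V = {1, 2, 3, 4}
E = {frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4}), frozenset({4})}


def end_pair(e):
    """Return the two end-vertices of edge e (identical for a self-edge)."""
    ends = tuple(e)
    return ends[0], ends[-1]


def edges_between(X, Y, edges=E):
    """E(X, Y): edges connecting a vertex in X with a vertex in Y."""
    result = set()
    for e in edges:
        u, v = end_pair(e)
        if (u in X and v in Y) or (v in X and u in Y):
            result.add(e)
    return result


def incident_edges(v, vertices=V, edges=E):
    """E(v) := E({v}, V)."""
    return edges_between({v}, vertices, edges)


def end_vertices(edge_set):
    """V(e), applied to a whole set of edges as in V(E(v))."""
    return set(chain.from_iterable(edge_set))


def neighborhood(v, vertices=V, edges=E):
    """N(v) := V(E(v)) ∪ {v}."""
    return end_vertices(incident_edges(v, vertices, edges)) | {v}


def induced_subgraph(X, edges=E):
    """G[X] := (X, E(X, X))."""
    return set(X), edges_between(X, X, edges)


def is_bipartition(A, B, vertices=V, edges=E):
    """True if A and B are disjoint, cover V, and E(A, B) = E."""
    return (A.isdisjoint(B) and A | B == set(vertices)
            and edges_between(A, B, edges) == set(edges))


def to_symmetric_arcs(edges=E):
    """Replace each undirected edge by a pair of opposing directed arcs."""
    arcs = set()
    for e in edges:
        u, v = end_pair(e)
        arcs.add((u, v))
        arcs.add((v, u))
    return arcs


def is_symmetric(arcs):
    """A directed graph is symmetric if every arc has an opposing arc."""
    return all((v, u) in arcs for (u, v) in arcs)
```

In this sketch the self-edge at vertex 4 prevents the split {1, 3}, {2, 4} from being a bipartition, because that edge does not lie between the two sets; without the self-edge the same split satisfies E(A, B) = E.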
