12.07.2015 Views

Multilevel Graph Clustering with Density-Based Quality Measures


2 Graph Clustering

the connection density is the proportion of actual edges (mass) to "possible" edges (volume). The actual edges are measured by the sum of their weights $f(u, v)$, and the "possible" edge weight is given by the reference graph $(V, \mathrm{Vol}(u, v))$, called the volume model from here on. Hence the connection density is defined as:

\[ \rho(A, B) := \frac{f(A, B)}{\mathrm{Vol}(A, B)} \tag{2.10} \]

\[ \rho(A) := \frac{f(A)}{\mathrm{Vol}(A)} \tag{2.11} \]

As only undirected graphs are considered, it is reasonable to require the volume to be symmetric, $\mathrm{Vol}(u, v) = \mathrm{Vol}(v, u)$, and non-negative, $\mathrm{Vol}(u, v) \geq 0$. Because edges can also be expected between vertex pairs that are not connected by a real edge, the volume weights are defined on all vertex pairs. The volume has to be non-zero wherever the edge weight is above zero.

2.3.2 Bias and the Null Model

Given a graph $(V, f)$, a uniform density graph $(V, f_0)$ is derived by evenly distributing the global density over all vertex pairs, using $\rho(V) = f_0(u, v) / \mathrm{Vol}(u, v)$. The uniform density weights $f_0(u, v) = \rho(V)\,\mathrm{Vol}(u, v)$ are called the null model. The scaling by $\rho(V)$ transforms the volume into average edge weights.

A quality measure is called biased when it does not assign the same quality to all clusterings of the graph $(V, f_0)$. This bias is unfavorable because it signals structure where no structure is expected. Conversely, a quality measure is unbiased if applying it to the null model yields a constant quality.

The bias can be removed from a quality measure by either dividing by its null model quality (scaling) or subtracting the null model quality (shifting) [28, 62]. Let $Q_{f_0}(C)$ be the quality of clustering $C$ in the null model. Then unbiased measures are derived as in the equations below.
\[ Q_{\mathrm{scaled}}(C) := Q(C) / Q_{f_0}(C) \tag{2.12} \]

\[ Q_{\mathrm{shifted}}(C) := Q(C) - Q_{f_0}(C) \tag{2.13} \]

In uniform density graphs the scaled measure has the same quality 1.0 for all clusterings, and shifting yields constant zero quality.

2.3.3 Derived Quality Measures

Reformulating the first clustering aim in terms of density, the goal is to find vertex partitions with low connection density between clusters and high inner density. This aim may be approached directly by maximizing the difference or the ratio between intra-cluster density and inter-cluster density:

\[ Q_{\mathrm{diff}}(C) := \frac{\sum_{i \in C} f(C_i)}{\sum_{i \in C} \mathrm{Vol}(C_i)} - \frac{\sum_{i \neq j \in C} f(C_i, C_j)}{\sum_{i \neq j \in C} \mathrm{Vol}(C_i, C_j)} \tag{2.14} \]

\[ Q_{\mathrm{ratio}}(C) := \frac{\sum_{i \in C} f(C_i)}{\sum_{i \in C} \mathrm{Vol}(C_i)} \left( \frac{\sum_{i \neq j \in C} f(C_i, C_j)}{\sum_{i \neq j \in C} \mathrm{Vol}(C_i, C_j)} \right)^{-1} \tag{2.15} \]
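The definitions in (2.10) to (2.15) can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the author's implementation. The function names, the dense list-of-lists representation, and the toy graph (two triangles joined by one edge) are hypothetical. It is further assumed that $f(A, B)$ and $\mathrm{Vol}(A, B)$ sum over all ordered pairs $(u, v)$ with $u \in A$, $v \in B$, and that the volume model is simply $\mathrm{Vol}(u, v) = 1$ for $u \neq v$ and 0 on the diagonal; the text above does not fix either choice.

```python
from itertools import product

def weight_between(w, A, B):
    """Sum w[u][v] over all ordered pairs (u, v), u in A, v in B (assumed convention)."""
    return sum(w[u][v] for u, v in product(A, B))

def density(f, vol, A, B=None):
    """Connection density rho(A, B) = f(A, B) / Vol(A, B); rho(A) uses B = A."""
    B = A if B is None else B
    return weight_between(f, A, B) / weight_between(vol, A, B)

def null_model(f, vol):
    """Null model weights f0(u, v) = rho(V) * Vol(u, v)."""
    V = range(len(f))
    rho_V = density(f, vol, V)
    return [[rho_V * vol[u][v] for v in V] for u in V]

def intra_inter_density(f, vol, clusters):
    """Return (intra-cluster density, inter-cluster density) of a clustering."""
    intra_f = sum(weight_between(f, C, C) for C in clusters)
    intra_vol = sum(weight_between(vol, C, C) for C in clusters)
    pairs = [(A, B) for i, A in enumerate(clusters)
             for j, B in enumerate(clusters) if i != j]
    inter_f = sum(weight_between(f, A, B) for A, B in pairs)
    inter_vol = sum(weight_between(vol, A, B) for A, B in pairs)
    return intra_f / intra_vol, inter_f / inter_vol

def q_diff(f, vol, clusters):
    """Difference of intra- and inter-cluster density, as in (2.14)."""
    intra, inter = intra_inter_density(f, vol, clusters)
    return intra - inter

def q_ratio(f, vol, clusters):
    """Ratio of intra- to inter-cluster density, as in (2.15)."""
    intra, inter = intra_inter_density(f, vol, clusters)
    return intra / inter

# Hypothetical toy graph: two triangles {0,1,2} and {3,4,5} joined by edge (2,3).
n = 6
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
f = [[0.0] * n for _ in range(n)]
for u, v in edges:
    f[u][v] = f[v][u] = 1.0  # undirected graph, symmetric weights

# Assumed volume model: every pair of distinct vertices has volume 1.
vol = [[0.0 if u == v else 1.0 for v in range(n)] for u in range(n)]

good = [[0, 1, 2], [3, 4, 5]]   # follows the two triangles
bad = [[0, 3], [1, 2, 4, 5]]    # cuts through both triangles
f0 = null_model(f, vol)

print("Q_diff  good/bad:", q_diff(f, vol, good), q_diff(f, vol, bad))
print("Q_ratio good/bad:", q_ratio(f, vol, good), q_ratio(f, vol, bad))
print("Q_diff on null model:", q_diff(f0, vol, good), q_diff(f0, vol, bad))
```

On the null model both the intra-cluster and the inter-cluster density reduce to $\rho(V)$, so under these assumptions the sketch gives a difference of zero and a ratio of one for either clustering, up to floating-point rounding. A clustering in which every cluster is a single vertex would have zero intra-cluster volume under this volume model and is not handled.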
