12.07.2015 Views

Multilevel Graph Clustering with Density-Based Quality Measures


1 Introduction

Since the rise of computers and related information technologies, it has become easy to collect vast amounts of data. Combined with technological advances in linking persons, discussions, and documents, a wide variety of relational information is readily available for analysis. These networks can provide insight into the functioning of whole societies. Network analysis is therefore now an important tool outside scientific communities as well, for example in market analysis and politics.

The exploration and utilization of collected data requires elaborate analysis strategies supported by computational methods. A common abstraction for relational networks is the graph. In a graph, entities such as persons are represented by vertices, and the observed relationships between entities are described by edges connecting the related vertices. In addition, edges may carry weights that describe the strength of the connections.

Graph clustering is the problem and process of grouping vertices into disjoint sets called clusters. Discovering such clusters makes it possible to analyze the properties shared within groups and the differences between them. Many different clusterings are possible, so a quality measure is necessary to find the practically useful ones. This measure guides optimization algorithms toward interesting clusterings. A few years ago Mark Newman introduced a specific quality measure called modularity [60]. It was developed for the analysis of social networks but has also proved useful in other scientific areas.

1.1 An Introductory Example

An example drawing of such a network is given in Figure 1.1. It is based on a study by Jorge Gil-Mendieta and Samuel Schmidt [30] of the network of influential Mexican politicians between 1910 and 1994. In the graph each politician is represented by a vertex, and edges connect pairs of politicians based on friendships, business partnerships, marriages, and similar relationships. In the figure, vertices are displayed as boxes and circles with a size proportional to their number of connections. The placement of the vertices was computed with the LinLog energy model [62], and edges are drawn as straight lines connecting related vertices.

In the figure the vertices are colored according to a clustering of the politicians. The clustering shown was constructed automatically by the algorithm developed in this work, which optimizes the modularity measure. Three major clusters are visible. These roughly correspond to three generational changes and to differences between military and civil backgrounds. Similar groups were also discovered by Gil-Mendieta and Schmidt, who derived groups manually by studying the importance and influence of single persons in the network using various special-purpose centrality indices.
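To make the idea of a quality measure concrete, the following is a minimal sketch of how Newman's modularity can be evaluated for a given clustering of a weighted, undirected graph. It uses the commonly cited form: the sum over clusters of the fraction of edge weight inside the cluster minus the squared fraction of total weighted degree in the cluster. This excerpt does not state the exact definition used in the work, so the formula variant, the function name `modularity`, and the toy graph are assumptions made for illustration only.

```python
from collections import defaultdict


def modularity(edges, cluster_of):
    """Newman-style modularity of a clustering (illustrative sketch).

    edges      -- iterable of (u, v, weight) tuples of an undirected graph
    cluster_of -- dict mapping each vertex to its cluster label
    """
    total = 0.0                    # total edge weight of the graph
    internal = defaultdict(float)  # edge weight inside each cluster
    degree = defaultdict(float)    # summed weighted degree of each cluster

    for u, v, w in edges:
        total += w
        degree[cluster_of[u]] += w
        degree[cluster_of[v]] += w
        if cluster_of[u] == cluster_of[v]:
            internal[cluster_of[u]] += w

    if total == 0:
        return 0.0

    # For each cluster: observed internal fraction minus the fraction
    # expected if edges were placed at random with the same degrees.
    return sum(
        internal[c] / total - (degree[c] / (2 * total)) ** 2
        for c in degree
    )


# Hypothetical toy graph: two triangles joined by a single edge.
edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
]
two_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {v: "a" for v in range(6)}

q_two = modularity(edges, two_clusters)
q_one = modularity(edges, one_cluster)
```

In this toy graph, splitting the vertices into the two triangles scores higher than placing all vertices in one cluster, which is the kind of comparison an optimization algorithm relies on when searching for a good clustering.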
