11.07.2015 Views

Upgrade Report - Department of Informatics - King's College London



4 NETWORK PROPERTIES AND METRICS

Where $a_{ij}$ is the entry $(i, j)$ in the adjacency matrix of the graph and $\alpha(V_c)$ is the number of edges incident to $V_c$. The conductance of a graph $G$ is defined as:

$$\phi_G = \min_{V_c \subset V,\; |V_c| \le |V|/2} \phi(V_c) \tag{4.7}$$

4.6 Communities: Centrality and modularity

As suggested by the work of M. Girvan et al. [26], we can detect community structures using alternative network measures such as betweenness centrality and closeness centrality. The idea is to partition the graph by removing edges with high betweenness centrality, where the betweenness centrality $C_B$ measures how many optimal (shortest) paths go through a specific vertex or edge. More formally:

$$C_B(u) = \sum_{s \neq u \neq t \in V} \frac{\sigma_{st}(u)}{\sigma_{st}} \tag{4.8}$$

Where $\sigma_{st}$ is the number of shortest paths from $s$ to $t$, and $\sigma_{st}(u)$ is the number of shortest paths from $s$ to $t$ that pass through a vertex $u$.

There is also a measure suggested by J. Newman et al. [47], called modularity, which gives a quantitative measurement of the quality of a given community partitioning of a graph. This measure effectively compares the number of edges which are present within a given partition of the graph with the expected number of edges that would be found in a random graph with the same number of vertices and edges.

It is worth mentioning that the textbook definition of modularity, as given in [47], is shown in Equation 4.9.

$$Q = \sum_i \left( e_{ii} - a_i^2 \right), \qquad a_i = \sum_j e_{ij} \tag{4.9}$$

In the above, $a_i$ indicates the fraction of edges that connect to vertices in community $i$, and $e_{ii}$ the fraction of edges which are within community $i$ (i.e. with a source and target within community $i$).
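As an illustration of Equation 4.9, the following is a minimal sketch of how Q could be computed for an undirected, unweighted graph. It is not taken from the report: the function name `modularity`, the edge-list representation and the toy graph (two triangles joined by a single edge) are assumptions made for the example. A between-community edge is assumed to contribute half of its weight to each of $e_{ij}$ and $e_{ji}$, so that $a_i$ is the fraction of edge ends attached to community $i$.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Return Q = sum_i (e_ii - a_i^2) for an undirected, unweighted graph.

    edges        -- list of (u, v) pairs, one entry per undirected edge
    community_of -- dict mapping each vertex to its community label
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")

    e_within = defaultdict(float)  # e_ii: fraction of edges inside community i
    a = defaultdict(float)         # a_i: fraction of edge ends in community i

    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        if cu == cv:
            e_within[cu] += 1.0 / m
        # Each edge has two ends; each end adds 1/(2m) to its community.
        a[cu] += 0.5 / m
        a[cv] += 0.5 / m

    return sum(e_within.get(c, 0.0) - a[c] ** 2 for c in a)


# Hypothetical toy graph: two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community_of = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}

print(round(modularity(edges, community_of), 4))  # 0.3571 (= 5/14)
```

With this split, each community holds 3 of the 7 edges and half of all edge ends, so Q = 2 (3/7 − 1/4) = 5/14. Placing every vertex in a single community gives Q = 0, since then $e_{ii} = a_i = 1$.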
As stated in the literature, when the number of within-community edges in a network is no greater than the number expected in a graph with the same vertices and community split but randomly placed edges, the modularity is 0, while values approaching 1.0 indicate a very strong community structure. In practice we will consider values over 0.3 to be significantly higher than the expected values, denoting a community structure.

The problem with utilizing the modularity measurement, or partitioning in general, is that finding the optimal separation is NP-hard. Recent work on approximation techniques [6] has made the estimation of near-optimal communities feasible even on very large graphs. This was achieved by iteratively optimizing the modularity of a given partition by moving vertices into other communities, effectively producing a near-optimal partitioning in polynomial time.

4.7 Communities: Overlapping communities

In related work by N. Mishra et al. [44] it was suggested that the above criteria are not necessarily sufficient to effectively represent the community structure which is apparent in OSNs. In their work it was suggested that communities need not be disjoint and may, in fact, be heavily overlapping. We find this intuitively reasonable: an average user of such a network has an array of interests, each belonging to a different category (e.g. sports, politics, region, religion), so OSN users belong to multiple communities based on their interests. They introduce the concept of $(\alpha - \beta)$-communities, which are formally defined as follows:

Definition 1. Given a graph $G = \{V, E\}$ in which every vertex has a self-loop, $C \subset V$ is an $(\alpha - \beta)$-cluster if it is:
