# CS 322: (Social and Information) Network Analysis

Jure Leskovec, Stanford University

Want to generate realistic networks: given a real network, generate a synthetic network.

Why synthetic graphs? To compare graph properties (e.g., degree distribution), and for anomaly detection, simulations, predictions, null models, sharing privacy-sensitive graphs, …

Q: Which network properties do we care about?

Q: What is a good model and how do we fit it?

11/12/2009, Jure Leskovec, Stanford CS322: Network Analysis

Intuition: self-similarity leads to power laws.

Try to mimic recursive graph/community growth. There are many obvious (but wrong) ways to do so: take an initial graph and expand it recursively.

The Kronecker product is a way of generating self-similar matrices.


[Figure: recursive expansion of a 3x3 adjacency matrix, through an intermediate stage, into a 9x9 adjacency matrix. From PKDD '05.]

The Kronecker product of an N x M matrix A and a K x L matrix B is the N*K x M*L block matrix

  A ⊗ B = [ a_11·B  …  a_1M·B
            …
            a_N1·B  …  a_NM·B ]

We define the Kronecker product of two graphs as the Kronecker product of their adjacency matrices.
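As a quick sanity check, the product can be computed with NumPy's `np.kron` (a minimal sketch; the 2x2 initiator is illustrative, not one of the matrices from the slides):

```python
import numpy as np

# Illustrative 2x2 initiator adjacency matrix
K1 = np.array([[1, 1],
               [1, 0]])

# Kronecker product of the graph with itself: a 4x4 adjacency matrix
K2 = np.kron(K1, K1)

print(K2.shape)  # (4, 4)
print(K2)
```

Each 1-entry of K1 is replaced by a copy of K1 and each 0-entry by a zero block, which is exactly the recursive self-similarity the slides describe.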


[Leskovec et al. PKDD '05]

A Kronecker graph is a growing sequence of graphs obtained by iterating the Kronecker product. Each Kronecker multiplication exponentially increases the size of the graph: K_k has N1^k nodes and E1^k edges, so we get densification.

One can easily use multiple initiator matrices (K1', K1'', K1''') that can be of different sizes.
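The exponential growth can be sketched directly (illustrative initiator; edges are counted as nonzero adjacency-matrix entries, self-loops included):

```python
import numpy as np

def kronecker_power(K1, k):
    """Iterate the Kronecker product k times: K_k = K1 (x) ... (x) K1."""
    K = K1
    for _ in range(k - 1):
        K = np.kron(K, K1)
    return K

K1 = np.array([[1, 1],
               [1, 0]])
N1, E1 = K1.shape[0], int(K1.sum())

for k in range(1, 4):
    Kk = kronecker_power(K1, k)
    # Nodes grow as N1^k and edges as E1^k, hence densification
    print(k, Kk.shape[0], int(Kk.sum()), N1 ** k, E1 ** k)
```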


Continuing to multiply with K1, we obtain K4 and so on …

[Figure: the K4 adjacency matrix, showing self-similar structure. From PKDD '05.]


[PKDD '05]

Kronecker graphs have many properties found in real networks.

Properties of static networks:
- Power-law-like degree distribution
- Power-law eigenvalue and eigenvector distributions
- Constant diameter

Properties of dynamic networks:
- Densification power law
- Shrinking/stabilizing diameter

[PKDD '05]

Theorem (constant diameter): if G1 has diameter d, then graph Gk also has diameter d.

Observation: edges in Kronecker graphs satisfy (X_ij, X_kl) ∈ G ⊗ G iff (X_i, X_k) ∈ G and (X_j, X_l) ∈ G, where the X are appropriate nodes.

[PKDD '05]

Stochastic Kronecker graphs: create an N1 x N1 probability matrix P1, compute the k-th Kronecker power Pk, and for each entry puv of Pk include the edge (u, v) with probability puv (flip biased coins).

Example:

  P1 = [ 0.5  0.2     P2 = P1 ⊗ P1 = [ 0.25  0.10  0.10  0.04
         0.1  0.3 ]                     0.05  0.15  0.02  0.06
                                        0.05  0.02  0.15  0.06
                                        0.01  0.03  0.03  0.09 ]

Flipping a biased coin for each entry pij of P2 yields an instance matrix K2.
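A minimal sketch of this generation procedure (naive O(N^2) coin flipping; real implementations avoid materializing the full Pk for large graphs):

```python
import numpy as np

def stochastic_kronecker(P1, k, rng):
    """Naively sample a stochastic Kronecker graph: one biased coin per entry."""
    # k-th Kronecker power of the initiator probability matrix
    Pk = P1
    for _ in range(k - 1):
        Pk = np.kron(Pk, P1)
    # Include edge (u, v) with probability Pk[u, v]
    return (rng.random(Pk.shape) < Pk).astype(int)

rng = np.random.default_rng(0)
P1 = np.array([[0.5, 0.2],
               [0.1, 0.3]])
A = stochastic_kronecker(P1, 2, rng)
print(A.shape)  # (4, 4)
```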

[Leskovec et al. '09]

The initiator matrix K1 can be read as a similarity matrix. Node vi is described by k binary attributes v_i(1), …, v_i(k), and the probability of a link between nodes vi and vj is

  P(vi, vj) = ∏_a K1[vi(a), vj(a)]

Example: with K1 = [ a  b ; c  d ] indexed by attribute values 0 and 1, the nodes v2 = (0,1) and v3 = (1,0) link with probability P(v2, v3) = K1[0,1] · K1[1,0] = b·c.
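The attribute-product view can be sketched as follows (the numeric values standing in for a, b, c, d are illustrative):

```python
import numpy as np

def link_prob(K1, vi, vj):
    """P(vi, vj) = product over attributes a of K1[vi(a), vj(a)]."""
    p = 1.0
    for a, b in zip(vi, vj):
        p *= K1[a, b]
    return p

# K1 = [[a, b], [c, d]] with illustrative numeric entries
K1 = np.array([[0.9, 0.5],
               [0.5, 0.1]])

v2, v3 = (0, 1), (1, 0)
print(link_prob(K1, v2, v3))  # b * c = 0.5 * 0.5 = 0.25
```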

Given a real network G, we want to estimate the initiator matrix K1 = [ a  b ; c  d ]. Three approaches:

- Method of moments [Owen '09]: compare counts of small subgraphs (e.g., edges, stars, triangles) in G and in the model, and solve the resulting system of equations.
- Maximum likelihood [Leskovec-Faloutsos ICML '07]: arg max_K1 P(G | K1).
- SVD [Van Loan-Pitsianis '93]: solve min_K1 || G - K1 ⊗ K1 ||_F using the SVD.
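The SVD line refers to the nearest Kronecker product problem. Below is a sketch of the Van Loan-Pitsianis rearrangement in its unconstrained form (it recovers two separate factors B and C rather than a single tied K1, so it is illustrative rather than the exact fitting procedure):

```python
import numpy as np

def nearest_kron(A, p, q, r, s):
    """Find B (p x q) and C (r x s) minimizing ||A - kron(B, C)||_F.
    Van Loan-Pitsianis: each r x s block of A becomes a row of a
    rearranged matrix R; the best rank-1 factorization of R gives
    vec(B) and vec(C)."""
    R = np.zeros((p * q, r * s))
    for i in range(p):
        for j in range(q):
            block = A[i * r:(i + 1) * r, j * s:(j + 1) * s]
            R[i * q + j, :] = block.reshape(-1)  # row-major vec of the block
    U, S, Vt = np.linalg.svd(R)
    B = np.sqrt(S[0]) * U[:, 0].reshape(p, q)
    C = np.sqrt(S[0]) * Vt[0, :].reshape(r, s)
    return B, C
```

If A is exactly a Kronecker product, R has rank 1 and the factors are recovered exactly (up to a sign that cancels in kron(B, C)).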

[Leskovec-Faloutsos ICML '07]

Maximum likelihood estimation: arg max_K1 P(G | K1).

Naïve estimation takes O(N! · N^2) time:
- N! for the different node labelings. Our solution: Metropolis sampling reduces the N! (big) factor to a constant number of samples.
- N^2 for traversing the graph adjacency matrix. Our solution: exploiting the Kronecker product structure reduces N^2 to O(E), and real graphs are sparse (E << N^2).

Maximum likelihood estimation: given a real graph G, estimate the Kronecker initiator Θ (e.g., a 2x2 matrix) that achieves

  arg max_Θ P(G | Θ)

We need to (efficiently) calculate P(G | Θ) and maximize over Θ (e.g., using gradient descent).

[ICML '07]

Given a graph G and a Kronecker initiator Θ, we calculate the probability that Θ generated G:

  P(G | Θ) = ∏_{(u,v) ∈ G} Θ_k[u,v] · ∏_{(u,v) ∉ G} (1 - Θ_k[u,v])

where Θ_k is the k-th Kronecker power of Θ. For example:

  Θ = [ 0.5  0.2     Θ_k = [ 0.25  0.10  0.10  0.04     G = [ 1 0 1 1
        0.1  0.3 ]           0.05  0.15  0.02  0.06           0 1 0 1
                             0.05  0.02  0.15  0.06           1 0 1 1
                             0.01  0.03  0.03  0.09 ]         1 1 1 1 ]
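The likelihood above can be computed densely in a few lines (a minimal O(N^2) sketch, fine for tiny graphs; the slides later reduce this cost):

```python
import numpy as np

def log_likelihood(Theta_k, G):
    """log P(G | Theta): log Theta_k[u,v] summed over edges plus
    log(1 - Theta_k[u,v]) summed over non-edges."""
    edge_term = np.sum(np.log(Theta_k[G == 1]))
    nonedge_term = np.sum(np.log(1.0 - Theta_k[G == 0]))
    return edge_term + nonedge_term

Theta = np.array([[0.5, 0.2],
                  [0.1, 0.3]])
Theta_k = np.kron(Theta, Theta)
G = np.array([[1, 0, 1, 1],
              [0, 1, 0, 1],
              [1, 0, 1, 1],
              [1, 1, 1, 1]])
print(log_likelihood(Theta_k, G))
```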

[ICML '07]

Nodes are unlabeled, so two isomorphic graphs G' and G'' (the same graph with node labels permuted) should have the same probability: P(G' | Θ) = P(G'' | Θ).

One therefore needs to consider all node correspondences σ:

  P(G | Θ) = Σ_σ P(G | Θ, σ) P(σ)

All correspondences are a priori equally likely, and there are O(N!) of them.

[ICML '07]

Assume we solved the correspondence problem, i.e., we are given a node labeling σ. Calculating

  P(G | Θ, σ) = ∏_{(u,v) ∈ G} Θ_k[σ_u, σ_v] · ∏_{(u,v) ∉ G} (1 - Θ_k[σ_u, σ_v])

takes O(N^2) time, which is infeasible for large graphs (N ~ 10^5).

[ICML '07]

The log-likelihood is log P(G | Θ) = log Σ_σ P(G | Θ, σ) P(σ), and its gradient can be written as an expectation over the posterior P(σ | G, Θ). So we sample permutations σ from P(σ | G, Θ) and average the gradients.

[ICML '07]

Metropolis sampling:
- Start with a random permutation σ.
- Do local moves on the permutation (e.g., swap node labels 1 and 4).
- If the new permutation σ' is better (gives higher likelihood), accept it; else accept with probability proportional to the ratio of likelihoods (no need to calculate the normalizing constant!).

Each move can be computed efficiently: we only need to account for changes in 2 rows / columns.
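A sketch of the sampler (for clarity it recomputes the full likelihood each step; the algorithm in the slides updates only the two swapped rows/columns):

```python
import numpy as np

def metropolis_permutation(Theta_k, G, n_steps, rng):
    """Metropolis sampling over node labelings sigma."""
    N = G.shape[0]
    sigma = rng.permutation(N)

    def loglik(s):
        # log P(G | Theta, s): permute Theta_k by the labeling s
        P = Theta_k[np.ix_(s, s)]
        return np.sum(np.log(np.where(G == 1, P, 1.0 - P)))

    ll = loglik(sigma)
    for _ in range(n_steps):
        i, j = rng.choice(N, size=2, replace=False)
        cand = sigma.copy()
        cand[i], cand[j] = cand[j], cand[i]  # local move: swap two labels
        ll_cand = loglik(cand)
        # Always accept improvements; otherwise accept with
        # probability equal to the likelihood ratio
        if np.log(rng.random()) < ll_cand - ll:
            sigma, ll = cand, ll_cand
    return sigma, ll

rng = np.random.default_rng(0)
Theta = np.array([[0.5, 0.2], [0.1, 0.3]])
Theta_k = np.kron(Theta, Theta)
G = (rng.random((4, 4)) < Theta_k).astype(int)
sigma, ll = metropolis_permutation(Theta_k, G, 200, rng)
print(sigma, ll)
```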

Metropolis permutation sampling requires efficiently calculating the likelihood ratios. But the permutations σ(i) and σ(i+1) only differ at 2 positions, so we only traverse and update 2 rows (columns) of Θ_k, and the likelihood ratio can be evaluated efficiently.

[ICML '07]

Calculating P(G | Θ, σ) naïvely takes O(N^2). Idea:
- First calculate the likelihood of an empty graph, i.e., a graph with 0 edges.
- Then correct the likelihood for the edges that we observe in the graph.

By exploiting the structure of the Kronecker product we obtain a closed form for the likelihood of an empty graph.

[ICML '07]

We approximate the likelihood by the empty-graph likelihood plus per-edge corrections. The sum then goes only over the edges, so evaluating P(G | Θ, σ) takes O(E) time; real graphs are sparse (E << N^2), which is why we first calculate the likelihood of the empty graph.

The probability of an edge (i, j) is in general p_ij = θ1^a θ2^b θ3^c θ4^d. Using the Taylor approximation

  log(1 - x) ≈ -x - 0.5 x^2

for the no-edge terms and summing the resulting multinomial series, we obtain a closed form for the no-edge (empty-graph) likelihood; the observed edges then contribute the edge-likelihood corrections.
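A sketch of the O(E) evaluation (the closed forms below follow from sums over Kronecker powers: the p_uv sum to (ΣΘ)^k and their squares to (ΣΘ^2)^k; the example edge list is illustrative):

```python
import numpy as np

def p_edge(Theta, k, u, v):
    """p_uv as a product over the k base-n digit pairs of (u, v)."""
    n = Theta.shape[0]
    p = 1.0
    for _ in range(k):
        p *= Theta[u % n, v % n]
        u //= n
        v //= n
    return p

def loglik_sparse(Theta, k, edges):
    """Approximate log P(G | Theta, sigma) in O(E * k) using the
    Taylor expansion log(1 - x) ~ -x - 0.5 x^2."""
    # Closed-form empty-graph term
    ll = -(Theta.sum() ** k) - 0.5 * ((Theta ** 2).sum() ** k)
    for u, v in edges:
        p = p_edge(Theta, k, u, v)
        ll += p + 0.5 * p * p  # undo the approximated no-edge term
        ll += np.log(p)        # add the edge term
    return ll

Theta = np.array([[0.5, 0.2],
                  [0.1, 0.3]])
edges = [(0, 0), (0, 2), (2, 2), (3, 3)]
print(loglik_sparse(Theta, 2, edges))
```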

[ICML '07]

Experimental setup: given a real graph, run stochastic gradient descent from a random initial point to obtain estimated parameters, then generate synthetic graphs and compare the properties of both graphs.

Note that we do not fit the properties themselves; we fit the likelihood and then compare the graph properties.

Can gradient descent recover the true parameters? How nice (smooth, without local minima) is the optimization space?

Experiment: generate a graph from random parameters, then start at a random point and use gradient descent. We recover the true parameters 98% of the time.

[Figure: convergence of log-likelihood, average absolute error, 1st eigenvalue, and diameter over gradient descent iterations.]

[Leskovec-Faloutsos ICML '07]

Real and Kronecker are very close. The estimated initiator:

  K1 = [ 0.99  0.54
         0.49  0.13 ]

[Leskovec et al. Arxiv '09]

What do the estimated parameters tell us about the network structure? In K1 = [ a  b ; c  d ], entry a governs edges within one group, d edges within the other, and b, c the edges between the two groups.

[Leskovec et al. Arxiv '09]

Nested core-periphery: with

  K1 = [ 0.9  0.5
         0.5  0.1 ]

we get 0.9 edges within the core, 0.1 edges within the periphery, and 0.5 edges between core and periphery, recursively at every scale.

[Leskovec et al. Arxiv '09]

Small and large networks are very different:

  K1 = [ 0.99  0.17     vs.   K1 = [ 0.99  0.54
         0.17  0.82 ]                0.49  0.13 ]
