slides - SNAP - Stanford University

snap.stanford.edu


CS224W: Social and Information Network Analysis

Jure Leskovec, Stanford University

http://cs224w.stanford.edu


Power‐law degree distributions

What do power‐law degree networks look like?

Random network

(Erdos‐Renyi random graph)

Scale‐free (power‐law)

network

A function f is scale‐free if:

f(ax) = c f(x), where the constant c may depend on a but not on x

10/27/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 2


In the Preferential Attachment model, power‐law degrees naturally emerge [Albert‐Barabasi ‘99]

Nodes arrive in order

A new node j creates m out‐links

Prob. of linking to a node i is proportional to its degree d_i: P(j → i) ∝ d_i

Note: Pref. Attachment is not the only model to generate power‐law networks
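A minimal sketch of the growth process described above, to make the "proportional to degree" rule concrete. The seed graph (a small clique), the function names and the parameter values are assumptions for illustration, not part of the slides.

```python
import random
from collections import Counter


def preferential_attachment(n, m, seed=0):
    """Grow a graph of n nodes; each new node j creates m out-links.

    A target i is picked with probability proportional to its current degree
    by sampling uniformly from a list that holds every edge endpoint once.
    """
    rng = random.Random(seed)
    # Assumed seed graph: a clique on m + 1 nodes, so every node has degree > 0.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    endpoints = [v for e in edges for v in e]
    for j in range(m + 1, n):
        targets = set()
        while len(targets) < m:            # m distinct targets, no multi-edges
            targets.add(rng.choice(endpoints))
        for i in targets:
            edges.append((j, i))
            endpoints.extend((j, i))
    return edges


def degree_histogram(edges):
    """Map degree -> number of nodes with that degree."""
    degree = Counter(v for e in edges for v in e)
    return Counter(degree.values())


if __name__ == "__main__":
    hist = degree_histogram(preferential_attachment(n=10_000, m=2))
    for k in sorted(hist)[:10]:
        print(k, hist[k])                  # counts fall off quickly with degree k
```

Plotting the histogram on log‐log axes is the usual way to eyeball the heavy tail.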

What are other mechanisms giving power‐law

degree networks?



Preferential attachment:

Power‐law degree distributions

But no local clustering

Can we get multiple properties?

Node degrees:

Clustering coefficient:


Preferential attachment is a model of a

growing network

What governs the network

growth and evolution?

P1) Node arrival process:

When nodes enter the network

P2) Edge initiation process:

Each node decides when to initiate an edge

P3) Edge destination process:

The node determines destination of the edge



4 online social networks with exact edge arrival sequence: Flickr (F), Delicious (D), Answers (A), LinkedIn (L)

For every edge (u,v) we know the exact time of its appearance t_uv

Directly observe mechanisms leading to global network properties

[Leskovec et al. KDD 08]

and so on for millions…


(F) Flickr: Exponential

(D) Delicious: Linear

(A) Answers: Sub‐linear

(L) LinkedIn: Quadratic


How long do nodes live?

Node life‐time is the time between the 1st and the

last edge of a node

How often do nodes “wake up” to create edges?


LinkedIn

Lifetime a: time between node’s first and last edge

Node lifetime is exponential: p(a) = λ exp(‐λa)
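A small sketch of how the rate λ in p(a) = λ exp(‐λa) could be estimated from observed lifetimes. For an exponential distribution the maximum‐likelihood estimate is one over the sample mean. The function names and the synthetic rate below are illustrative assumptions, not values from the slides.

```python
import math
import random


def fit_exponential_rate(lifetimes):
    """Maximum-likelihood estimate of lambda in p(a) = lambda * exp(-lambda * a)."""
    if not lifetimes:
        raise ValueError("need at least one lifetime")
    return len(lifetimes) / sum(lifetimes)


def exponential_pdf(a, lam):
    """Density of a lifetime a under rate lam."""
    return lam * math.exp(-lam * a)


if __name__ == "__main__":
    rng = random.Random(0)
    true_lam = 0.01                        # illustrative value only
    sample = [rng.expovariate(true_lam) for _ in range(50_000)]
    print(fit_exponential_rate(sample))    # should land close to true_lam
```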


How often do nodes “wake up” to create edges?

Edge gap δ(d): time between the d‐th and (d+1)‐st edge of a node:

Let t_i(d) be the creation time of the d‐th edge of node i

δ_i(d) = t_i(d+1) ‐ t_i(d)

Then δ(d) is the distribution (histogram) of δ_i(d) over all nodes i
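A sketch of how the edge gaps could be collected from a timestamped edge list. It assumes that an edge (u, v, t) counts only toward the node u that initiates it; the input format and names are hypothetical.

```python
from collections import defaultdict


def edge_gaps(edge_times):
    """edge_times: iterable of (u, v, t). Returns {d: [delta_i(d) over nodes i]}.

    Assumption: only the initiating node u "wakes up", so the edge is
    appended to u's sequence of created edges.
    """
    created = defaultdict(list)
    for u, _v, t in edge_times:
        created[u].append(t)
    gaps = defaultdict(list)
    for times in created.values():
        times.sort()
        for d in range(1, len(times)):
            # times[d - 1] is t_i(d) and times[d] is t_i(d + 1)
            gaps[d].append(times[d] - times[d - 1])
    return dict(gaps)


if __name__ == "__main__":
    edges = [("a", "b", 0), ("a", "c", 5), ("b", "c", 7),
             ("a", "d", 6), ("b", "d", 10)]
    print(edge_gaps(edges))                # {1: [5, 3], 2: [1]}
```

Each list gaps[d] is the sample whose histogram gives δ(d).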


LinkedIn

Edge gap δ(d): inter‐arrival time between the d‐th and (d+1)‐st edge

Edge gaps follow a power law with exponential cutoff: p_g(δ(d); α, β) ∝ δ(d)^(‐α) e^(‐β δ(d))

For every d we get a different plot


As the degree of the node increases, how do α and β change?


α is const, β linear in d – gaps get smaller with d

p_g(δ; α, β, d) ∝ δ^(‐α) e^(‐β d δ)

[Figure: probability vs. edge gap, one curve for each degree d=1, d=2, d=3]
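A numeric sketch of why gaps get smaller with d, assuming the edge gap density has the form p_g(δ; α, β, d) ∝ δ^(‐α) e^(‐β d δ) (a power law with exponential cutoff). The discretization to integer gaps, the truncation at max_gap and the parameter values are assumptions made only for this illustration.

```python
import math


def edge_gap_pmf(alpha, beta, d, max_gap=10_000):
    """Discretized p_g on integer gaps 1..max_gap, normalized to sum to 1.

    Weight of gap g is g**(-alpha) * exp(-beta * d * g).
    """
    weights = [g ** -alpha * math.exp(-beta * d * g)
               for g in range(1, max_gap + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def mean_gap(alpha, beta, d, max_gap=10_000):
    """Expected gap under the discretized distribution."""
    pmf = edge_gap_pmf(alpha, beta, d, max_gap)
    return sum(g * p for g, p in enumerate(pmf, start=1))


if __name__ == "__main__":
    for d in (1, 2, 3):
        # The mean gap shrinks as d grows because the cutoff rate beta * d rises.
        print(d, round(mean_gap(alpha=0.8, beta=0.002, d=d), 1))
```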


Source node i wakes up and creates an edge

How does i select a target node j?

What is the degree of the target j?

Does preferential attachment really hold?

How many hops away is the target j?

Are edges attaching locally?


[w/ Backstrom‐Kumar‐Tomkins, KDD ’08]

Are edges more likely to connect to higher degree nodes?

p_e(k) ∝ k^τ

[Figures: p_e(k) vs. k for G_np, PA, Flickr]

Network τ

G_np 0

PA 1

Flickr 1

Delicious 1

Answers 0.9

LinkedIn 0.6
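One way p_e(k) could be measured from an edge arrival sequence is sketched below. The estimator (edges landing on degree‐k targets, divided by the number of degree‐k nodes available at each arrival) is an assumption of this sketch, as are the function name and input format; the slides do not spell out the exact normalization.

```python
from collections import Counter, defaultdict


def attachment_kernel(edge_sequence):
    """Estimate p_e(k): chance that a node of degree k receives the next edge.

    edge_sequence: list of (u, v) in arrival order, v is the target.
    Assumed estimator: (# edges whose target had degree k just before arrival)
    divided by (# nodes of degree k, summed over all counted arrivals).
    """
    degree = defaultdict(int)
    nodes_with_degree = Counter()          # degree value -> current node count
    hits = Counter()
    exposure = Counter()

    def bump(x):
        if x in degree:
            nodes_with_degree[degree[x]] -= 1
        degree[x] += 1
        nodes_with_degree[degree[x]] += 1

    for u, v in edge_sequence:
        if v in degree:                    # only count targets that already exist
            for k, count in nodes_with_degree.items():
                exposure[k] += count
            hits[degree[v]] += 1
        bump(u)
        bump(v)
    return {k: hits[k] / exposure[k] for k in hits}


if __name__ == "__main__":
    seq = [("a", "b"), ("c", "b"), ("d", "b"), ("e", "a")]
    print(attachment_kernel(seq))
```

The exponent τ would then be read off as the slope of log p_e(k) against log k.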


[w/ Backstrom‐Kumar‐Tomkins, KDD ’08]

Just before the edge (u,w) is placed, how many hops are between u and w?

[Figures: hop distance between u and w for G_np, PA, Flickr]

Real edges are local.

Most of them close triangles!

Fraction of triad‐closing edges:

Network % Δ

Flickr 66%

Delicious 28%

Answers 23%

LinkedIn 50%
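A sketch of how the fraction of triad‐closing edges could be computed from an edge arrival sequence. An edge (u,w) is treated as triad‐closing when u and w already share a neighbor, i.e. they were exactly 2 hops apart. Counting only edges between two already‐present, non‐adjacent nodes is an assumption of this sketch.

```python
from collections import defaultdict


def triad_closing_fraction(edge_sequence):
    """Fraction of counted edges (u, w) whose endpoints share a neighbor
    just before the edge is placed."""
    neighbors = defaultdict(set)
    closing = total = 0
    for u, w in edge_sequence:
        if u in neighbors and w in neighbors and w not in neighbors[u]:
            total += 1
            if neighbors[u] & neighbors[w]:    # common neighbor => closes a triangle
                closing += 1
        neighbors[u].add(w)
        neighbors[w].add(u)
    return closing / total if total else 0.0


if __name__ == "__main__":
    seq = [("a", "b"), ("b", "c"), ("a", "c"),
           ("c", "d"), ("d", "e"), ("a", "e")]
    print(triad_closing_fraction(seq))         # 0.5
```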


New triad‐closing edge (u,w) appears next

We model this as:

1. Choose u’s neighbor v

2. Choose v’s neighbor w

3. Connect (u,w)

Compute edge prob. under Random‐Random: p(u,w) = Σ_v (1/d_u)(1/d_v), summed over the common neighbors v of u and w

“Score” of a graph = p(u,w) combined over its triad‐closing edges (u,w)
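The two‐step choice above can be turned into an edge probability by summing over every intermediate node v that leads from u to w. The sketch below assumes both steps pick uniformly among all neighbors (so the second step may also step back to u); the adjacency‐dict format and the function name are hypothetical.

```python
from fractions import Fraction


def random_random_prob(adj, u, w):
    """Probability that the walk u -> v -> w ends at w when both steps
    pick a neighbor uniformly at random.

    adj: dict mapping each node to the set of its neighbors.
    """
    total = Fraction(0)
    for v in adj[u]:
        if w in adj[v]:
            total += Fraction(1, len(adj[u])) * Fraction(1, len(adj[v]))
    return total


if __name__ == "__main__":
    adj = {
        "u": {"v1", "v2"},
        "v1": {"u", "w"},
        "v2": {"u", "w", "x"},
        "w": {"v1", "v2"},
        "x": {"v2"},
    }
    # 1/2 * 1/2 (via v1) + 1/2 * 1/3 (via v2)
    print(random_random_prob(adj, "u", "w"))   # 5/12
```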


Improvement over the baseline:

Strategies to pick a neighbor:

random: uniformly at random

deg: proportional to its degree

com: prop. to the number of common friends

last: prop. to time since last activity

comlast: prop. to com*last

[Table axes: strategy to select v (1st node) vs. strategy to select w (2nd node)]


[w/ Backstrom‐Kumar‐Tomkins, KDD ’08]

Theorem: Exponential node lifetimes and

power‐law with exponential cutoff edge gaps

lead to power‐law degree distributions

Interesting as temporal behavior predicts

structural network property



[w/ Backstrom‐Kumar‐Tomkins, KDD ’08]

Node lifetime: p_l(a) = λ exp(‐λa)

Node of life‐time a, what is its final degree D?

What is the distribution of D as a func. of λ, α, β?

The 2 exp funcs “cancel”. Power‐law survives


The model of network evolution

Process Model

P1) Node arrival • Node arrival function is given

P2) Edge initiation • Node lifetime is exponential • Edge gaps get smaller as the degree increases

P3) Edge destination • Pick edge destination using random‐random


Given the model, one can take an existing network and continue its evolution

Compare true and predicted degree exponent:


How do networks evolve at the macro level?

What are global phenomena of network growth?

Questions:

What is the relation between the number of nodes

n(t) and number of edges e(t) over time t?

How does diameter change as the network grows?

How does degree distribution evolve as the

network grows?



N(t) … nodes at time t

E(t) … edges at time t

Suppose that

N(t+1) = 2 * N(t)

Q: what is E(t+1)?

A: over‐doubled!

But obeying the Densification Power Law

[Leskovec et al. KDD 05]



[w/ Kleinberg‐Faloutsos, KDD ’05]

What is the relation between the number of nodes and the edges over time?

Prior work assumes: constant average degree over time

Networks are denser over time

Densification Power Law: E(t) ∝ N(t)^a

a … densification exponent (1 ≤ a ≤ 2)

[Figures: E(t) vs. N(t); Internet, a=1.2; Citations, a=1.6]
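A sketch of how the densification exponent a could be estimated: on log‐log axes a relation of the form E(t) ∝ N(t)^a is a straight line, so a is the least‐squares slope of log E(t) against log N(t). The snapshot numbers below are synthetic and chosen only to illustrate the fit.

```python
import math


def fit_densification_exponent(nodes, edges):
    """Least-squares slope of log E(t) against log N(t)."""
    xs = [math.log(n) for n in nodes]
    ys = [math.log(e) for e in edges]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic snapshots: nodes double each step, edges follow E = 3 * N**1.5.
    nodes = [100, 200, 400, 800]
    edges = [3 * n ** 1.5 for n in nodes]
    print(fit_densification_exponent(nodes, edges))   # close to 1.5
```

With a = 1.5, doubling the nodes multiplies the edges by 2^1.5 ≈ 2.83, which is the "over‐doubled" behavior.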


Densification Power Law

[Leskovec et al. KDD 05]

the number of edges grows faster than the number of nodes – average degree is increasing

E(t) ∝ N(t)^a, or equivalently log E(t) = a log N(t) + const

a … densification exponent, 1 ≤ a ≤ 2:

a=1: linear growth – constant out‐degree

(traditionally assumed)

a=2: quadratic growth – clique



[w/ Kleinberg‐Faloutsos, KDD ’05]

Prior models and intuition say that the network diameter slowly grows (like log N, log log N)

Diameter shrinks over time

as the network grows the distances between the nodes slowly decrease

[Figures: diameter vs. size of the graph (Internet); diameter vs. time (Citations)]


[Leskovec et al. TKDD 07]

Is shrinking diameter just a consequence of densification?

[Figure: diameter vs. size of the graph for an Erdos‐Renyi random graph with densification exponent a=1.3]

Densifying random graph has increasing diameter

There is more to shrinking diameter than just densification


[Leskovec et al. TKDD 07]

Is it the degree sequence?

Compare diameter of a:

True network (red)

Random network with the same degree distribution (blue)

[Figure: diameter vs. size of the graph, Citations]

Densification + degree sequence give shrinking diameter


[Leskovec et al. TKDD 07]

How does degree distribution evolve to allow

for densification?

Option 1) Degree exponent γ is constant:

Fact 1: For degree exponent 1 < γ < 2: a = 2/γ

Email network


[Leskovec et al. TKDD 07]

How does degree distribution evolve to allow

for densification?

Option 2) Exponent γ_n evolves with graph size n:

Fact 2:

Citation network



[Leskovec et al. TKDD 07]

Let’s assume the community structure

One expects many within‐group friendships and fewer cross‐group ones

How hard is it to cross communities?

[Figure: self‐similar university community structure: University → Science, Arts; Science → CS, Math; Arts → Drama, Music]


Assume the cross‐community linking probability of nodes at tree‐distance h is:

f(h) = c^(‐h)

where: c ≥ 1 … the Difficulty constant

h … tree‐distance


n = 2^k nodes reside in the leaves of the b‐way community hierarchy (assume b=2)

Each node then independently creates edges based on the community hierarchy: f(h) = c^(‐h)

How many edges m are in a graph of n nodes?

Community tree evolves by a complete new level of nodes being added in each time step


[Leskovec et al. TKDD 07]

Claim: In the Community Guided Attachment graph model, the expected out‐degree of a node is proportional to:

n^(1 ‐ log_b(c)) if 1 ≤ c < b

log_b(n) if c = b

constant if c > b


[Leskovec et al. TKDD 07]

What is the link prob.: p(u,v) = c^(‐h(u,v))

What is the expected out‐degree of a node x?

How many nodes are at distance h?

Analyze separate cases:

Can also generalize the model to get power‐law degrees and densification [see TKDD 07]


Claim: The Community Guided Attachment leads to Densification Power Law with exponent: a = 2 ‐ log_b(c)

a … densification exponent

b … community tree branching factor

c … difficulty constant, 1 ≤ c ≤ b


DPL: a = 2 ‐ log_b(c)

Gives any non‐integer Densification exponent

If c = 1: easy to cross communities

Then: a=2, quadratic growth of edges – near clique

If c = b: hard to cross communities

Then: a=1, linear growth of edges – constant out‐degree
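A numeric check of the two limiting cases above, as a sketch. It computes the expected number of edges of Community Guided Attachment directly from the link probability c^(‐h) and measures the local log‐log slope of edges against nodes. The sketch assumes that tree‐distance h is the height of the lowest common ancestor of two leaves, so a leaf has (b‐1)·b^(h‐1) other leaves at distance h; under that assumption the slope should approach 2 ‐ log_b(c), which gives a=2 for c=1 and a=1 for c=b.

```python
import math


def expected_out_degree(b, c, k):
    """Expected out-degree of a leaf in a b-ary community tree of height k.

    A leaf has (b - 1) * b**(h - 1) other leaves at tree-distance h and
    links to each of them independently with probability c**(-h).
    """
    return sum((b - 1) * b ** (h - 1) * c ** -h for h in range(1, k + 1))


def expected_edges(b, c, k):
    """Expected number of edges with n = b**k leaves."""
    return b ** k * expected_out_degree(b, c, k)


def densification_exponent(b, c, k):
    """Local log-log slope of expected edges vs. nodes between levels k and k+1."""
    m0, m1 = expected_edges(b, c, k), expected_edges(b, c, k + 1)
    return math.log(m1 / m0) / math.log(b)


if __name__ == "__main__":
    for c in (1.0, math.sqrt(2), 2.0):
        # For c = b the slope approaches 1 only slowly (log-factor correction).
        print(round(c, 3), round(densification_exponent(2, c, 20), 3))
```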


[Leskovec et al. TKDD 07]

But, we do not want to have explicit communities

Want to model graphs that densify and have shrinking diameters

Intuition:

How do we meet friends at a party?

How do we identify references when writing papers?


[Leskovec et al. TKDD 07]

The Forest Fire model has 2 parameters:

p … forward burning probability

r … backward burning probability

The model:

Each turn a new node v arrives

Uniformly at random chooses an “ambassador” w

Flip 2 geometric coins to determine the number of in‐ and out‐links of w to follow

Fire spreads recursively until it dies

New node v links to all burned nodes
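A minimal sketch of the Forest Fire steps listed above. Several details are assumptions made for illustration: the geometric coin counts successes before the first failure, r is used directly as the backward burning probability, links to follow are sampled uniformly among unburned neighbors, and the fire is tracked with an explicit stack. The function names and parameter values are hypothetical.

```python
import random


def geometric(rng, prob):
    """Number of successes before the first failure; mean prob / (1 - prob)."""
    count = 0
    while rng.random() < prob:
        count += 1
    return count


def forest_fire(n, p, r, seed=0):
    """Return directed edges (v, target) of a Forest Fire graph on n nodes."""
    if not (0 <= p < 1 and 0 <= r < 1):
        raise ValueError("p and r must lie in [0, 1)")
    rng = random.Random(seed)
    out_links = {0: set()}
    in_links = {0: set()}
    edges = []
    for v in range(1, n):
        ambassador = rng.randrange(v)          # uniform over existing nodes
        burned = {ambassador}
        frontier = [ambassador]
        while frontier:                        # fire spreads until it dies
            w = frontier.pop()
            for links, prob in ((out_links[w], p), (in_links[w], r)):
                candidates = sorted(links - burned)
                rng.shuffle(candidates)
                for x in candidates[:geometric(rng, prob)]:
                    burned.add(x)
                    frontier.append(x)
        out_links[v], in_links[v] = set(burned), set()
        for x in burned:                       # v links to all burned nodes
            in_links[x].add(v)
            edges.append((v, x))
    return edges


if __name__ == "__main__":
    edges = forest_fire(n=2000, p=0.35, r=0.3)
    print(len(edges) / 2000)                   # average out-degree
```

Recording the edge count at several sizes n and fitting the log‐log slope is one way to look for densification in the generated graphs.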


Forest Fire generates graphs that densify and have shrinking diameter

[Figures: densification, E(t) vs. N(t), exponent 1.32; diameter vs. N(t)]


Forest Fire also generates graphs with Power‐Law degree distribution

[Figures: in‐degree (log count vs. log in‐degree); out‐degree (log count vs. log out‐degree)]


Fix backward probability r and vary forward burning probability p

Notice a sharp transition between sparse and clique‐like graphs

Sweet spot is very narrow

[Figure labels: increasing diameter, sparse graph, clique‐like graph, constant diameter, decreasing diameter]
