08.10.2016 Views

Foundations of Data Science


present and ask how correlated this percentage is with the presence or absence of edges at some distance d. One is interested in whether the correlation drops off with distance. To explore this concept we consider the Ising model studied in physics.

The Ising or ferromagnetic model is a pairwise random Markov field. The underlying graph, usually a lattice, assigns a value of ±1, called spin, to the variable at each vertex. The probability (Gibbs measure) of a given configuration of spins is proportional to

$$\exp\Bigl(\beta \sum_{(i,j)\in E} x_i x_j\Bigr) = \prod_{(i,j)\in E} e^{\beta x_i x_j}$$

where $x_i = \pm 1$ is the value associated with vertex $i$. Thus

$$p(x_1, x_2, \ldots, x_n) = \frac{1}{Z} \prod_{(i,j)\in E} \exp(\beta x_i x_j) = \frac{1}{Z}\, e^{\beta \sum_{(i,j)\in E} x_i x_j}$$

where $Z$ is a normalization constant.

The value of the summation is simply the number of edges whose vertices have the same spin minus the number of edges whose vertices have opposite spin. The constant β is viewed as inverse temperature. High temperature corresponds to a low value of β and low temperature corresponds to a high value of β. At high temperature, low β, the spins of adjacent vertices are uncorrelated, whereas at low temperature adjacent vertices have identical spins. The reason for this is that the probability of a configuration is proportional to $e^{\beta \sum_{i\sim j} x_i x_j}$. As β is increased, $e^{\beta \sum_{i\sim j} x_i x_j}$ increases more for configurations with a large number of edges whose vertices have identical spins than for configurations whose edges have vertices with non-identical spins. When the normalization constant $\frac{1}{Z}$ is adjusted for the new value of β, the highest probability configurations are those where adjacent vertices have identical spins.
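The effect of β can be checked numerically on a small graph. The following is a minimal sketch, not taken from the text: it enumerates all spin configurations by brute force, and the 2 × 3 grid `EDGES` and the helper names `gibbs_distribution` and `aligned_mass` are assumptions chosen only for illustration.

```python
import itertools
import math

def gibbs_distribution(n, edges, beta):
    """Return {configuration: probability} for spins in {-1, +1}^n."""
    weights = {}
    for x in itertools.product((-1, 1), repeat=n):
        # Number of agreeing edges minus number of disagreeing edges.
        s = sum(x[i] * x[j] for i, j in edges)
        weights[x] = math.exp(beta * s)
    z = sum(weights.values())  # normalization constant Z
    return {x: w / z for x, w in weights.items()}

# Hypothetical example graph: a 2 x 3 grid, vertices numbered row by row.
EDGES = [(0, 1), (1, 2), (3, 4), (4, 5), (0, 3), (1, 4), (2, 5)]

def aligned_mass(beta, n=6, edges=EDGES):
    """Total probability of the two configurations in which all spins agree."""
    p = gibbs_distribution(n, edges, beta)
    return p[(1,) * n] + p[(-1,) * n]

if __name__ == "__main__":
    for beta in (0.0, 0.5, 1.0, 2.0):
        print(f"beta={beta:.1f}  P(all spins equal)={aligned_mass(beta):.4f}")
```

At β = 0 every configuration is equally likely, so the two fully aligned configurations carry only 2/64 of the mass. As β grows, they take an increasing share, which is the behavior described above.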

Given the above probability distribution, what is the correlation between two variables $x_i$ and $x_j$? To answer this question, consider the probability that $x_i$ equals plus one as a function of the probability that $x_j$ equals plus one. If the probability that $x_i$ equals plus one is $\frac{1}{2}$ independent of the value of the probability that $x_j$ equals plus one, we say the values are uncorrelated.
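To see how this dependence behaves with distance, one can compute the conditional probability that $x_i = +1$ given $x_j = +1$ by brute force. The sketch below assumes a path graph on 8 vertices, a choice made here only because it is small enough to enumerate; the function name `prob_plus_given_plus` is likewise hypothetical. On a path with free ends the result should agree with the standard closed form $\frac{1}{2}\bigl(1 + \tanh^{k}\beta\bigr)$ for vertices at distance $k$.

```python
import itertools
import math

def prob_plus_given_plus(n, edges, beta, i, j):
    """P(x_i = +1 | x_j = +1) under the Gibbs measure, by enumeration."""
    joint = 0.0  # unnormalized weight of {x_i = +1 and x_j = +1}
    cond = 0.0   # unnormalized weight of {x_j = +1}
    for x in itertools.product((-1, 1), repeat=n):
        if x[j] != 1:
            continue
        w = math.exp(beta * sum(x[a] * x[b] for a, b in edges))
        cond += w
        if x[i] == 1:
            joint += w
    return joint / cond  # Z cancels in the ratio

# Assumed example graph: a path 0 - 1 - ... - 7.
N = 8
PATH_EDGES = [(k, k + 1) for k in range(N - 1)]

if __name__ == "__main__":
    beta = 0.5
    for dist in range(1, N):
        p = prob_plus_given_plus(N, PATH_EDGES, beta, dist, 0)
        print(f"distance {dist}: P(x_i=+1 | x_0=+1) = {p:.4f}")
```

The printed values approach 1/2 as the distance grows, so on a path the influence of one spin on another decays geometrically with distance.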

Consider the special case where the graph G is a tree. In this case a phase transition occurs at $\beta_0 = \frac{1}{2} \ln \frac{d+1}{d-1}$ where $d$ is the degree of the tree. For a sufficiently tall tree and for $\beta > \beta_0$, the probability that the root has value +1 is bounded away from $\frac{1}{2}$ and depends on whether the majority of leaves have value +1 or −1. For $\beta < \beta_0$ the probability that the root has value +1 is $\frac{1}{2}$ independent of the values at the leaves of the tree.
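As a quick numerical aid, the threshold can be evaluated for a few degrees. This is only a sketch of the formula stated above; the function name `critical_beta` is an assumption. Algebraically, $\frac{1}{2}\ln\frac{d+1}{d-1}$ is the inverse hyperbolic tangent of $1/d$, so the threshold is the value of β at which $\tanh \beta = 1/d$.

```python
import math

def critical_beta(d):
    """Threshold beta_0 = (1/2) ln((d + 1) / (d - 1)) for a tree of degree d > 1."""
    if d <= 1:
        raise ValueError("the formula requires d > 1")
    return 0.5 * math.log((d + 1) / (d - 1))

if __name__ == "__main__":
    for d in (2, 3, 5, 10):
        b0 = critical_beta(d)
        print(f"d={d:2d}  beta_0={b0:.4f}  tanh(beta_0)={math.tanh(b0):.4f}")
```

The threshold decreases as d increases: with more children per vertex, a weaker coupling is enough for the leaves to influence the root.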

Consider a height one tree of degree $d$. If $i$ of the leaves have spin +1 and $d − i$ have spin −1, then the probability of the root having spin +1 is proportional to

$$e^{i\beta - (d-i)\beta} = e^{(2i-d)\beta}.$$
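The proportionality can be turned into an actual probability by normalizing over the two possible root spins. The sketch below assumes, by the symmetry of the Gibbs measure, that the weight for a root spin of −1 is $e^{-(2i-d)\beta}$; this step and the function name `root_plus_probability` are not in the text above.

```python
import math

def root_plus_probability(d, i, beta):
    """P(root = +1) in a height-one tree with d leaves, i of them having spin +1.

    The weight for root = +1 is exp((2i - d) * beta); the weight for
    root = -1 is assumed to be exp(-(2i - d) * beta) by symmetry.
    """
    a = (2 * i - d) * beta
    # exp(a) / (exp(a) + exp(-a)), written in an equivalent logistic form.
    return 1.0 / (1.0 + math.exp(-2.0 * a))

if __name__ == "__main__":
    d, beta = 3, 1.0
    for i in range(d + 1):
        print(f"{i} of {d} leaves are +1: P(root=+1) = {root_plus_probability(d, i, beta):.4f}")
```

When the leaves are evenly split (2i = d) the probability is exactly 1/2, and it moves toward 1 or 0 as the majority of leaves becomes more lopsided or as β grows.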
