Foundations of Data Science

p(a) = 1/2     p(b) = 1/4     p(c) = 1/8     p(d) = 1/8

[Figure 5.2 shows the graph on vertices a, b, c, d with edges (a, b), (a, c), (a, d), (b, c), and (c, d), each vertex labeled with its desired probability, together with the transition probabilities listed below.]

a → b: (1/3)·(1/4)/(1/2) = 1/6        c → a: 1/3
a → c: (1/3)·(1/8)/(1/2) = 1/12       c → b: 1/3
a → d: (1/3)·(1/8)/(1/2) = 1/12       c → d: 1/3
a → a: 1 - 1/6 - 1/12 - 1/12 = 2/3    c → c: 1 - 1/3 - 1/3 - 1/3 = 0
b → a: 1/3                            d → a: 1/3
b → c: (1/3)·(1/8)/(1/4) = 1/6        d → c: 1/3
b → b: 1 - 1/3 - 1/6 = 1/2            d → d: 1 - 1/3 - 1/3 = 1/3

p(a) = p(a)p(a → a) + p(b)p(b → a) + p(c)p(c → a) + p(d)p(d → a)
     = (1/2)(2/3) + (1/4)(1/3) + (1/8)(1/3) + (1/8)(1/3) = 1/2
p(b) = p(a)p(a → b) + p(b)p(b → b) + p(c)p(c → b)
     = (1/2)(1/6) + (1/4)(1/2) + (1/8)(1/3) = 1/4
p(c) = p(a)p(a → c) + p(b)p(b → c) + p(c)p(c → c) + p(d)p(d → c)
     = (1/2)(1/12) + (1/4)(1/6) + (1/8)(0) + (1/8)(1/3) = 1/8
p(d) = p(a)p(a → d) + p(c)p(c → d) + p(d)p(d → d)
     = (1/2)(1/12) + (1/8)(1/3) + (1/8)(1/3) = 1/8

Figure 5.2: Using the Metropolis-Hasting algorithm to set probabilities for a random walk
so that the stationary probability will be the desired probability.

Example: Consider the graph in Figure 5.2. Using the Metropolis-Hasting algorithm,
assign transition probabilities so that the stationary probability of a random walk is
p(a) = 1/2, p(b) = 1/4, p(c) = 1/8, and p(d) = 1/8. The maximum degree of any vertex is three,
so at a, the probability of taking the edge (a, b) is (1/3)·(1/4)/(1/2), or 1/6. The probability
of taking the edge (a, c) is (1/3)·(1/8)/(1/2), or 1/12, and of taking the edge (a, d) is
(1/3)·(1/8)/(1/2), or 1/12. Thus, the probability of staying at a is 2/3. The probability of
taking the edge from b to a is 1/3. The probability of taking the edge from c to a is 1/3,
and the probability of taking the edge from d to a is 1/3. Thus, the stationary probability
of a is (1/4)·(1/3) + (1/8)·(1/3) + (1/8)·(1/3) + (1/2)·(2/3) = 1/2, which is the desired
probability.
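To see the computation end to end, here is a minimal Python sketch (not part of the text) of the rule used above: from a vertex x, take an edge to an adjacent vertex y with probability (1/r) min(1, p(y)/p(x)), where r is the maximum degree, and stay at x with the leftover probability. The variable names and the numpy-based check are illustrative choices; the graph and probabilities are those of Figure 5.2.

    import numpy as np

    # Desired stationary probabilities and adjacency of the graph in Figure 5.2.
    p = {'a': 1/2, 'b': 1/4, 'c': 1/8, 'd': 1/8}
    neighbors = {'a': ['b', 'c', 'd'], 'b': ['a', 'c'],
                 'c': ['a', 'b', 'd'], 'd': ['a', 'c']}
    r = max(len(adj) for adj in neighbors.values())     # maximum degree, here 3

    vertices = list(p)
    idx = {v: i for i, v in enumerate(vertices)}
    P = np.zeros((len(vertices), len(vertices)))
    for x in vertices:
        for y in neighbors[x]:
            # Metropolis-Hasting edge probability: (1/r) min(1, p(y)/p(x)).
            P[idx[x], idx[y]] = min(1.0, p[y] / p[x]) / r
        P[idx[x], idx[x]] = 1.0 - P[idx[x]].sum()       # leftover mass: stay at x

    pi = np.array([p[v] for v in vertices])
    print(P)                        # rows match the table in Figure 5.2
    print(np.allclose(pi @ P, pi))  # True: the desired distribution is stationary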

5.2.2 Gibbs Sampling<br />

Gibbs sampling is another Markov Chain Monte Carlo method to sample from a
multivariate probability distribution. Let p(x) be the target distribution, where
x = (x_1, ..., x_d). Gibbs sampling consists of a random walk on an undirected graph whose
vertices correspond to the values of x = (x_1, ..., x_d) and in which there is an edge from
x to y if x and y differ in only one coordinate. Thus, the underlying graph is like a
d-dimensional lattice, except that all vertices agreeing in every coordinate but one are
adjacent to each other, so each coordinate line forms a clique.

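As a concrete illustration of this one-coordinate-at-a-time walk, the following minimal Python sketch performs the standard Gibbs update, resampling a single coordinate from the target distribution conditioned on the remaining coordinates. The small two-dimensional table of weights is a hypothetical target chosen only for illustration, not taken from the text.

    import random

    # Hypothetical unnormalized target p(x1, x2) on {0, 1, 2} x {0, 1, 2}.
    weights = {(i, j): (i + 1) * (j + 2) for i in range(3) for j in range(3)}

    def gibbs_step(x):
        """One Gibbs update: resample a single coordinate of x from its
        conditional distribution given the remaining coordinates."""
        x = list(x)
        k = random.randrange(len(x))                  # coordinate to resample
        candidates = [tuple(x[:k] + [v] + x[k + 1:]) for v in range(3)]
        w = [weights[y] for y in candidates]          # proportional to the conditional
        return random.choices(candidates, weights=w)[0]

    # Successive states differ in at most one coordinate; long-run visit
    # frequencies approach the normalized target distribution.
    state, counts = (0, 0), {}
    for _ in range(100_000):
        state = gibbs_step(state)
        counts[state] = counts.get(state, 0) + 1
    print({s: round(c / 100_000, 3) for s, c in sorted(counts.items())})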
