08.10.2016 Views

Foundations of Data Science

2dLYwbK


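[Figure: a bipartite graph with factor nodes $x_1 + x_2 + x_3$, $x_1 + \bar{x}_2$, $x_1 + \bar{x}_3$, $\bar{x}_2 + \bar{x}_3$ in the top row and variable nodes $x_1$, $x_2$, $x_3$ in the bottom row.]

Figure 9.2: The factor graph for the function $f(x_1, x_2, x_3) = (x_1 + x_2 + x_3)(x_1 + \bar{x}_2)(x_1 + \bar{x}_3)(\bar{x}_2 + \bar{x}_3)$.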

In general, the $f_i$ are not convex; indeed, they may be discrete. So the minimization cannot be carried out by any known polynomial-time algorithm. The most widely used forms of the Markov random field involve sets $S_i$ that are cliques of a graph. So we make the following definition.

A Markov Random Field consists of an undirected graph and an associated function that factorizes into functions associated with the cliques of the graph. The special case when all the factors correspond to cliques of size one or two is of interest.
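For the special case of cliques of size one or two, the factorization can be written out explicitly. The notation below ($V$ and $E$ for the vertex and edge sets of the graph, $\phi_i$ and $\phi_{ij}$ for the factors) is an assumption introduced for illustration, not notation taken from the text:

```latex
f(x_1, \dots, x_n) \;=\; \prod_{i \in V} \phi_i(x_i) \prod_{\{i,j\} \in E} \phi_{ij}(x_i, x_j)
```

Each single vertex is a clique of size one and each edge is a clique of size two, so every factor here is associated with a clique of the graph.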

9.6 Factor Graphs

Factor graphs arise when we have a function $f$ of variables $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ that can be expressed as $f(\mathbf{x}) = \prod_{\alpha} f_\alpha(x_\alpha)$, where each factor depends only on some small number of variables $x_\alpha$. The difference from Markov random fields is that the variables corresponding to factors do not necessarily form a clique. Associate a bipartite graph where one set of vertices corresponds to the factors and the other set to the variables. Place an edge between a variable and a factor if the factor contains that variable. See Figure 9.2.
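As a minimal sketch, the bipartite construction can be carried out for the function in Figure 9.2. The sketch assumes the variables take values 0 or 1, reads $+$ as ordinary addition, and reads $\bar{x}$ as $1 - x$. These readings, the factor labels, and the names `factors`, `edges`, and `f` are illustrative assumptions. The graph structure itself does not depend on how the factors are evaluated.

```python
# Each factor: (label, variables it contains, function of those variables).
# Assumption: variables are 0/1 and a barred variable is written as (1 - x).
factors = [
    ("x1+x2+x3", ("x1", "x2", "x3"), lambda x1, x2, x3: x1 + x2 + x3),
    ("x1+~x2",   ("x1", "x2"),       lambda x1, x2: x1 + (1 - x2)),
    ("x1+~x3",   ("x1", "x3"),       lambda x1, x3: x1 + (1 - x3)),
    ("~x2+~x3",  ("x2", "x3"),       lambda x2, x3: (1 - x2) + (1 - x3)),
]

# Bipartite edge set: one edge per (factor, variable) pair
# in which the factor contains the variable.
edges = [(label, v) for label, scope, _ in factors for v in scope]


def f(assignment):
    """Evaluate f as the product of its factors at a dict of variable values."""
    value = 1
    for _, scope, fn in factors:
        value *= fn(*(assignment[v] for v in scope))
    return value
```

Every edge joins a factor to a variable, never two factors or two variables, which is what makes the graph bipartite.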

9.7 Tree Algorithms

Let $f(\mathbf{x})$ be a function that is a product of factors. When the factor graph is a tree, there are efficient algorithms for solving certain problems. With slight modifications, the algorithms presented can also solve problems where the function is the sum of terms rather than a product of factors.
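To illustrate why tree structure helps, the sketch below sums a product of two factors over all assignments in two ways: by brute force, and by pushing each sum inside the product. The factors `fa` and `fb`, their values, and the restriction to binary variables are hypothetical choices made for this example. The factor graph is the path x1, fa, x2, fb, x3, which is a tree.

```python
from itertools import product


# Hypothetical factors on binary variables (illustrative values only).
def fa(x1, x2):
    return 1 + x1 + 2 * x2


def fb(x2, x3):
    return 2 + x2 * x3


def brute_force_sum():
    """Sum f = fa * fb over all 2^3 assignments of (x1, x2, x3)."""
    return sum(
        fa(x1, x2) * fb(x2, x3)
        for x1, x2, x3 in product((0, 1), repeat=3)
    )


def tree_sum():
    """Same sum, with the sums over x1 and x3 pushed inside the product."""
    total = 0
    for x2 in (0, 1):
        left = sum(fa(x1, x2) for x1 in (0, 1))   # depends only on x2
        right = sum(fb(x2, x3) for x3 in (0, 1))  # depends only on x2
        total += left * right
    return total
```

Both functions return the same value by the distributive law. On a longer chain of pairwise factors the same rearrangement keeps the work proportional to the number of factors, whereas brute force enumerates $2^n$ assignments.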

The first problem is called marginalization and involves evaluating the sum of $f$ over all variables except one. In the case where $f$ is a probability distribution the algorithm
