08.10.2016 Views

Foundations of Data Science


computes the marginal probabilities, hence the term marginalization. The second problem involves computing the assignment to the variables that maximizes the function f. When f is a probability distribution, this problem is the maximum a posteriori probability (MAP) problem.

If the factor graph is a tree, then there exists an efficient algorithm for solving these problems. Note that there are four problems: the function f is either a product or a sum, and we are either marginalizing or finding the maximizing assignment to the variables. All four problems are solved by essentially the same algorithm, and we present the algorithm for the marginalization problem when f is a product. Assume we want to “sum out” all the variables except x_1, leaving a function of x_1.
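
As a reference point for the two problems described above, the following is a minimal brute-force sketch in Python. The three binary variables, the three factor tables, and the names `factors`, `f`, `marginal`, and `maximizing_assignment` are all hypothetical choices made for illustration; they do not come from the text. The sketch enumerates every assignment, so it takes time exponential in the number of variables, which is the cost the tree algorithm avoids.

```python
from itertools import product

# Hypothetical example (not from the text): three binary variables, three factors.
# Each factor is (scope, table): a tuple of variable names and a table keyed by value tuples.
factors = [
    (("x1", "x2"), {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4}),
    (("x1", "x3"), {(0, 0): 2, (0, 1): 1, (1, 0): 1, (1, 1): 3}),
    (("x2",), {(0,): 1, (1,): 2}),
]
variables = ["x1", "x2", "x3"]
domain = (0, 1)


def f(assignment):
    """Product of all factors at a full assignment (dict: name -> value)."""
    result = 1
    for scope, table in factors:
        result *= table[tuple(assignment[v] for v in scope)]
    return result


def marginal(keep):
    """Sum f over every variable except `keep`; returns {value of keep: total}."""
    others = [v for v in variables if v != keep]
    totals = {}
    for value in domain:
        total = 0
        for rest in product(domain, repeat=len(others)):
            assignment = dict(zip(others, rest))
            assignment[keep] = value
            total += f(assignment)
        totals[value] = total
    return totals


def maximizing_assignment():
    """Full assignment that maximizes f, found by exhaustive search."""
    candidates = (
        dict(zip(variables, values))
        for values in product(domain, repeat=len(variables))
    )
    return max(candidates, key=f)
```

With these tables, `marginal("x1")` sums out x2 and x3 and leaves a function of x1 alone, while `maximizing_assignment()` solves the maximization version of the problem for the same f.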

Call the variable node associated with the variable x_i node x_i. First, make the node x_1 the root of the tree. It will be useful to think of the algorithm first as a recursive algorithm and then unravel the recursion. We want to compute the product of all factors occurring in the subtree rooted at the root with all variables except the root variable summed out. Let g_i be the product of all factors occurring in the subtree rooted at node x_i with all variables occurring in the subtree except x_i summed out. Since this is a tree, x_1 will not reoccur anywhere except the root. Now, the grandchildren of the root are variable nodes; suppose, for the recursion, that each grandchild x_i of the root has already computed its g_i. It is easy to see that we can compute g_1 as follows.

Each grandchild x_i of the root passes its g_i to its parent, which is a factor node. Each child of x_1 collects all its children’s g_i, multiplies them together with its own factor, and sends the product to the root. The root multiplies all the products it gets from its children and sums out all variables except its own variable, namely x_1.
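
The recursive step just described might be sketched as follows. The tree, the factor tables, and the names `FACTORS`, `CHILDREN`, and `g` are assumptions made for this illustration. One deliberate simplification: the sketch sums out the grandchildren's variables separately for each child factor rather than after multiplying all the products at the root. On a tree this gives the same result, because the subtrees under different children share no variable other than the parent's.

```python
from itertools import product

DOMAIN = (0, 1)

# Hypothetical factor graph (not from the text): factor name -> (scope, table).
FACTORS = {
    "a": (("x1", "x2"), {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4}),
    "b": (("x1", "x3"), {(0, 0): 2, (0, 1): 1, (1, 0): 1, (1, 1): 3}),
    "c": (("x2",), {(0,): 1, (1,): 2}),
}

# The tree rooted at x1: variable nodes have factor children and vice versa.
CHILDREN = {
    "x1": ["a", "b"],
    "a": ["x2"],
    "b": ["x3"],
    "x2": ["c"],
    "x3": [],
    "c": [],
}


def g(var):
    """Return g_i as {value of var: number}: the product of all factors in the
    subtree under `var`, with every other variable in that subtree summed out."""
    result = {value: 1 for value in DOMAIN}
    for factor in CHILDREN[var]:
        scope, table = FACTORS[factor]
        kids = CHILDREN[factor]
        kid_g = {kid: g(kid) for kid in kids}  # grandchildren, computed recursively
        for value in DOMAIN:
            total = 0
            # Sum over all assignments to the factor's child variables.
            for rest in product(DOMAIN, repeat=len(kids)):
                assignment = dict(zip(kids, rest))
                assignment[var] = value
                term = table[tuple(assignment[v] for v in scope)]
                for kid in kids:
                    term *= kid_g[kid][assignment[kid]]
                total += term
            result[value] *= total
    return result
```

Calling `g("x1")` returns the function of x1 left after summing out x2 and x3. A leaf variable such as x3 has no factors beneath it, so its g is the constant 1.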

Unraveling the recursion is also simple, with the convention that a leaf node just receives 1, the product of an empty set of factors, from its children. Each node waits until it receives a message from each of its children. After that, if the node is a variable node, it computes the product of all incoming messages and sums this product function over all assignments to the variables except for the variable of the node. Then, it sends the resulting function of one variable out along the edge to its parent. If the node is a factor node, it computes the product of its factor function along with incoming messages from all the children and sends the resulting function out along the edge to its parent.
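
The unraveled, message-passing form could look like the sketch below. Functions are represented as a pair (scope, table), and the helper names `multiply`, `sum_out_all_but`, and `marginal_at_root`, along with the example tree and tables, are hypothetical. Processing nodes in reverse depth-first order stands in for "each node waits until it receives a message from each of its children".

```python
from itertools import product

DOMAIN = (0, 1)
ONE = ((), {(): 1})  # the constant 1: product of an empty set of factors

# Hypothetical factor graph (not from the text): factor name -> (scope, table).
FACTORS = {
    "a": (("x1", "x2"), {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4}),
    "b": (("x1", "x3"), {(0, 0): 2, (0, 1): 1, (1, 0): 1, (1, 1): 3}),
    "c": (("x2",), {(0,): 1, (1,): 2}),
}
CHILDREN = {"x1": ["a", "b"], "a": ["x2"], "b": ["x3"],
            "x2": ["c"], "x3": [], "c": []}


def multiply(f1, f2):
    """Pointwise product of two functions given as (scope, table)."""
    (s1, t1), (s2, t2) = f1, f2
    scope = tuple(dict.fromkeys(s1 + s2))  # union of the scopes, order preserved
    table = {}
    for values in product(DOMAIN, repeat=len(scope)):
        a = dict(zip(scope, values))
        table[values] = t1[tuple(a[v] for v in s1)] * t2[tuple(a[v] for v in s2)]
    return scope, table


def sum_out_all_but(func, keep):
    """Sum func over every variable except `keep`; the result has scope (keep,)."""
    scope, table = func
    out = {(value,): 0 for value in DOMAIN}
    for values, number in table.items():
        a = dict(zip(scope, values))
        for value in DOMAIN:
            # If `keep` is absent from the scope, func is constant in `keep`.
            if a.get(keep, value) == value:
                out[(value,)] += number
    return (keep,), out


def marginal_at_root(root):
    """Pass messages from the leaves up to `root`; return the root's table."""
    order, pending = [], [root]
    while pending:  # depth-first walk: every parent is listed before its children
        node = pending.pop()
        order.append(node)
        pending.extend(CHILDREN[node])
    message = {}
    for node in reversed(order):  # so every child is handled before its parent
        incoming = FACTORS[node] if node in FACTORS else ONE
        for child in CHILDREN[node]:
            incoming = multiply(incoming, message[child])
        if node in FACTORS:
            message[node] = incoming  # factor node: send the product as is
        else:
            message[node] = sum_out_all_but(incoming, node)  # variable node
    return message[root][1]
```

Note that in this formulation a factor node's message is a function of all the variables in its factor, and the summation happens only at variable nodes, as in the description above. The returned table is keyed by one-element tuples such as `(0,)` because every function here uses tuple keys.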

The reader should prove that the following invariant holds assuming the graph is a tree:

Invariant: The message passed by each variable node to its parent is the product of all factors in the subtree under the node with all variables in the subtree except its own summed out.
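
Written as a formula, the invariant might read as follows. The notation is an assumption introduced here, not the text's: T_i is the subtree under node x_i, V(T_i) and F(T_i) are the variables and factors in it, S_a is the set of variables of factor f_a, and m_i is the message that node x_i sends to its parent.

```latex
m_i(x_i) \;=\; g_i(x_i) \;=\;
  \sum_{x_j \in V(T_i),\; j \neq i} \;\; \prod_{a \in F(T_i)} f_a\bigl(x_{S_a}\bigr)
```

For a leaf variable node, F(T_i) is empty, so the product is 1 and the message is the constant function 1, in line with the convention for leaves.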
