26.10.2013 Views

Nonparametric Bayesian Discrete Latent Variable Models for ...

Nonparametric Bayesian Discrete Latent Variable Models for ...

Nonparametric Bayesian Discrete Latent Variable Models for ...

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

µ<br />

µ<br />

µ<br />

µ<br />

µ<br />

µ<br />

(1)<br />

(2)<br />

(3)<br />

(4)<br />

(5)<br />

(6)<br />

π(5)<br />

π(6)<br />

π(4)<br />

π(3)<br />

π(2)<br />

4.1 The Indian Buffet Process<br />

Figure 4.4: Stick-breaking construction <strong>for</strong> the DP and IBP. The black line at top is the stick<br />

of unit length. The vertical lines show the break point of the stick. The horizontal<br />

blue lines show the length of the stick left after each iteration, which determines<br />

the values <strong>for</strong> the feature presence probabilities µ (k). The red dotted lines show the<br />

discarded pieces of the stick corresponding to the mixing proportions <strong>for</strong> the DP.<br />

We refer to this representation as the stick-breaking construction due to its correspondence<br />

to the standard stick-breaking construction <strong>for</strong> the DP (Sethuraman, 1994;<br />

Ishwaran and James, 2001).<br />

Let πk be the length of stick piece discarded at iteration k. We have,<br />

Making a change of variables vk = 1 − νk,<br />

k−1<br />

πk = (1 − νk)µ (k−1) = (1 − νk)<br />

vk i.i.d.<br />

k−1<br />

∼ Beta(1, α), πk = vk<br />

l=1<br />

π<br />

(1)<br />

<br />

νl. (4.26)<br />

l=1<br />

<br />

(1 − vl) (4.27)<br />

giving the standard stick-breaking construction <strong>for</strong> DP, eq. (3.16).<br />

In both constructions the weights of interest are given in terms of the stick lengths. In<br />

DP, the weights πk are the lengths of the discarded pieces, while in IBP, the weights µ (k)<br />

are the length of the stick we keep, see Figure 4.4. This difference leads to the different<br />

properties of the weights: <strong>for</strong> DP, the stick lengths sum to a length of 1 and are not<br />

strictly decreasing, but are in a size-biased order. In IBP the stick lengths may sum to<br />

a value greater than 1 but are decreasing. For both, the weights decrease exponentially<br />

quickly in expectation.<br />

The particular ordering of the columns results in some interesting properties of the<br />

distribution on Z. As stated earlier, µ (k) is the prior probability of any of the objects<br />

having feature k. There is a positive probability <strong>for</strong> any data point to have any of<br />

the infinitely many features. However, since the feature presence probabilities have a<br />

strictly decreasing order with exponential rate, in expectation the infinite dimensional<br />

Z matrix has only finitely many non-zero entries.<br />

We can calculate the distribution of the number of non-zero entries to the right of<br />

77

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!