26.10.2013 Views

Nonparametric Bayesian Discrete Latent Variable Models for ...


4 Indian Buffet Process Models

column $k$ in row $i$, given $\mu_{(k)}$. This does not depend on $i$ due to the exchangeability of the rows. The distribution of the unordered feature presence probabilities $\mu_l$ after the $k$th largest one is given by eq. (A.2). The entries $z_{il}$ are Bernoulli distributed with probability $\mu_l$. Marginalizing over $\mu_l$, we have

\[
\begin{aligned}
p(z_{il} \mid \mu_{(k)}) &= \int_0^{\mu_{(k)}} p(z_{il} \mid \mu_l)\, p(\mu_l \mid \mu_{(k)})\, d\mu_l \\
&= \int_0^{\mu_{(k)}} \mu_l \, \frac{\alpha}{K}\, \mu_{(k)}^{-\alpha/K}\, \mu_l^{\alpha/K - 1}\, d\mu_l \\
&= \frac{\alpha}{\alpha + K}\, \mu_{(k)}.
\end{aligned}
\tag{4.28}
\]
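The last equality in eq. (4.28) follows from a single power-rule integration. As a worked step, reading $p(z_{il} \mid \cdot)$ as the probability of a non-zero entry (an interpretation assumed here from the Bernoulli statement above):

\[
\frac{\alpha}{K}\,\mu_{(k)}^{-\alpha/K} \int_0^{\mu_{(k)}} \mu_l^{\alpha/K}\, d\mu_l
= \frac{\alpha}{K}\,\mu_{(k)}^{-\alpha/K}\, \frac{\mu_{(k)}^{\alpha/K + 1}}{\alpha/K + 1}
= \frac{\alpha}{\alpha + K}\,\mu_{(k)}.
\]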

Taking the limit as $K \to \infty$ of the Bernoulli trials with the above probability results in a Poisson($\alpha\mu_{(k)}$) distribution over the number of non-zero entries to the right of column $k$ in the $i$th row. This result is intuitive: the expected number of non-zero entries depends only on the limiting value $\mu_{(k)}$ and the parameter $\alpha$. If we consider the entries of the whole row, i.e. the case $k = 0$, we recover the distribution derived in the previous section for the total number of non-zero entries in a row, Poisson($\alpha$).
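The Poisson limit can be checked numerically. The sketch below compares the binomial distribution of the count with its claimed Poisson($\alpha\mu_{(k)}$) limit. It assumes, for simplicity, exactly $K$ independent Bernoulli trials with the probability from eq. (4.28), since the $k$ excluded columns do not matter as $K \to \infty$. The function names are illustrative and do not come from the source.

```python
import math


def bernoulli_prob(alpha, K, mu_k):
    """Closed form of eq. (4.28): alpha / (alpha + K) * mu_(k)."""
    return alpha / (alpha + K) * mu_k


def binomial_pmf(n, p, m):
    """Probability of m successes in n independent trials with success probability p."""
    return math.comb(n, m) * p ** m * (1.0 - p) ** (n - m)


def poisson_pmf(rate, m):
    """Probability of count m under a Poisson distribution with the given rate."""
    return math.exp(-rate) * rate ** m / math.factorial(m)


def max_pmf_gap(alpha, K, mu_k, max_count=10):
    """Largest absolute pmf difference over counts 0..max_count between
    Binomial(K, p) and Poisson(alpha * mu_k).

    Assumption: K trials are used rather than K - k.
    """
    p = bernoulli_prob(alpha, K, mu_k)
    rate = alpha * mu_k
    return max(
        abs(binomial_pmf(K, p, m) - poisson_pmf(rate, m))
        for m in range(max_count + 1)
    )


if __name__ == "__main__":
    # The gap should shrink as K grows.
    for K in (10, 100, 1000, 10000):
        print(K, max_pmf_gap(alpha=2.0, K=K, mu_k=0.5))
```

The gap shrinks as $K$ increases, which is consistent with the limiting argument in the text. It is an illustration, not a proof.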

We can also calculate the probability of all entries to the right of column k being zero:<br />

\[
p(Z_{(:,>k)} = 0 \mid \mu_{(k)}) = \exp\!\left( -\alpha H_N + \alpha \sum_{i=1}^{N} \frac{(1 - \mu_{(k)})^{i}}{i} \right), \tag{4.29}
\]

where $H_N$ is the $N$th harmonic number. See Appendix A for the details of the derivation.
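Eq. (4.29) is easy to evaluate directly. The following minimal sketch computes it for given $\alpha$, $N$ and $\mu_{(k)}$. The function names are hypothetical, chosen only for this illustration.

```python
import math


def harmonic_number(N):
    """N-th harmonic number H_N = sum_{i=1}^{N} 1/i."""
    return sum(1.0 / i for i in range(1, N + 1))


def prob_all_zero_right_of_k(alpha, N, mu_k):
    """Evaluate eq. (4.29): exp(-alpha * H_N + alpha * sum_i (1 - mu_k)^i / i)."""
    tail = sum((1.0 - mu_k) ** i / i for i in range(1, N + 1))
    return math.exp(-alpha * harmonic_number(N) + alpha * tail)


if __name__ == "__main__":
    for mu_k in (0.0, 0.1, 0.5, 1.0):
        print(mu_k, prob_all_zero_right_of_k(alpha=2.0, N=10, mu_k=mu_k))
```

Two boundary cases serve as sanity checks. For $\mu_{(k)} = 0$ the two sums cancel and the probability is 1. For $\mu_{(k)} = 1$ the second sum vanishes and the probability is $\exp(-\alpha H_N)$.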

4.1.3 A Special Case of the Beta Process<br />

The beta process was defined by Hjort (1990) for use in survival analysis as a cumulative hazard rate process with nonnegative independent increments.² Thibaux and Jordan (2007) show that a special case of the beta process is related to the Indian buffet process (IBP) in the same way that the Dirichlet process is related to the Chinese restaurant process. They define the following generative model for the binary feature matrix $Z$:

\[
\begin{aligned}
A &\sim \mathrm{BP}(c, \alpha A_0), \\
Z &\sim \mathrm{BeP}(A),
\end{aligned}
\tag{4.30}
\]

where $\mathrm{BeP}(A)$ denotes a Bernoulli process with hazard measure $A$, and $\mathrm{BP}(c, \alpha A_0)$ denotes a beta process with base measure $\alpha A_0$ and concentration function $c$. Here $A_0$ is a probability measure, $\alpha$ a positive scalar, and $c$ a deterministic function. The distribution of $Z$ is equivalent to that of the matrix generated by the IBP($\alpha$) when $c$ is the constant function $c = 1$, that is, for $A \sim \mathrm{BP}(1, \alpha A_0)$.
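For reference, the IBP($\alpha$) side of this equivalence can be simulated with the usual sequential (buffet) construction: customer $i$ takes each existing dish $k$ with probability $m_k / i$, where $m_k$ is the number of previous customers who took it, and then tries Poisson($\alpha / i$) new dishes. The sketch below is a minimal standard-library implementation under that assumption. It does not simulate the beta process itself, and the function names are illustrative only.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson(rate) variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(alpha, num_rows, rng):
    """Sample a binary feature matrix Z (list of rows) from IBP(alpha)."""
    counts = []  # counts[k] = number of rows so far with feature k
    rows = []
    for i in range(1, num_rows + 1):
        # Existing features: row i takes feature k with probability m_k / i.
        row = [1 if rng.random() < m / i else 0 for m in counts]
        for k, z in enumerate(row):
            counts[k] += z
        # New features: Poisson(alpha / i) columns owned so far only by row i.
        new = sample_poisson(alpha / i, rng)
        row.extend([1] * new)
        counts.extend([1] * new)
        rows.append(row)
    # Pad earlier rows with zeros so that all rows share the final width.
    width = len(counts)
    return [row + [0] * (width - len(row)) for row in rows]


if __name__ == "__main__":
    for row in sample_ibp(alpha=2.0, num_rows=5, rng=random.Random(0)):
        print(row)
```

Under this construction the total number of columns is Poisson($\alpha H_N$) distributed, so averaging the matrix width over many draws gives a quick check of the sampler.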

Considering the beta process $\mathrm{BP}(c, \alpha A_0)$ with a constant concentration function in the above generative model results in a two-parameter generalization of the Indian buffet

² See Appendix B.3 for the definition of independent increment processes.
