Nonparametric Bayesian Discrete Latent Variable Models for ...
A Details of Derivations for the Stick-Breaking Representation of IBP
The $\mu_l$ for $l \in L_k$ are independent given $\mu_{(1:k)}$; therefore we can obtain the distribution
of $\mu_{(k+1)} = \max_{l \in L_k} \mu_l$ by taking the product of the cdfs of the $\mu_l$:

\begin{align}
F(\mu_{(k+1)} \mid \mu_{(1:k)})
  &= \Big[ \mu_{(k)}^{-\frac{\alpha}{K}} \, \mu_{(k+1)}^{\frac{\alpha}{K}} \, I(0 \le \mu_{(k+1)} \le \mu_{(k)}) + I(\mu_{(k)} < \mu_{(k+1)}) \Big]^{K-k} \notag \\
  &= \mu_{(k)}^{-\alpha\frac{K-k}{K}} \, \mu_{(k+1)}^{\alpha\frac{K-k}{K}} \, I(0 \le \mu_{(k+1)} \le \mu_{(k)}) + I(\mu_{(k)} < \mu_{(k+1)}). \tag{A.4}
\end{align}

Differentiating the above equation, the density of $\mu_{(k+1)}$ is obtained as

\begin{align}
p(\mu_{(k+1)} \mid \mu_{(1:k)})
  &= p(\mu_{(k+1)} \mid \mu_{(k)}) \notag \\
  &= \alpha \frac{K-k}{K} \, \mu_{(k)}^{-\alpha\frac{K-k}{K}} \, \mu_{(k+1)}^{\alpha\frac{K-k}{K} - 1} \, I(0 \le \mu_{(k+1)} \le \mu_{(k)}). \tag{A.5}
\end{align}

A.2 Probability of a part of Z being inactive
Given $\mu_{(k)}$, we can calculate the probability of all entries to the right of column $k$ being
zero. We denote the set of indices after the $k$th largest feature presence probability by
$L_k$. The density of the (unordered) feature presence probabilities with index $l \in L_k$
is given in eq. (A.2). The entries $z_{il}$ are Bernoulli distributed with probability $\mu_l$.
Marginalizing over $\mu_l$, we have

\begin{align}
p(Z_{(:,.>k)} = 0 \mid \mu_{(k)})
  &= \int p(Z_{(:,.>k)} = 0 \mid \mu_{(k)}, \mu_L) \, p(\mu_L) \, d\mu_L \notag \\
  &= \left[ \int_0^{\mu_{(k)}} \frac{\alpha}{K} \, \mu_{(k)}^{-\frac{\alpha}{K}} \, \mu^{\frac{\alpha}{K} - 1} (1 - \mu)^N \, d\mu \right]^{K-k}. \tag{A.6}
\end{align}

Applying the change of variables $\nu = \mu / \mu_{(k)}$ to the above integral,

\begin{align}
\int_0^{\mu_{(k)}} \frac{\alpha}{K} \, \mu_{(k)}^{-\frac{\alpha}{K}} \, \mu^{\frac{\alpha}{K} - 1} (1 - \mu)^N \, d\mu
  &= \int_0^1 \frac{\alpha}{K} \, \mu_{(k)}^{-\frac{\alpha}{K}} \, (\nu \mu_{(k)})^{\frac{\alpha}{K} - 1} (1 - \nu \mu_{(k)})^N \, \mu_{(k)} \, d\nu \notag \\
  &= \frac{\alpha}{K} \int_0^1 \nu^{\frac{\alpha}{K} - 1} (1 - \nu + \nu - \nu \mu_{(k)})^N \, d\nu \notag \\
  &= \frac{\alpha}{K} \int_0^1 \nu^{\frac{\alpha}{K} - 1} \big( (1 - \nu) + \nu (1 - \mu_{(k)}) \big)^N \, d\nu. \tag{A.7}
\end{align}

Using the binomial series,

\begin{align}
  &= \frac{\alpha}{K} \int_0^1 \nu^{\frac{\alpha}{K} - 1} \sum_{i=0}^{N} \binom{N}{i} (1 - \nu)^{N-i} \big( \nu (1 - \mu_{(k)}) \big)^i \, d\nu \notag \\
  &= \frac{\alpha}{K} \sum_{i=0}^{N} \binom{N}{i} (1 - \mu_{(k)})^i \int_0^1 \nu^{i + \frac{\alpha}{K} - 1} (1 - \nu)^{N-i} \, d\nu. \tag{A.8}
\end{align}
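As a numerical sanity check on the derivation above, the following Python sketch evaluates the single-feature integral in two ways: through the binomial sum of eq. (A.8), and through direct midpoint quadrature of the integral in eq. (A.7). Two steps here are assumptions that do not appear in the text above: each remaining integral in (A.8) is evaluated with the standard Beta-function identity $\int_0^1 \nu^{a-1}(1-\nu)^{b-1}\,d\nu = B(a,b)$, and the quadrature uses the extra substitution $t = \nu^{\alpha/K}$ to remove the singularity at $\nu = 0$. The function names and parameter values are illustrative only.

```python
import math


def beta_fn(a, b):
    """Beta function B(a, b), computed through log-gamma for stability."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def single_feature_series(mu_k, alpha, K, N):
    """Binomial sum of eq. (A.8).

    Assumption: the integral over nu in each term is taken to be
    B(alpha/K + i, N - i + 1), the standard Beta-function identity.
    """
    a = alpha / K
    total = 0.0
    for i in range(N + 1):
        total += math.comb(N, i) * (1.0 - mu_k) ** i * beta_fn(a + i, N - i + 1)
    return a * total


def single_feature_quadrature(mu_k, alpha, K, N, steps=100_000):
    """Midpoint rule for the integral in eq. (A.7).

    Assumption: substituting t = nu**(alpha/K) turns
    (alpha/K) * nu**(alpha/K - 1) d(nu) into dt, so the integrand
    becomes (1 - mu_k * t**(K/alpha))**N on [0, 1].
    """
    power = K / alpha
    h = 1.0 / steps
    acc = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        acc += (1.0 - mu_k * t ** power) ** N
    return h * acc


def prob_inactive(mu_k, alpha, K, k, N):
    """Eq. (A.6): the single-feature integral raised to the power K - k."""
    return single_feature_series(mu_k, alpha, K, N) ** (K - k)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    mu_k, alpha, K, k, N = 0.3, 2.0, 20, 5, 10
    print(single_feature_series(mu_k, alpha, K, N))
    print(single_feature_quadrature(mu_k, alpha, K, N))
    print(prob_inactive(mu_k, alpha, K, k, N))
```

If the derivation holds, the two single-feature values should agree up to quadrature error, and raising either one to the power $K-k$ gives the probability in eq. (A.6).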