
4 Indian Buffet Process Models

representation. We can sample the entries of all columns from their conditional posterior,

p(z_{ik} = 1 \mid \mu_{(k)}, Z_{-ik}, X, \Theta, \Phi) \propto \mu_{(k)}\, F(X \mid z_{ik} = 1, Z_{-ik}, \Theta, \Phi). \qquad (4.44)
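
As a concrete illustration of eq. (4.44), the sketch below (not from the thesis) resamples a single entry z_ik; the `log_lik` argument is a hypothetical stand-in for the model likelihood log F(X | Z, Θ, Φ).

```python
import numpy as np

def sample_zik(Z, i, k, mu_k, log_lik, rng):
    """Resample z_ik from its conditional posterior, eq. (4.44).

    `log_lik(Z)` is a hypothetical stand-in returning log F(X | Z, Theta, Phi)
    under the current parameter values.
    """
    Z[i, k] = 1
    log_p1 = np.log(mu_k) + log_lik(Z)       # z_ik = 1 case
    Z[i, k] = 0
    log_p0 = np.log1p(-mu_k) + log_lik(Z)    # z_ik = 0 case
    # Normalise the two unnormalised log probabilities and draw z_ik.
    p1 = 1.0 / (1.0 + np.exp(log_p0 - log_p1))
    Z[i, k] = int(rng.random() < p1)
    return Z
```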

Since the feature presence probabilities are explicitly represented in this construction, they should also be updated. We obtain the posterior for µ_(k) by combining the prior from eq. (4.42) and the likelihood from eq. (4.43),

P(\mu_{(k)} \mid Z, \mu_{-(k)}) \propto \alpha^{M} \mu_{(M)}^{\alpha}\, \mu_{(k)}^{m_k - 1} \bigl(1 - \mu_{(k)}\bigr)^{N - m_k}\, \mathbb{I}\{\mu_{(k+1)} \le \mu_{(k)} \le \mu_{(k-1)}\}. \qquad (4.45)

This density is not of standard form, but it can be shown that the log posterior is log-concave; therefore it can be sampled from efficiently using adaptive rejection sampling (ARS).
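
The thesis uses ARS here; as a hedged alternative (not the author's implementation), the density of eq. (4.45) can also be sampled with a simple univariate slice-sampling step restricted to the truncation interval [µ_(k+1), µ_(k-1)], which is valid for any bounded univariate density. The function names, the `is_last` flag, and the `n_steps` parameter are illustrative assumptions.

```python
import numpy as np

def log_post_mu(mu, m_k, N, alpha, is_last):
    """Unnormalised log density of eq. (4.45) as a function of mu_(k) alone.

    For inner columns the alpha^M mu_(M)^alpha factor is constant in mu_(k)
    and is dropped; for the last represented column it contributes mu^alpha,
    folded into the exponent (an assumption about how eq. (4.42) factorises).
    """
    a = m_k - 1 + (alpha if is_last else 0.0)
    return a * np.log(mu) + (N - m_k) * np.log1p(-mu)

def sample_mu_k(mu_lo, mu_hi, mu_cur, m_k, N, alpha, is_last, rng, n_steps=5):
    """Slice-sampling stand-in for the ARS update, restricted to [mu_lo, mu_hi]."""
    mu = mu_cur
    for _ in range(n_steps):
        # Slice height drawn uniformly under the current density value.
        log_u = log_post_mu(mu, m_k, N, alpha, is_last) + np.log(rng.random())
        lo, hi = mu_lo, mu_hi
        while True:
            prop = rng.uniform(lo, hi)
            if log_post_mu(prop, m_k, N, alpha, is_last) > log_u:
                mu = prop
                break
            # Shrink the bracket towards the current value (Neal, 2003).
            if prop < mu:
                lo = prop
            else:
                hi = prop
    return mu
```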

Algorithm 14 Gibbs sampling for the truncated IBP

The state of the Markov chain consists of the finite feature matrix Z with M columns, the feature presence probabilities µ_(1:M) = µ_(1), ..., µ_(M) corresponding to each feature column, and the set of parameters Θ = {θ_k}_{k=1}^{M}. All variables are represented. Repeatedly sample:

for all rows i = 1, ..., N and columns k = 1, ..., M do {Feature updates}
  Update z_ik by sampling from its conditional posterior, eq. (4.44).
end for
for all columns k = 1, ..., M do {Parameter updates}
  Update θ_k by sampling from its conditional posterior, eq. (4.35).
end for
for all columns k = 1, ..., M do {Update feature presence probabilities}
  Update µ_(k) by sampling from its conditional posterior, eq. (4.45), using ARS.
end for
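
Put together, one sweep of Algorithm 14 might look like the following structural sketch (names are illustrative, not from the thesis; `sample_zik` and `sample_mu_k` follow the sketches above, while `log_lik` and `sample_theta_k` are hypothetical stand-ins for the model likelihood and the conditional update of eq. (4.35)).

```python
def gibbs_sweep(Z, mu, theta, alpha, log_lik, sample_zik, sample_theta_k,
                sample_mu_k, rng):
    """One sweep of Algorithm 14 for the truncated IBP with M columns."""
    N, M = Z.shape
    # Feature updates, eq. (4.44).
    for i in range(N):
        for k in range(M):
            Z = sample_zik(Z, i, k, mu[k], log_lik, rng)
    # Parameter updates, eq. (4.35).
    for k in range(M):
        theta[k] = sample_theta_k(Z, k, rng)
    # Feature presence probabilities, eq. (4.45); the sampling interval
    # enforces the ordering mu_(k+1) <= mu_(k) <= mu_(k-1).
    for k in range(M):
        hi = mu[k - 1] if k > 0 else 1.0
        lo = mu[k + 1] if k < M - 1 else 0.0
        mu[k] = sample_mu_k(lo, hi, mu[k], int(Z[:, k].sum()), N,
                            alpha, k == M - 1, rng)
    return Z, mu, theta
```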

4.2.5 Slice Sampling Using the Stick-Breaking Construction

Inference on the truncated stick-breaking construction using Gibbs sampling is easy to implement and, due to the ordering of the feature presence probabilities, the error introduced by the truncation can be bounded. However, it is possible to avoid approximation altogether and do inference on the complete nonparametric model by using slice sampling (Teh, Görür, and Ghahramani, 2007). This method can be interpreted as adaptively choosing a truncation level at each iteration. See Neal (2003) for an overview of slice sampling.

Slice sampling has been successfully applied to DP mixture models by Walker (2006) (see Section 3.2.3), and the algorithm for IBLF models described below is related to this approach.

In detail, we introduce an auxiliary slice variable,


s \mid Z, \mu_{(1:\infty)} \sim \mathrm{Uniform}[0, \mu^{\ast}], \qquad (4.46)

where \mu^{\ast} = \min\bigl\{1, \min_{k :\, \exists i,\ z_{ik} = 1} \mu_{(k)}\bigr\} is the feature presence probability of the least probable active feature.
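
As an illustration of how the slice variable induces an adaptive truncation level, the sketch below (assumptions: the represented µ_(k) are stored in decreasing order, and `draw_new_stick` is a hypothetical helper returning a new µ_(k+1) ≤ µ_(k) from its conditional distribution for features not yet used by any data point).

```python
import numpy as np

def extend_representation(mu, Z, draw_new_stick, rng):
    """Draw the slice variable of eq. (4.46) and grow the representation.

    `mu` is the list of represented, decreasing feature probabilities and `Z`
    the matching binary feature matrix.  `draw_new_stick(mu_last)` is a
    hypothetical helper for sampling the next stick below mu_last.
    """
    active = [mu[k] for k in range(Z.shape[1]) if Z[:, k].any()]
    mu_star = min(active) if active else 1.0     # smallest active probability
    s = rng.uniform(0.0, mu_star)                # slice variable, eq. (4.46)
    # Adaptive truncation: represent every feature whose presence
    # probability exceeds the slice level s.
    while mu[-1] > s:
        mu.append(draw_new_stick(mu[-1]))
        Z = np.hstack([Z, np.zeros((Z.shape[0], 1), dtype=int)])
    return s, mu, Z
```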
