

the beta process is that a draw from a beta process has the form

$$\sum_{k=1}^{\infty} \mu_{(k)} \, \delta_{\theta_{(k)}}(\cdot),$$

with $\mu_{(k)}$ drawn from the stick-breaking prior for the feature presence probabilities, and $\theta_{(k)}$ independent of $\mu_{(k)}$ and drawn from the base measure $H$. Generalizations of the stick-breaking constructions lead to generalizations of the beta process.
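As an illustration, here is a minimal sketch of a truncated draw from this construction. The truncation level $K$, the function name, and the choice of base measure are ours; the recursion $\mu_{(k)} = \prod_{j \le k} \nu_j$ with $\nu_j \sim \mathrm{Beta}(\alpha, 1)$ is the standard stick-breaking construction for the IBP, under which the weights decrease monotonically.

```python
import numpy as np

def beta_process_stick_breaking(alpha, base_measure, K, seed=None):
    """Truncated stick-breaking draw from a beta process.

    Returns weights mu and atoms theta such that the (truncated) draw is
    sum_k mu[k] * delta_{theta[k]}.  Each stick ratio nu_j ~ Beta(alpha, 1)
    and mu[k] = prod_{j<=k} nu_j, so the feature presence probabilities
    decrease monotonically towards zero.
    """
    rng = np.random.default_rng(seed)
    nu = rng.beta(alpha, 1.0, size=K)        # stick-breaking ratios
    mu = np.cumprod(nu)                      # decreasing feature probabilities
    theta = np.array([base_measure(rng) for _ in range(K)])  # i.i.d. from H
    return mu, theta

# Example: alpha = 2, standard normal base measure H, truncation level K = 20.
mu, theta = beta_process_stick_breaking(2.0, lambda rng: rng.normal(), K=20, seed=0)
```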

The IBP parameter $\alpha$ affects the number of active features; updating this parameter therefore gives the model more flexibility. The likelihood for $\alpha$ can be derived from the joint distribution of the features, given in Equation 34 of Griffiths and Ghahramani (2005), to be

$$p(Z \mid \alpha) \propto \alpha^{K^{\ddagger}} \exp\left(-\alpha H_N\right),$$

where $K^{\ddagger}$ is the number of active components, $N$ is the number of rows, and $H_N = \sum_{n=1}^{N} 1/n$ is the $N$th harmonic number. We can

put a gamma prior on $\alpha$,

$$\alpha \sim \mathcal{G}(1, 1).$$

Combining this likelihood with the prior, we get the posterior distribution for $\alpha$,

$$p(\alpha \mid Z) = \mathcal{G}\left(1 + K^{\ddagger},\; 1 + H_N\right). \tag{4.33}$$

Since this posterior is of standard form, it can be sampled from easily.
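A minimal sketch of this update, assuming $Z$ is stored as a dense 0/1 NumPy array (the function name is ours):

```python
import numpy as np

def resample_alpha(Z, a=1.0, b=1.0, seed=None):
    """Gibbs update for alpha via Equation (4.33).

    Z is the N x K binary feature matrix.  With a G(a, b) prior on alpha
    (a = b = 1 in the text), the posterior is G(a + K_active, b + H_N),
    where K_active counts the nonzero columns of Z and H_N is the N-th
    harmonic number.
    """
    rng = np.random.default_rng(seed)
    N = Z.shape[0]
    K_active = int(np.count_nonzero(Z.sum(axis=0)))   # K^{ddagger}
    H_N = np.sum(1.0 / np.arange(1, N + 1))           # harmonic number
    # NumPy's gamma takes shape and *scale*; the rate 1 + H_N becomes 1/(1 + H_N).
    return rng.gamma(shape=a + K_active, scale=1.0 / (b + H_N))
```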

Note that the construction of the IBP and related distributions described above concerns the prior for $Z$ only. When $Z$ is used in an IBLF model, we update it by sampling its entries from their full conditional distributions, which involve conditioning on the data; a sketch of a single-entry update is given below. In the following section, we describe methods for inference in IBLF models.
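To preview the shape of such an update, here is a minimal sketch for one entry of an active column. The conditional prior $p(z_{ik} = 1 \mid \mathbf{z}_{-i,k}) = m_{-i,k}/N$ follows from exchangeability of the IBP; the function name and the generic `log_lik` argument are placeholders of ours.

```python
import numpy as np

def gibbs_update_entry(Z, i, k, log_lik, seed=None):
    """Resample the single entry z[i, k] from its full conditional.

    For an active column k, the IBP gives the conditional prior
    p(z_ik = 1 | z_-ik) = m / N, where m is the number of *other* rows
    possessing feature k; this is combined with the data likelihood under
    both settings of z_ik.  (Columns used only by row i, and proposals of
    brand-new features, require separate moves not shown here.)
    """
    rng = np.random.default_rng(seed)
    N = Z.shape[0]
    m = Z[:, k].sum() - Z[i, k]               # other rows using feature k
    log_post = np.empty(2)
    for v in (0, 1):
        Z[i, k] = v
        prior = m / N if v == 1 else 1.0 - m / N
        log_post[v] = np.log(prior) + log_lik(Z)
    p1 = 1.0 / (1.0 + np.exp(log_post[0] - log_post[1]))  # normalize in log space
    Z[i, k] = int(rng.random() < p1)
    return Z
```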

4.2 MCMC Sampling algorithms for IBLF models

The previous section summarized several different approaches for defining the distribution induced by the Indian buffet process, which has strong connections to the Dirichlet process. In this section, we describe MCMC methods for inference in latent feature models that use the IBP as the nonparametric prior over the feature matrix; we refer to these as infinite binary latent feature (IBLF) models.

The general form of the models we consider assumes that the data generating process lives in a latent space of possibly infinite dimension. Each data point $x_i$ is described by an infinite dimensional binary latent vector $z_i = z_{i,1:\infty}$ that encodes the presence or absence of the infinitely many features characterized by the parameters $\Theta = \theta_{1:\infty}$, and possibly by other parameters $\Phi$ that live in the observable space. The states of the binary latent variables determine which features are actively used, that is, the effective latent dimensionality. Equivalently, we refer to the vector $z_i$ as the binary latent feature vector, and to $\Theta$ as the set of feature parameters. We put an IBP prior on the feature presence matrix $Z$, which has the vectors $z_i$ as its rows. The parameters $\Theta$ and $\Phi$ are given standard parametric priors. The data distribution is assumed not to depend on the features that do not belong to any of the data points, that is, on the zero columns of $Z$ and the parameters $\theta_k$ associated with those columns. The model can be written generically as follows.
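In our notation, with $F$ standing for the as-yet-unspecified data distribution and $p(\theta)$, $p(\Phi)$ for the parametric priors:

$$Z \sim \mathrm{IBP}(\alpha), \qquad \theta_k \overset{\text{iid}}{\sim} p(\theta), \qquad \Phi \sim p(\Phi), \qquad x_i \mid z_i, \Theta, \Phi \sim F(\,\cdot \mid z_i, \Theta, \Phi),$$

where $F$ depends on $z_i$ only through its nonzero entries.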

