Nonparametric Bayesian Discrete Latent Variable Models for ...

3.2 MCMC Inference in Dirichlet Process Mixture Models

variables. This might result in faster mixing chains compared to the Pólya urn samplers in some cases. The conditional for c_i is

c_i | x_i, π_1, . . . , π_M, θ_1, . . . , θ_M ∼ ∑_{k=1}^{M} π_{ki} δ_k(·)   (3.44)

where the mixing proportions of the posterior are given by π_{ki} ∝ π_k F(x_i | θ_k).

Algorithm 8 Gibbs sampling for truncated DP

The state of the Markov chain consists of the component parameters θ_1, . . . , θ_M, mixing proportions π_1, . . . , π_M and indicator variables c_1, . . . , c_N.
Repeatedly sample:
for all i = 1, . . . , N do {indicator updates}
    Assign c_i to one of the M components with probability ∝ π_k F(x_i | θ_k)
end for
for all k = 1, . . . , M do {parameter updates}
    Update θ_k by sampling from its posterior ∝ G_0(θ_k) ∏_{i: c_i = k} F(x_i | θ_k)
end for
for all k = 1, . . . , M do {mixing proportion updates}
    Sample the posterior breaking points v*_k using eq. (3.43)
    Set π_k = v*_k ∏_{l=1}^{k−1} (1 − v*_l)
end for

This algorithm does not require conjugacy, as we do not need to integrate over the parameters for indicator variable updates. Gibbs sampling is easy to implement for the truncated Dirichlet process using the stick-breaking construction. However, it is desirable to avoid approximations and sample from the exact posterior distribution. In the following sections, we describe methods by Papaspiliopoulos and Roberts (2005) and Walker (2006) that use the stick-breaking representation without truncating the process.
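One sweep of Algorithm 8 can be sketched as follows. This is a minimal illustration, assuming a mixture of unit-variance Gaussians with a N(0, τ²) base measure G_0 (conjugate choices made only for brevity; the algorithm itself does not require conjugacy, as noted above). The function name and likelihood are my own illustrative choices, not from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_sweep(x, c, theta, pi, alpha, tau2=1.0, sigma2=1.0):
    """One sweep of blocked Gibbs sampling for a truncated DP mixture
    of unit-variance Gaussians with base measure G0 = N(0, tau2).
    (Illustrative likelihood; the sampler is likelihood-agnostic.)"""
    M, N = len(theta), len(x)

    # Indicator updates: c_i ~ Categorical with probabilities ∝ pi_k F(x_i | theta_k)
    for i in range(N):
        logp = np.log(pi) - 0.5 * (x[i] - theta) ** 2 / sigma2
        p = np.exp(logp - logp.max())          # stabilize before normalizing
        c[i] = rng.choice(M, p=p / p.sum())

    # Parameter updates: theta_k from posterior ∝ G0(theta_k) ∏_{i: c_i=k} F(x_i | theta_k)
    for k in range(M):
        xk = x[c == k]
        prec = 1.0 / tau2 + len(xk) / sigma2   # posterior precision (conjugate normal)
        mean = (xk.sum() / sigma2) / prec
        theta[k] = rng.normal(mean, np.sqrt(1.0 / prec))

    # Mixing proportion updates: posterior stick breaks v*_k ~ Beta(1 + n_k, alpha + n_{>k}),
    # then pi_k = v*_k ∏_{l<k} (1 - v*_l)
    n = np.bincount(c, minlength=M)
    n_gt = np.concatenate([np.cumsum(n[::-1])[-2::-1], [0]])  # counts in later components
    v = rng.beta(1 + n, alpha + n_gt)
    v[-1] = 1.0                                # truncation: last stick takes the remainder
    pi[:] = v * np.concatenate([[1.0], np.cumprod(1 - v[:-1])])
    return c, theta, pi
```

Setting the last break to 1 is what makes the process truncated: the final component absorbs all remaining stick mass, so the π_k sum to one.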

Retrospective Sampling

Retrospective sampling of Papaspiliopoulos and Roberts (2005) suggests allocating components as needed instead of using truncation. Since the stick lengths sum up to 1, prior assignment of the indicator variables is straightforward by starting with only a few breaks of the stick (components) and breaking the stick more as a point in the unbroken part is sampled, see Figure 3.8. The posterior distribution of the indicator variables is given by the combination of the mixing proportions and the likelihood. In this representation, the clusters are not exchangeable since the mixing proportions differ. Therefore, we would need to sum over the infinitely many cases to obtain the normalizing constant for the posterior. Retrospective sampling provides a way to avoid evaluating this infinite sum by using a Metropolis-Hastings step.
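The "allocate components as needed" idea for the prior can be sketched as lazy stick-breaking: draw a uniform point on the stick and only break off new pieces when the point lands in the unbroken part. This sketch covers only the prior assignment step; the full retrospective sampler adds the Metropolis-Hastings correction for the likelihood term described above. The function name is my own.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_indicator_prior(v, alpha):
    """Draw one indicator from the stick-breaking prior, extending the
    list of stick breaks v (in place) only when the uniform draw lands
    in the part of the stick that has not been broken yet."""
    u = rng.uniform()            # a point on the unit-length stick
    k, acc, remaining = 0, 0.0, 1.0
    while True:
        if k == len(v):          # u fell in the unbroken part: break more stick
            v.append(rng.beta(1.0, alpha))
        acc += remaining * v[k]  # pi_k = v_k * prod_{l<k} (1 - v_l)
        if u < acc:
            return k
        remaining *= 1.0 - v[k]
        k += 1
```

Because the stick lengths sum to 1, the loop terminates with probability one, and only finitely many breaks are ever instantiated for any finite number of draws.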

