
Nonparametric Bayesian Discrete Latent Variable Models for ...


3.2 MCMC Inference in Dirichlet Process Mixture Models

Algorithm 5 Gibbs sampling for non-conjugate DPM models using auxiliary components

The state of the Markov chain consists of the indicator variables c = {c_1, . . . , c_N} and the parameters of the active components Φ = {φ_1, . . . , φ_{K‡}}.

Repeatedly sample:
  for all i = 1, . . . , N do {indicator updates}
    if c_i is a singleton then
      Assign φ_{c_i} to be the parameter of one of the auxiliary components
      Draw values from G_0 for the rest of the auxiliary parameters
    else
      Draw values from G_0 for all the ζ auxiliary parameters
    end if
    Update c_i using eq. (3.34)
    Discard the inactive components
  end for
  for all k = 1, . . . , K‡ do {parameter updates}
    Update φ_k by sampling from its posterior, eq. (3.31)
  end for
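As a concrete illustration, the indicator sweep above can be sketched in Python for one-dimensional data. Everything model-specific here is an assumption made for the example: the component likelihood F is taken to be a unit-variance Gaussian, the base distribution G_0 is N(0, 3²), and the conditional probabilities of eq. (3.34) are represented by the usual auxiliary-component form (component size times likelihood for active components, α/m times likelihood for each of the m auxiliary ones).

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x, phi):
    """Component likelihood F(x | phi): unit-variance Gaussian (an assumption)."""
    return np.exp(-0.5 * (x - phi) ** 2) / np.sqrt(2.0 * np.pi)

def sample_g0():
    """Draw a component parameter from the base distribution G0 (here N(0, 3^2))."""
    return rng.normal(0.0, 3.0)

def indicator_sweep(x, c, phi, alpha, m=3):
    """One pass of the indicator updates: for each point, set up m auxiliary
    components, resample c[i] among active + auxiliary components, and
    discard components that end up empty."""
    for i in range(len(x)):
        # n_{-i,k}: component sizes with point i removed.
        counts = {k: 0 for k in phi}
        for j, cj in enumerate(c):
            if j != i:
                counts[cj] += 1
        # Auxiliary parameters; a singleton keeps its parameter as one of them.
        aux = [sample_g0() for _ in range(m)]
        if counts[c[i]] == 0:
            aux[0] = phi[c[i]]
        active = [k for k in phi if counts[k] > 0]
        # Conditional probabilities for c[i], in the spirit of eq. (3.34).
        w = [counts[k] * f(x[i], phi[k]) for k in active]
        w += [(alpha / m) * f(x[i], a) for a in aux]
        w = np.asarray(w)
        pick = rng.choice(len(w), p=w / w.sum())
        if pick < len(active):
            c[i] = active[pick]
        else:
            new_id = max(phi) + 1
            phi[new_id] = aux[pick - len(active)]
            c[i] = new_id
        # Discard the inactive components.
        for k in [k for k in phi if k not in set(c)]:
            del phi[k]
    return c, phi
```

The parameter updates of the second loop (sampling each φ_k from its posterior, eq. (3.31)) are model-specific and omitted from this sketch.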

Metropolis-Hastings Updates

Neal (2000) proposes to combine Metropolis-Hastings updates with partial Gibbs updates. We can use a Metropolis-Hastings algorithm with the conditional priors as the proposal distribution, so that the acceptance probability reduces to a ratio of likelihoods. That is, we propose c_i to be equal to one of the existing components k with probability n_{-i,k}/(N − 1 + α), and propose to create a singleton with probability α/(N − 1 + α). The proposal is then accepted or rejected according to the ratio of the likelihoods.
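A minimal sketch of this update for a single indicator, assuming a generic likelihood f and base-distribution sampler (all names here are illustrative placeholders, not from the thesis):

```python
import numpy as np

rng = np.random.default_rng(1)

def mh_indicator_update(x_i, c_i, counts, phi, alpha, f, sample_g0):
    """Propose a new value for one indicator from the conditional prior:
    an existing component k with probability n_{-i,k}/(N - 1 + alpha), a
    fresh singleton with probability alpha/(N - 1 + alpha).  Because the
    proposal is the conditional prior, the acceptance probability is just
    the ratio of likelihoods."""
    ks = list(counts)
    w = np.array([counts[k] for k in ks] + [alpha], dtype=float)
    pick = rng.choice(len(w), p=w / w.sum())
    if pick < len(ks):                       # propose joining component ks[pick]
        c_star, phi_star = ks[pick], phi[ks[pick]]
    else:                                    # propose a brand-new singleton
        c_star, phi_star = max(phi) + 1, sample_g0()
    accept = min(1.0, f(x_i, phi_star) / f(x_i, phi[c_i]))
    if rng.random() < accept:
        return c_star, phi_star
    return c_i, phi[c_i]
```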

The probability of proposing to create a new component would be very low if α is small relative to the number of data points N. Neal (2000) therefore changes the proposal distribution so as to increase the probability of proposing to form a new component. To facilitate mixing, he adds partial Gibbs sampling steps for the members of the non-singleton components, in which only moves of c_i to one of the existing components are considered. In detail, if data point i belongs to a singleton, it is proposed to be assigned to one of the existing components k with probability n_{-i,k}/(N − 1). The acceptance ratio of this proposal is

    min{ 1, ((N − 1)/α) · F(x_i | φ_{c_i*}) / F(x_i | φ_{c_i}) }.    (3.35)

And, whenever c_i is not a singleton, changing it to a newly created component will be proposed with probability 1, and a parameter for this component will be sampled from G_0.
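To make the singleton-to-existing acceptance probability concrete, here is a quick numeric check; every number is made up for illustration, and the component likelihood F is again taken to be a unit-variance Gaussian:

```python
import math

def f(x, phi):
    # Unit-variance Gaussian component likelihood (illustrative choice).
    return math.exp(-0.5 * (x - phi) ** 2) / math.sqrt(2.0 * math.pi)

# Hypothetical numbers: N data points, concentration alpha, a singleton
# point x_i with its own parameter phi_cur, and an existing component
# with parameter phi_prop that fits x_i better.
N, alpha = 100, 1.0
x_i, phi_cur, phi_prop = 0.3, 2.0, 0.0

ratio = (N - 1) / alpha * f(x_i, phi_prop) / f(x_i, phi_cur)
accept = min(1.0, ratio)
print(accept)  # prints 1.0
```

The large factor (N − 1)/α means such a move out of a singleton is almost always accepted when the target component explains x_i at least comparably well.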
