
3 Dirichlet Process Mixture Models

We use Gibbs sampling, updating each variable in turn from its posterior distribution conditional on all other variables. As a general summary, we iterate as follows (see the sketch after the list):

– Update the mixture parameters (means and precisions)

– Update the hyperparameters

– Update the indicators, conditional on the other indicators and the (hyper)parameters

– Update the DP concentration parameter α
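For orientation, here is a structural sketch of one such sweep in Python. The `sample_*` helpers and the `state` container are hypothetical placeholders for the conditional updates discussed below, not an actual implementation.

```python
import numpy as np

def gibbs_sweep(data, state, rng):
    """One sweep of the Gibbs sampler for a DPM model (structural sketch).

    `state` is assumed to carry the indicators c, per-component means mu
    and precisions S, the hyperparameters, and the DP concentration alpha.
    Every sample_* helper is a hypothetical placeholder for the
    corresponding conditional update described in the text.
    """
    # Update the mixture parameters (mean and precision) of each active
    # component from their conditional posterior given the data assigned to it.
    for j in np.unique(state.c):
        idx = state.c == j
        state.mu[j], state.S[j] = sample_mean_precision(
            data[idx], state.hypers, rng)

    # Update the hyperparameters given the current component parameters.
    state.hypers = sample_hyperparams(state.mu, state.S, rng)

    # Update each indicator conditional on the other indicators and the
    # (hyper)parameters, via eq. (3.64); components left empty are removed,
    # and new components may be created.
    for i in range(len(data)):
        state.c[i] = sample_indicator(i, data, state, rng)

    # Update the DP concentration parameter alpha, e.g. by ARS or by
    # slice sampling on log(alpha) (see the sketch further below).
    state.alpha = sample_concentration(state.c, state.alpha, rng)
    return state
```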

For the models we consider, the conditional posteriors of all parameters and hyperparameters except for α, β and the indicator variables c_i are of standard form, and thus can be sampled from easily. The conditional posteriors of log(α) and log(β) are both log-concave, so they can be updated using Adaptive Rejection Sampling (ARS) (Gilks and Wild, 1992), as suggested in (Rasmussen, 2000). We have described several algorithms for updating the indicator variables of the DPM models in Section 3.2. For the conjugate case, we use Algorithm 3, described in Section 3.2.1, which makes full use of conjugacy to integrate out the component parameters, leaving only the indicator variables as the state of the Markov chain. For the conditionally conjugate case, we use Algorithm 5, described in Section 3.2.2, which uses only one auxiliary variable.
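Since a full ARS implementation is lengthy, the sketch below uses slice sampling on log(α) instead, which is an equally valid update for a log-concave conditional; this is an illustrative substitute for the ARS update used in the text. The density follows the form p(α | k, n) ∝ α^(k−3/2) exp(−1/(2α)) Γ(α)/Γ(n+α) given in (Rasmussen, 2000), where k is the number of active components and n the number of observations.

```python
import numpy as np
from scipy.special import gammaln

def log_cond_log_alpha(t, k, n):
    """Log conditional density of t = log(alpha), up to a constant.

    Based on p(alpha | k, n) from (Rasmussen, 2000), including the
    Jacobian factor alpha for the change of variables to log(alpha).
    """
    a = np.exp(t)
    return (k - 0.5) * t - 1.0 / (2.0 * a) + gammaln(a) - gammaln(n + a)

def slice_sample_log_alpha(t0, k, n, rng, w=1.0, max_steps=50):
    """One slice-sampling update of log(alpha); a simple stand-in for ARS."""
    # Draw the slice level under the density at the current point.
    logy = log_cond_log_alpha(t0, k, n) + np.log(rng.random())
    # Randomly position an initial interval of width w around t0.
    lo = t0 - w * rng.random()
    hi = lo + w
    # Step out until both ends fall outside the slice.
    while log_cond_log_alpha(lo, k, n) > logy:
        lo -= w
    while log_cond_log_alpha(hi, k, n) > logy:
        hi += w
    # Sample uniformly, shrinking the interval on rejection.
    for _ in range(max_steps):
        t1 = rng.uniform(lo, hi)
        if log_cond_log_alpha(t1, k, n) > logy:
            return t1
        if t1 < t0:
            lo = t1
        else:
            hi = t1
    return t0  # keep the current value if no point was accepted

# Example: alpha = np.exp(slice_sample_log_alpha(np.log(alpha), k, n,
#                                                np.random.default_rng(0)))
```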

For components that currently have observations associated with them, the likelihood is evaluated at that component's parameters; for currently inactive classes (which have no mixture parameters associated with them), the likelihood is obtained by integrating over the prior distribution. The conditional posterior class probabilities are then calculated by multiplying the likelihood term by the prior.

The conditional posterior class probabilities for the DPMoG are:

components for which $n_{-i,j} > 0$:
$$p(c_i = j \mid \mathbf{c}_{-i}, \mu_j, S_j, \alpha) \propto p(c_i = j \mid \mathbf{c}_{-i}, \alpha)\, p(x_i \mid \mu_j, S_j) \propto \frac{n_{-i,j}}{n - 1 + \alpha}\, \mathcal{N}(x_i \mid \mu_j, S_j), \tag{3.64a}$$

all others combined:
$$p(c_i \neq c_{i'} \text{ for all } i' \neq i \mid \mathbf{c}_{-i}, \xi, \rho, \beta, W, \alpha) \propto \frac{\alpha}{n - 1 + \alpha} \int p(x_i \mid \mu, S)\, p(\mu, S \mid \xi, \rho, \beta, W)\, \mathrm{d}\mu\, \mathrm{d}S. \tag{3.64b}$$
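As a concrete illustration of eq. (3.64), the sketch below computes the unnormalized class probabilities for a one-dimensional Gaussian DPM. For tractability it assumes a conjugate Normal-Gamma prior in the standard (m0, kappa0, a0, b0) parameterization rather than the (ξ, ρ, β, W) notation of the text, under which the integral in eq. (3.64b) is a Student-t density; the function itself is a hypothetical helper, not part of any library.

```python
import numpy as np
from scipy.stats import norm, t as student_t

def class_probabilities(x_i, c_minus_i, mu, S, alpha, m0, kappa0, a0, b0):
    """Unnormalized conditional posterior class probabilities, eq. (3.64).

    mu and S map active component labels to their means and precisions
    (S is a precision, so the Gaussian scale is 1/sqrt(S)). The assumed
    conjugate prior is Normal-Gamma: lambda ~ Gamma(a0, rate=b0) and
    mu | lambda ~ N(m0, 1/(kappa0 * lambda)).
    """
    n = len(c_minus_i) + 1
    labels, counts = np.unique(c_minus_i, return_counts=True)

    probs = {}
    # Active components, eq. (3.64a): CRP weight times N(x_i | mu_j, S_j).
    for j, n_ij in zip(labels, counts):
        lik = norm.pdf(x_i, loc=mu[j], scale=1.0 / np.sqrt(S[j]))
        probs[j] = n_ij / (n - 1 + alpha) * lik

    # All inactive classes combined, eq. (3.64b): CRP weight times the
    # prior predictive, which for the Normal-Gamma prior is Student-t.
    scale = np.sqrt(b0 * (kappa0 + 1) / (a0 * kappa0))
    prior_pred = student_t.pdf(x_i, df=2 * a0, loc=m0, scale=scale)
    probs["new"] = alpha / (n - 1 + alpha) * prior_pred

    return probs  # normalize before sampling the new indicator
```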

We can evaluate the above integral to obtain the conditional posterior of the inactive classes in the conjugate case, but it is analytically intractable in the non-conjugate case.

Conjugate Model

When the priors are conjugate, the integral in eq. (3.64b) is analytically tractable. In fact, even for the active classes, we can marginalize out the component parameters using an integral over their posterior, by analogy with the inactive classes. Thus, in all cases the conditional class probabilities can be evaluated analytically, with the component parameters integrated out.
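Continuing the hypothetical one-dimensional Normal-Gamma sketch from above: for an active class, marginalizing the parameters over their posterior replaces the Gaussian term in eq. (3.64a) with a Student-t posterior predictive, computed from the sufficient statistics of the observations currently assigned to that component.

```python
import numpy as np
from scipy.stats import t as student_t

def posterior_predictive(x_i, x_j, m0, kappa0, a0, b0):
    """Student-t predictive density p(x_i | x_j) for an active class,
    with the component mean and precision integrated out over their
    Normal-Gamma posterior (same illustrative prior as above).

    x_j is a NumPy array of the observations currently assigned to
    component j, excluding x_i.
    """
    n_j = len(x_j)
    xbar = np.mean(x_j)
    # Standard Normal-Gamma posterior updates from the sufficient statistics.
    kappa_n = kappa0 + n_j
    m_n = (kappa0 * m0 + n_j * xbar) / kappa_n
    a_n = a0 + n_j / 2.0
    b_n = (b0 + 0.5 * np.sum((x_j - xbar) ** 2)
           + kappa0 * n_j * (xbar - m0) ** 2 / (2.0 * kappa_n))
    # The posterior predictive is again Student-t.
    scale = np.sqrt(b_n * (kappa_n + 1) / (a_n * kappa_n))
    return student_t.pdf(x_i, df=2 * a_n, loc=m_n, scale=scale)
```

With this in place, both the active and the inactive terms of eq. (3.64) are Student-t evaluations, so only the indicator variables need to be kept as the state of the Markov chain, as in Algorithm 3.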

