
the log likelihood term is:

\[
\begin{aligned}
\log p(x_i \mid c_{-i}, \rho, \xi, \beta, W) = {}& -\frac{D}{2}\log\pi + \frac{D}{2}\log\frac{\rho + n_j}{\rho + n_j + 1} \\
& + \log\Gamma\!\Big(\frac{\beta + n_j + 1}{2}\Big) - \log\Gamma\!\Big(\frac{\beta + n_j + 1 - D}{2}\Big) \\
& + \frac{\beta + n_j}{2}\log|W^*| - \frac{\beta + n_j + 1}{2}\log\Big|W^* + \frac{\rho + n_j}{\rho + n_j + 1}(x_i - \xi^*)(x_i - \xi^*)^T\Big|,
\end{aligned}
\tag{3.65}
\]

where
\[
\xi^* = \Big(\rho\xi + \sum_{l: c_l = j} x_l\Big)\Big/(\rho + n_j)
\]
and
\[
W^* = \beta W + \rho\,\xi\xi^T + \sum_{l: c_l = j} x_l x_l^T - (\rho + n_j)\,\xi^*\xi^{*T},
\]
which simplifies considerably for the inactive classes.
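Equation (3.65) is straightforward to evaluate numerically. Below is a minimal Python sketch assuming NumPy and SciPy; the function name and argument layout are illustrative rather than taken from the thesis. For an inactive class ($n_j = 0$) the sums vanish, so $\xi^* = \xi$ and $W^* = \beta W$, which is the simplification noted above.

```python
import numpy as np
from scipy.special import gammaln

def log_pred_likelihood(x_i, x_members, rho, xi, beta, W):
    """Log predictive likelihood of x_i under class j (Equation 3.65).

    x_members: (n_j, D) array of the observations currently assigned to
    class j; an empty array corresponds to an inactive class.
    """
    D = x_i.shape[0]
    n_j = x_members.shape[0]
    # Posterior mean parameter xi* and scatter matrix W*.
    xi_star = (rho * xi + x_members.sum(axis=0)) / (rho + n_j)
    W_star = (beta * W + rho * np.outer(xi, xi) + x_members.T @ x_members
              - (rho + n_j) * np.outer(xi_star, xi_star))
    r = (rho + n_j) / (rho + n_j + 1.0)
    d = x_i - xi_star
    return (-0.5 * D * np.log(np.pi)
            + 0.5 * D * np.log(r)
            + gammaln(0.5 * (beta + n_j + 1))
            - gammaln(0.5 * (beta + n_j + 1 - D))
            + 0.5 * (beta + n_j) * np.linalg.slogdet(W_star)[1]
            - 0.5 * (beta + n_j + 1)
              * np.linalg.slogdet(W_star + r * np.outer(d, d))[1])
```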

The sampling iterations become (a schematic sketch of one sweep is given after the list):

– Gibbs sample $\mu_j$ and $S_j$ conditional on the data, the indicators and the hyperparameters
– Update the hyperparameters conditional on $\mu_j$ and $S_j$
– Remove the parameters $\mu_j$ and $S_j$ from the representation
– Gibbs sample each indicator variable, conditional on the data, the other indicators and the hyperparameters
– Sample the DP concentration $\alpha$ using ARS
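The sweep below is a schematic Python sketch of this iteration order. The `state` container and every helper function are hypothetical placeholders for the updates listed above, not code from the thesis.

```python
# One sweep of the conjugate-model sampler; all helpers are hypothetical.
def conjugate_sweep(state, X):
    # 1. Gibbs sample mu_j and S_j given data, indicators, hyperparameters.
    for j in state.active_classes():
        state.mu[j], state.S[j] = sample_component_params(X, state, j)

    # 2. Update the hyperparameters given mu_j and S_j.
    state.hypers = update_hyperparams(state.mu, state.S)

    # 3. Remove mu_j and S_j from the representation; the indicator
    #    updates below use the collapsed predictive of Equation (3.65).
    state.drop_component_params()

    # 4. Gibbs sample each indicator c_i given the data, the other
    #    indicators and the hyperparameters.
    for i in range(len(X)):
        state.c[i] = gibbs_sample_indicator(X, state, i)

    # 5. Sample the DP concentration alpha by adaptive rejection sampling.
    state.alpha = sample_alpha_ars(state.alpha, state.c)
    return state
```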

Conditionally Conjugate Model<br />

As a consequence of not using fully conjugate priors, the posterior conditional class probabilities for the inactive classes cannot be computed analytically. Here, we give details for using the auxiliary variable sampling scheme of Neal (2000), summarized in Section 3.2.2 as Algorithm 5, and also show how to improve this algorithm by making use of the conditional conjugacy.

SampleBoth<br />

For each observation $x_i$ in turn, the updates are performed as follows: we "invent" auxiliary classes by picking means $\mu_j$ and precisions $S_j$ from their priors, update $c_i$ using Gibbs sampling (i.e., sample from the discrete conditional posterior class distribution), and finally remove the components that are no longer associated with any observations. This is the algorithm of Neal (2000). Here, to emphasize the difference from the other sampling schemes that we will describe, we call it the SampleBoth scheme, since both the means and the precisions are sampled from their priors to represent the inactive components.
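As a rough illustration, the per-observation update might be organized as in the following Python sketch. The `state` container, its methods, and the prior-draw helpers are all hypothetical names, and the bookkeeping for singleton classes required by the full auxiliary variable algorithm is omitted.

```python
import numpy as np

# SampleBoth update for a single observation x_i; m auxiliary classes.
def sample_both_update(state, X, i, m=3):
    # "Invent" m auxiliary classes with means and precisions drawn
    # from their priors.
    for _ in range(m):
        mu = draw_mean_from_prior(state.hypers)
        S = draw_precision_from_prior(state.hypers)
        state.add_empty_component(mu, S)

    # Gibbs sample c_i from its discrete conditional posterior:
    # prior cluster weight times Gaussian likelihood N(x_i | mu_j, S_j^{-1}).
    classes = state.components()
    log_p = np.array([state.log_prior_weight(j, i) + state.log_gauss(X[i], j)
                      for j in classes])
    p = np.exp(log_p - log_p.max())
    state.c[i] = np.random.choice(classes, p=p / p.sum())

    # Remove components no longer associated with any observation.
    state.prune_empty_components()
```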

The sampling iterations are as follows:

– Gibbs sample $\mu_j$ and $S_j$ conditional on the data, the indicators and the hyperparameters
– Update the hyperparameters conditional on $\mu_j$ and $S_j$

