26.10.2013 Views

Nonparametric Bayesian Discrete Latent Variable Models for ...

Nonparametric Bayesian Discrete Latent Variable Models for ...

Nonparametric Bayesian Discrete Latent Variable Models for ...

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

model can be written in the <strong>for</strong>m of eq. (3.19):<br />

3.3 Empirical Study on the Choice of the Base Distribution<br />

xi | ci, Θ ∼ N (µ ci , S−1 ci )<br />

ci | π ∼ <strong>Discrete</strong>(π1, . . . , πK)<br />

(µ j, Sj) ∼ G0<br />

π | α ∼ D(α/K, . . . , α/K).<br />

(3.54)<br />

We obtain the Dirichlet Process mixture of Gaussians (DPMoG) model by integrating<br />

out the mixing proportions and taking the limit K → ∞, as discussed in Section 3.1.5.<br />

Equivalently, we can define the model by starting with Gaussian distributed data<br />

xi|θ ∼ N (xi|µ i, S −1<br />

i ), (3.55)<br />

with a random distribution of the model parameters (µ i, Si) ∼ G drawn from a DP,<br />

G ∼ DP (α, G0).<br />

We put an inverse gamma prior on the DP concentration parameter α,<br />

α −1 ∼ G(1/2, 1/2), (3.56)<br />

the choice of which is discussed in detail in Section 3.1.6. We need to specify the<br />

base distribution G0 to complete the model. For DPMoG, G0 specifies the mean of<br />

the joint distribution of µ and S. For this model, a conjugate base distribution exists<br />

however it has the unappealing property of prior dependency between the mean and the<br />

covariance. We proceed by giving the detailed model <strong>for</strong>mulation <strong>for</strong> both conjugate<br />

and conditionally conjugate cases.<br />

Conjugate DPMoG<br />

The natural choice of priors <strong>for</strong> the mean of the Gaussian is a Gaussian, and a Wishart<br />

distribution <strong>for</strong> the precision (inverse-Wishart <strong>for</strong> the covariance). To accomplish conjugacy<br />

of the joint prior distribution of the mean and the precision to the likelihood,<br />

the distribution of the mean has to depend on the precision. The prior distribution of<br />

the mean µ j is Gaussian conditioned on Sj:<br />

and the prior distribution of Sj is Wishart:<br />

(µ j|Sj, ξ, ρ) ∼ N ξ, (ρSj) −1 , (3.57)<br />

(Sj|β, W ) ∼ W β, (βW ) −1 . (3.58)<br />

The joint distribution of µ j and Sj is the Normal/Wishart distribution denoted as:<br />

(µ j, Sj) ∼ N W(ξ, ρ, β, βW ), (3.59)<br />

with ξ, ρ, β and W being hyperparameters common to all mixture components, expressing<br />

the belief that the component parameters should be similar, centered around some<br />

41

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!