26.10.2013 Views

Nonparametric Bayesian Discrete Latent Variable Models for ...

Nonparametric Bayesian Discrete Latent Variable Models for ...

Nonparametric Bayesian Discrete Latent Variable Models for ...

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

3 Dirichlet Process Mixture <strong>Models</strong><br />

−1<br />

µ y 1 1 Σ D D<br />

ξ<br />

Σ y<br />

normal<br />

ρ<br />

ρS<br />

gamma<br />

w<br />

µ S<br />

y<br />

normal<br />

1<br />

β β−D+1<br />

w −1<br />

Wishart inv gamma<br />

k<br />

k<br />

normal Wishart<br />

x<br />

i<br />

N<br />

K<br />

hyperpriors<br />

hyperparameters<br />

mixture parameters<br />

observations<br />

Figure 3.10: Graphical representation of the layered structure of the hierarchical priors in the<br />

MoG model with conjugate priors. Note the dependency of the distribution of<br />

the component mean on the component precision. <strong>Variable</strong>s are labelled below by<br />

the name of their distribution, and the parameters of these distributions are given<br />

above.<br />

particular value. The graphical representation of the hierarchical model is depicted in<br />

Figure 3.10.<br />

Note in eq. (3.57) that the precision <strong>for</strong> the mean is a multiple of the component precision<br />

itself. This dependency is probably not generally desirable, but is an unavoidable<br />

consequence of requiring conjugacy.<br />

Conditionally Conjugate DPMoG<br />

If we remove the undesired dependency of the mean on the precision, we no longer have<br />

conjugacy. For a more realistic model, we define the prior on µ j to be<br />

p(µ j|ξ, R) ∼ N (ξ, R −1 ) (3.60)<br />

whose mean vector ξ and precision matrix R are hyperparameters common to all mixture<br />

components. Keeping the Wishart prior over the precisions as in eq. (3.58), we obtain<br />

the conditionally conjugate model. That is, the prior of the mean is conjugate to the<br />

likelihood conditional on S and the prior of the precision is conjugate conditional on µ.<br />

See Figure 3.11 <strong>for</strong> the graphical representation of the hierarchical model.<br />

We put hyperpriors on the hyperparameters in both prior specifications to have a<br />

robust model. We use the hierarchical model specification of Rasmussen (2000) <strong>for</strong> the<br />

conditionally conjugate model, and a similar specification <strong>for</strong> the conjugate case. Vague<br />

priors are given to the hyperparameters, some of which depend on the observations which<br />

technically they ought not to. However, only the empirical mean µ x and the covariance<br />

Σx of the data are used in such a way that the full procedure becomes invariant to<br />

42

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!