26.10.2013 Views

Nonparametric Bayesian Discrete Latent Variable Models for ...


3 Dirichlet Process Mixture Models

[Figure: three stacked panels, each plotting # of components (y-axis; 0 to 50 in the top panel, 0 to 200 in the middle panel, 0 to 400 in the bottom panel) against iterations (x-axis, 0 to 10000).]

Figure 3.19: Plots showing the change in the number of clusters over the iterations using the
12 PCA components for the conjugate DPMoG (top), the conditionally conjugate
DPMoG using the SampleMu scheme (middle) and the DPMFA model with 8 latent
factors (bottom). The models are initialized with all data points assigned to a single
component, and all parameters and hyperparameters are updated.

However, the sampling was not efficient enough for any of the models to converge. Both the
conjugate and the conditionally conjugate DPMoG models stayed in the initial
one-component state. The DPMFA model, on the other hand, employed too many active
components and did not converge to the stationary distribution.

A common technique when dealing with complex data using hierarchical models is
to fix the hyperparameters at the beginning, updating only the parameters, and to start
updating all unknown variables after some iterations. This stops the hyperparameters
from taking values in a low probability region at an early stage of the chain and prevents
the sampler from getting stuck in this region. Following this approach,
we tried initializing the component assignments with the manual clustering labels and,
for the first 100 iterations, updating only the parameters while keeping the
hyperparameters and the indicator variables fixed. The conjugate and the conditionally
conjugate DPMoG models could not mix at all, staying in the initialized state in this
case as well. The DPMFA model again employed too many components as soon as we started
updating the indicator variables.
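The staged update schedule described above can be sketched as follows. This is only a minimal illustration of the scheduling logic, not the implementation used for the experiments: the names `blocks_to_update`, `run_staged_sampler`, `ALL_BLOCKS` and the `updaters` dictionary are hypothetical, and the actual conditional updates for the DPMoG or DPMFA models would have to be supplied by the caller.

```python
from typing import Callable, Dict, Tuple

# Hypothetical names for the three groups of unknowns in the sampler.
ALL_BLOCKS: Tuple[str, ...] = ("indicators", "parameters", "hyperparameters")


def blocks_to_update(iteration: int, warmup: int = 100) -> Tuple[str, ...]:
    """Return the blocks to resample at a 0-based iteration index.

    During the first `warmup` iterations only the component parameters are
    resampled; indicators and hyperparameters stay at their initial values.
    """
    if iteration < warmup:
        return ("parameters",)
    return ALL_BLOCKS


def run_staged_sampler(state, updaters: Dict[str, Callable], n_iter: int,
                       warmup: int = 100):
    """Run `n_iter` sweeps, applying only the scheduled block updates.

    `updaters` maps each block name to a function that takes the current
    state and returns the state after resampling that block (assumed to be
    provided by the model code).
    """
    for it in range(n_iter):
        for name in blocks_to_update(it, warmup):
            state = updaters[name](state)
    return state
```

With `warmup=100`, the indicator and hyperparameter updates are first applied at iteration index 100, which corresponds to the point at which the DPMFA model started to open new components in the experiment above.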

Since the concentration parameter α controls the number of active components in a
DPM model, we decided to fix this parameter instead of learning it to limit the prior
