26.10.2013 Views

Nonparametric Bayesian Discrete Latent Variable Models for ...




3.4 Dirichlet Process Mixtures of Factor Analyzers


Figure 3.22: Peak-to-peak amplitude plots showing the clustering results of DPMFA using the whole waveforms. Note that one large cluster from the manual clustering (light blue) is divided into three clusters by this model (light blue, brown and purple).

Gibbs sampling.<br />

The behavior of the Markov chain for the DPMFA model on the PCA components and on the whole waveforms is interesting. Whether it is initialized with a single component or with many, the model first explores the space by introducing many components and eventually reduces the number of active components to give a good representation of the data. The sampler can handle this for the lower dimensional PCA inputs; for the 112-dimensional inputs, however, it fails to converge to a stationary distribution when all variables are updated. For this reason, when using the waveforms as inputs, we fixed the concentration parameter α to 1. An alternative would be to choose better initial values for the hyperparameters, but this would require a thorough analysis of the data.
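
To give a feel for what fixing α to 1 implies before any data are seen, the following is a minimal sketch of the Chinese restaurant process prior that underlies a Dirichlet process mixture. It is not the sampler used in the experiments; the function names `expected_num_components` and `sample_crp_sizes` are hypothetical and chosen only for illustration.

```python
import random


def expected_num_components(n, alpha):
    """Prior expected number of occupied components for n points.

    Under a Chinese restaurant process, point i (counting from 0) opens a
    new component with probability alpha / (alpha + i).
    """
    return sum(alpha / (alpha + i) for i in range(n))


def sample_crp_sizes(n, alpha, rng):
    """Draw one partition of n points from the CRP prior; return component sizes."""
    sizes = []
    for i in range(n):
        # Total unnormalized mass is alpha (new component) + i (existing points).
        u = rng.random() * (alpha + i)
        if u < alpha:
            sizes.append(1)  # open a new component
            continue
        u -= alpha
        for k, size in enumerate(sizes):
            if u < size:
                sizes[k] += 1  # join component k, chosen proportionally to its size
                break
            u -= size
        else:
            sizes[-1] += 1  # guard against floating-point round-off
    return sizes


if __name__ == "__main__":
    rng = random.Random(0)
    n = 1000  # hypothetical number of data points
    print("prior mean, alpha = 1:", expected_num_components(n, 1.0))
    print("one prior draw, number of components:", len(sample_crp_sizes(n, 1.0, rng)))
```

With α = 1 the prior mean is the harmonic number of n, which grows only logarithmically in the number of data points, so this choice favors a modest number of active components a priori.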

The motivation for developing this model is to be able to apply the DPM model to high dimensional data. Experimental results show that the sampling algorithms used are not efficient enough. Using the split-merge method of Jain and Neal (2005) would speed up mixing. Furthermore, the factor loading matrix is identified only up to a rotation and is therefore not unique. Restricting it to be unique, e.g. forcing it to be lower triangular as suggested by Fokoué and Titterington (2003), may also help speed up mixing.
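
The non-uniqueness of the factor loading matrix can be checked numerically. The sketch below assumes the usual factor analysis covariance, in which the loadings Λ enter only through ΛΛᵀ, and uses a hypothetical 3 x 2 loading matrix with made-up values. It shows that rotating the loadings leaves ΛΛᵀ unchanged, and that one particular rotation brings them to a lower-triangular form (zeros above the diagonal). It illustrates the idea only and is not the procedure of Fokoué and Titterington (2003).

```python
import math


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def rotation(theta):
    """2 x 2 rotation matrix (orthogonal, determinant 1)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def low_rank_cov(lam):
    """Lambda * Lambda^T, the part of the covariance explained by the factors."""
    return matmul(lam, transpose(lam))


def to_lower_triangular(lam):
    """Rotate a (d x 2) loading matrix so that its (0, 1) entry becomes zero."""
    theta = math.atan2(lam[0][1], lam[0][0])
    return matmul(lam, rotation(theta))


# Hypothetical loading matrix: 3 observed dimensions, 2 factors.
loadings = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.25]]

if __name__ == "__main__":
    rotated = matmul(loadings, rotation(0.7))  # arbitrary rotation angle
    print(low_rank_cov(loadings))
    print(low_rank_cov(rotated))               # same matrix up to round-off
    print(to_lower_triangular(loadings))       # first row has the form [r, 0]
```

Because every rotated copy of the loadings gives the same likelihood, an unconstrained sampler can drift among equivalent solutions; fixing a canonical form removes that redundant direction from the posterior.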

