26.10.2013 Views

Nonparametric Bayesian Discrete Latent Variable Models for ...


3.4 Dirichlet Process Mixtures of Factor Analyzers

Figure 3.18: Confusion matrices for the cluster assignments. Top row: using the peak-to-peak amplitudes for the manual clustering (left), the conjugate DPMoG model (middle) and the conditionally conjugate DPMoG model (right). Bottom row: DPMFA results using the peak-to-peak amplitudes (left), the 12 PCA components (middle) and the whole waveforms (right). Manual clustering divides the data into one small cluster (with 18 data points) and five big clusters (ranging from 370 to 1725 data points). Note that the small cluster in the lower right-hand corner of the confusion matrices is barely visible in this plot. All models find a similar clustering using the amplitude data, except for separating one of the large clusters into two. The DPMFA model divides the data into more clusters when using the PCA projections and the waveforms as input.
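As an illustration of how a confusion matrix like those in Figure 3.18 can be tabulated, the following is a minimal sketch, assuming each clustering is available as one integer label per data point. The function name `confusion_matrix` and the toy label vectors are hypothetical and are not taken from the experiments described here.

```python
import numpy as np

def confusion_matrix(labels_a, labels_b):
    """Count how often each label of one clustering co-occurs with each label of another."""
    labels_a = np.asarray(labels_a).ravel()
    labels_b = np.asarray(labels_b).ravel()
    if labels_a.shape != labels_b.shape:
        raise ValueError("both clusterings must label the same data points")
    # Map arbitrary label values to consecutive row/column indices.
    ids_a, inv_a = np.unique(labels_a, return_inverse=True)
    ids_b, inv_b = np.unique(labels_b, return_inverse=True)
    counts = np.zeros((ids_a.size, ids_b.size), dtype=int)
    # Unbuffered add, so repeated (row, column) pairs are all counted.
    np.add.at(counts, (inv_a, inv_b), 1)
    return ids_a, ids_b, counts

# Toy labels (hypothetical): a reference clustering and a model clustering.
manual = [0, 0, 0, 1, 1, 2]
model = [5, 5, 7, 7, 7, 9]
ids_manual, ids_model, counts = confusion_matrix(manual, model)
print(counts)
```

Rows follow the sorted labels of the first clustering and columns those of the second. A reference cluster that the model splits in two appears as a row with two sizeable entries.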

of DPMFA in higher dimensions is that the mixing for the DPMFA model is relatively fast when exploiting the conditional conjugacy.

The models were initialized with all data points assigned to the same component. We see in Figure 3.20 that the Markov chain for DPMFA using PCA components as inputs behaves in an interesting way. Observing the change in the number of active components over the iterations, we see that the model initially employs around 400 components to represent the data, then starts reducing this number after some iterations, stabilizing at about 80 active components after 2000 iterations.
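The number of active components per iteration can be read off the sampled assignments. The following is a minimal sketch, assuming the sampler stores one integer component indicator per data point at every iteration; the storage layout, the function name `active_component_trace` and the toy values are assumptions made for illustration.

```python
import numpy as np

def active_component_trace(assignments):
    """Return the number of distinct components in use at each iteration.

    assignments: array of shape (n_iterations, n_points) holding the
    component indicator of every data point at every iteration.
    """
    assignments = np.asarray(assignments)
    if assignments.ndim != 2:
        raise ValueError("expected an (n_iterations, n_points) array")
    # A component is active if at least one data point is assigned to it.
    return np.array([np.unique(row).size for row in assignments])

# Toy chain (hypothetical): 4 iterations over 4 data points,
# starting with every point in the same component.
chain = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 1, 1, 3],
    [2, 2, 1, 1],
]
trace = active_component_trace(chain)
print(trace)
```

Plotting such a trace against the iteration index gives the kind of curve described for Figure 3.20.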

We also used the whole waveforms as input without a preliminary feature extraction step. Initially we tried learning all the parameters and hyperparameters, starting with a single component, as we did with the lower dimensional representations of the data.
