
belief in the prior. The specification of the base distribution for the DP, which corresponds to the prior on the component parameters for infinite mixture models, is often guided by mathematical and practical convenience. For DPM models, the use of conjugate priors makes the analysis much more tractable; however, conjugate priors might fail to represent the actual prior beliefs. Specifically, for the DPMoG model the conjugate prior has the unappealing property of introducing prior dependencies between the mean and the covariance. An empirical assessment of the trade-off between modeling performance and computational cost when using conjugate priors for the DPMoG is one of the primary goals of this thesis. In Section 3.3 we compare DPMoG models with a conjugate and a conditionally conjugate base distribution in terms of modeling performance and computational feasibility. It is possible to integrate out some of the parameters in the conditionally conjugate model, which vastly improves mixing. We show that this improvement makes practical use of the more flexible conditionally conjugate model possible.
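To make this dependency concrete, here is a minimal sketch of the two choices of base distribution for a single Gaussian component, written in generic notation rather than the notation of later chapters. The fully conjugate choice is the normal-inverse-Wishart prior,

\[
\Sigma \sim \mathcal{IW}(\nu_0, \Lambda_0), \qquad \mu \mid \Sigma \sim \mathcal{N}(\mu_0, \Sigma / \kappa_0),
\]

in which the prior uncertainty about the mean is forced to scale with the component covariance. A conditionally conjugate choice instead places independent priors on the two parameters,

\[
\mu \sim \mathcal{N}(\mu_0, R^{-1}), \qquad \Sigma \sim \mathcal{IW}(\nu_0, \Lambda_0),
\]

so the mean and covariance can be specified separately, at the cost of losing some closed-form computations.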

The mixture of factor analyzers (MFA) is a mixture model whose Gaussian components have constrained covariance matrices. Under the assumption that most of the information lies in a lower-dimensional space, this reduced parametrization allows high-dimensional data to be modeled efficiently. In Section 3.4 we define the Dirichlet process mixtures of factor analyzers (DPMFA) model and apply it to the challenging problem of clustering neuronal data known as spike sorting.
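As a brief reminder of the parametrization (again in generic notation), each factor analyzer explains a d-dimensional observation through a q-dimensional latent factor,

\[
\mathbf{x} = \Lambda \mathbf{z} + \mu + \epsilon, \qquad \mathbf{z} \sim \mathcal{N}(0, I_q), \quad \epsilon \sim \mathcal{N}(0, \Psi),
\]

with \(\Psi\) diagonal, so the marginal covariance is constrained to the form \(\Sigma = \Lambda \Lambda^\top + \Psi\). For \(q \ll d\) this uses roughly \(d(q+1)\) free covariance parameters per component instead of the \(d(d+1)/2\) of a full covariance matrix.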

The second group of nonparametric models we consider is latent feature models with an infinite number of binary features. The IBP is a distribution over infinite sparse binary matrices that has many correspondences to the DP. Using the IBP as a prior over the latent features, we can define nonparametric latent feature models. Chapter 4 starts with the different approaches for defining the distribution induced by the IBP and discusses the parallels to the DP. Section 4.2 describes different MCMC algorithms for inference in IBP models. The sampling algorithms are compared in Section 4.3 to give an intuition about their general performance. We demonstrate the modeling capability of IBP models by learning latent features of handwritten digits in Section 4.4.
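To give a feel for the generative process underlying the IBP, the following minimal Python sketch draws a sparse binary feature matrix from the usual sequential ("Indian buffet") construction; the function and variable names are ours, for illustration only.

    import numpy as np

    def sample_ibp(alpha, num_customers, rng=None):
        # Draw one sparse binary matrix Z from the Indian buffet process.
        # Rows index customers (objects), columns index dishes (features).
        rng = np.random.default_rng() if rng is None else rng
        dishes = []  # dishes[k] = indices of the customers who took dish k
        for i in range(1, num_customers + 1):
            # Take each previously sampled dish k with probability m_k / i,
            # where m_k is the number of earlier customers who took it.
            for takers in dishes:
                if rng.random() < len(takers) / i:
                    takers.append(i)
            # Then sample a Poisson(alpha / i) number of brand-new dishes.
            for _ in range(rng.poisson(alpha / i)):
                dishes.append([i])
        Z = np.zeros((num_customers, len(dishes)), dtype=int)
        for k, takers in enumerate(dishes):
            Z[[t - 1 for t in takers], k] = 1
        return Z

    # The expected number of active features grows as alpha * H_N.
    Z = sample_ibp(alpha=2.0, num_customers=10)
    print(Z.shape, Z.sum(axis=0))

The exchangeability of the resulting distribution over rows (up to reordering of the columns) is what allows each object to be treated as the last customer in the Gibbs sampling schemes discussed in Section 4.2.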

The elimination by aspects (EBA) model is an interesting choice model in which latent variables represent the features of the alternatives in the choice set that give rise to the choice probabilities. In Section 4.5 we define the EBA model with an IBP prior on the latent features and infer the choice probabilities by learning the latent features with this model.
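For intuition, in the simplest two-alternative case the EBA choice rule reduces to a ratio of aspect weights (generic notation): writing \(w_a \ge 0\) for the weight of aspect \(a\),

\[
P(x \succ y) = \frac{\sum_{a \in x \setminus y} w_a}{\sum_{a \in x \setminus y} w_a + \sum_{b \in y \setminus x} w_b},
\]

so only aspects not shared by the two alternatives influence the choice, while shared aspects cancel out. With an IBP prior, the set of aspects itself is learned rather than fixed in advance.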

We conclude the thesis with a discussion in Chapter 5.

The next chapter gives a brief overview of Bayesian modeling to introduce the terminology and notation that will be used in this thesis. Some mathematical formulas and statistical definitions are given in Appendix B.
