
4.5 A Choice Model with Infinitely Many Latent Features

See Figure 4.13 for a graphical representation of the model.

4.5.2 Inference using MCMC

Inference for the above model can be done using MCMC techniques. Görür et al. (2006) present results using approximate Gibbs sampling for updating the feature matrix $Z$. Here, we use slice sampling with the semi-ordered stick-breaking representation. We use Gibbs sampling for the IBP parameter $\alpha$ and Metropolis-Hastings updates for the weights $w$.

Gibbs sampling for the feature updates requires the posterior of each $z_{ik}$ conditioned on all other features $Z_{-(ik)}$ and the weights $w$. The conditional posterior for the entries of the feature matrix can be obtained by combining the likelihood given in eq. (4.65) with the prior feature presence probability $\mu_{(k)}$,
\[
P(z_{ik} = 1 \mid Z_{-(ik)}, w, \mu_{(k)}) \propto \mu_{(k)}\, P(X \mid z_{ik} = 1, Z_{-(ik)}, w, \mu_{(k)}). \tag{4.68}
\]
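As a concrete illustration, the following is a minimal Python sketch of this Gibbs step for a single entry $z_{ik}$. The `likelihood` callable is a hypothetical stand-in for the EBA choice likelihood of eq. (4.65), which is not reproduced here.

```python
import numpy as np

def gibbs_update_zik(Z, w, mu, X, i, k, likelihood):
    """One Gibbs step for entry z_ik, following eq. (4.68).

    `likelihood(X, Z, w)` is a hypothetical stand-in for the EBA choice
    likelihood of eq. (4.65); mu[k] is the prior presence probability.
    """
    Z[i, k] = 1
    p_on = mu[k] * likelihood(X, Z, w)           # mu_(k) P(X | z_ik = 1, ...)
    Z[i, k] = 0
    p_off = (1.0 - mu[k]) * likelihood(X, Z, w)  # (1 - mu_(k)) P(X | z_ik = 0, ...)
    # Sample the indicator from its normalized conditional posterior.
    Z[i, k] = int(np.random.rand() < p_on / (p_on + p_off))
    return Z
```

In practice the likelihood would be evaluated in log space to avoid numerical underflow.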

We update the weights using Metropolis-Hastings sampling. We sample a new weight from a proposal distribution $Q(w'_k \mid w_k)$ and accept the new weight with probability
\[
\min\left(1,\; \frac{P(w'_k \mid X, Z, w_{-k}, \lambda)\, Q(w_k \mid w'_k)}{P(w_k \mid X, Z, w_{-k}, \lambda)\, Q(w'_k \mid w_k)}\right). \tag{4.69}
\]

As the proposal distribution we use a gamma distribution with mean equal to the current value of the weight, $w_k$, and standard deviation proportional to it,
\[
Q(w'_k \mid w_k) = \mathcal{G}(w'_k;\, \eta,\, \eta/w_k), \tag{4.70}
\]
where $\mathcal{G}(\cdot\,;\, a, b)$ denotes a gamma density with shape $a$ and rate $b$, so that the proposal has mean $w_k$ and standard deviation $w_k/\sqrt{\eta}$.

We adjust η to have an acceptance rate around 0.5 initially.<br />
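For concreteness, here is a sketch of one such Metropolis-Hastings step, assuming the shape-rate parameterization of eq. (4.70); `log_posterior` is a hypothetical stand-in for $\log P(w_k \mid X, Z, w_{-k}, \lambda)$ up to an additive constant.

```python
import numpy as np
from scipy.stats import gamma

def mh_update_wk(w, k, Z, X, log_posterior, eta=2.0):
    """One Metropolis-Hastings step for weight w_k with the gamma proposal
    of eq. (4.70): shape eta, rate eta/w_k, so the proposal mean is w_k and
    its standard deviation is proportional to w_k.
    """
    wk = w[k]
    # scipy's gamma takes shape and scale; rate eta/wk means scale wk/eta.
    w_new = gamma.rvs(eta, scale=wk / eta)
    log_ratio = (log_posterior(w_new, k, w, Z, X)
                 - log_posterior(wk, k, w, Z, X)
                 # Hastings correction for the asymmetric proposal, eq. (4.69).
                 + gamma.logpdf(wk, eta, scale=w_new / eta)
                 - gamma.logpdf(w_new, eta, scale=wk / eta))
    if np.log(np.random.rand()) < log_ratio:
        w[k] = w_new  # accept; otherwise keep the current weight
    return w
```

The tuning of $\eta$ trades off step size against acceptance rate: larger $\eta$ concentrates the proposal around the current value and raises the acceptance rate.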

Note that there are infinitely many weights associated with the infinitely many features. Since the inactive features and their weights do not affect the likelihood, we need only represent and update the weights associated with the active features.
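A minimal sketch of this bookkeeping, assuming $Z$ and $w$ are held as NumPy arrays over the currently represented features:

```python
def prune_inactive(Z, w):
    """Drop features used by no object: inactive features do not affect
    the likelihood, so only the active columns of Z (and their weights)
    need to be represented explicitly."""
    active = Z.sum(axis=0) > 0
    return Z[:, active], w[active]
```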

4.5.3 Experiments

In this section, we present empirical results on an artificial and a real data set. Both data sets have been considered in the choice model literature.

We initialize the parameters $\alpha$, $Z$ and $w$ randomly from their priors and set the lapse parameter to $\varepsilon = 0.01$.

Paris-Rome<br />

We first consider a synthetic example given by Tversky (1972). It was constructed as a simple example that the BTL model cannot handle. We will use this example to illustrate that the EBA model with infinitely many latent features can recover the latent structure from choice data.
