
4 Indian Buffet Process Models


Figure 4.15: Random binary feature matrix samples weighted with the feature weights from the chain for the Paris-Rome example. The gray level indicates the weight of each feature. Rows correspond to options and columns to features. Only the active features are shown. It is inferred that there are features that only P and P+ share, and only R and R+ share. Note that there are also some features common to all options which do not affect the likelihood. The first four columns in these figures are the unique features; the remaining columns are from the posterior of the infinite part. Note that the weights of the unique features for P and R are very close to zero, which is expected since they are not supported by the data. The unique features for P+ and R+ have small weights, representing the small bonus.

Different feature representations may result in the same choice probabilities. The important point is that the model can learn the true choice probabilities by inferring that there are features that only P and P+ share, and features that only R and R+ share. The small bonus that R+ and P+ have in common is represented as two different features with similar weights. Models on a larger set of alternatives might result in feature representations that cannot be interpreted easily, as will be seen in the next example.
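To make the mechanics concrete, here is a minimal sketch of the EBA pairwise choice rule applied to a hypothetical feature matrix in the spirit of the Paris-Rome example. The matrix, weights, and option ordering are illustrative assumptions, not the inferred posterior samples of Figure 4.15.

```python
import numpy as np

def eba_choice_prob(Z, w, i, j):
    """P(option i chosen over option j) under EBA: only features that
    discriminate between the two options contribute; shared features cancel."""
    only_i = Z[i] & ~Z[j]               # features i has but j lacks
    only_j = Z[j] & ~Z[i]               # features j has but i lacks
    a, b = w[only_i].sum(), w[only_j].sum()
    return 0.5 if a + b == 0 else a / (a + b)

# Hypothetical matrix: rows are options (P, P+, R, R+), columns are features.
# Column 0 is shared by P and P+, column 1 by R and R+, and columns 2-3 are
# the small "bonus" features of P+ and R+, represented as two different
# features with similar weights.
Z = np.array([[1, 0, 0, 0],   # P
              [1, 0, 1, 0],   # P+
              [0, 1, 0, 0],   # R
              [0, 1, 0, 1]], dtype=bool)
w = np.array([1.0, 1.0, 0.05, 0.05])  # illustrative feature weights

print(eba_choice_prob(Z, w, 0, 2))  # P vs R: equal weights, 0.5
print(eba_choice_prob(Z, w, 1, 0))  # P+ vs P: only the bonus discriminates, 1.0
```

Because features common to both options cancel in every pairwise comparison, features shared by all options drop out entirely, which is why they do not affect the likelihood.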

Celebrities

As a second example we analyze real choice data from an experiment by Rumelhart and Greeno (1971) that is known to exhibit choice patterns that the BTL model cannot capture. Subjects were presented with pairs of celebrities and were asked with whom they would prefer to spend an hour of conversation. There are nine celebrities in the choice set, chosen from three groups: three politicians, three athletes, and three movie stars. Individuals within each group are assumed to be more similar to each other, and this should have an effect on the choice probabilities beyond the variation that can be captured by the BTL model. The data had been collected assuming that the choice probabilities could be fully captured by a feature matrix that has a unique feature for each individual plus one feature for being a politician, one for being an athlete, and one for being a movie star. The assumed feature matrix is shown in Figure 4.16. This feature matrix can also be depicted as a tree; therefore, we refer to this model as the tree EBA model (tEBA) (Tversky and Sattath, 1979).
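As a rough sketch, the assumed tEBA feature matrix can be written down directly from the description above: nine unique features (one per individual) plus three group-membership features. The column ordering and the assignment of options to groups are assumptions, since Figure 4.16 is not reproduced here.

```python
import numpy as np

# Assumed tEBA feature matrix: 9 celebrities x 12 features.
# Columns 0-8: one unique feature per individual.
# Columns 9-11: group features (politician, athlete, movie star),
# with options 0-2 politicians, 3-5 athletes, 6-8 movie stars.
n_options, n_groups = 9, 3
Z = np.zeros((n_options, n_options + n_groups), dtype=bool)
Z[:, :n_options] = np.eye(n_options, dtype=bool)       # unique features
for g in range(n_groups):
    Z[3 * g : 3 * (g + 1), n_options + g] = True       # group membership

print(Z.astype(int))
```

Under EBA, a group feature cancels in within-group comparisons but contributes in between-group ones, which is exactly the similarity structure the BTL model cannot express.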

We modeled the choice data with different specifications for the EBA model: the model which assumes all options to have only unique features (BTL), the EBA model with fixed features of tree structure (tEBA), two finite EBA models with the number of features fixed to be 12 and 15 (EBA12 and EBA15), and the EBA model with infinitely many features.
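For contrast, a one-line sketch of the BTL special case: when every option has a single unique feature and nothing is shared, the EBA rule reduces to a simple ratio of the two options' weights (the weights below are illustrative).

```python
import numpy as np

def btl_prob(w, i, j):
    """BTL choice probability: the special case of EBA in which every
    option has only a unique feature, so no features cancel."""
    return w[i] / (w[i] + w[j])

w = np.array([2.0, 1.0, 1.0])   # illustrative option weights
print(btl_prob(w, 0, 1))        # 2 / (2 + 1) = 2/3
```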

