
4 Indian Buffet Process Models

In the CRP, exchangeability is achieved by ignoring the labels of the tables and focusing on the resulting partitioning. A similar approach can be taken for the matrices by ignoring the column order. The analogue of partitions for class assignment vectors is equivalence classes of binary matrices. To establish a well-defined distribution in the infinite limit, Griffiths and Ghahramani (2005) define equivalence classes for binary matrices with respect to a function called the left-ordered form ($lof(\cdot)$) and focus attention on the distribution over the equivalence class of $Z$.

Define the history of a column of $Z$ to be the binary number corresponding to that column, with the first row as the most significant bit; $lof(Z)$ then sorts the columns of $Z$ from left to right in decreasing order of their histories. Any two binary matrices are $lof$-equivalent if they map to the same left-ordered form. The $lof$-equivalence class $[Z]$ of a binary matrix $Z$ is the set of binary matrices that are $lof$-equivalent to $Z$.
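To make the left-ordering concrete, the following is a minimal Python/NumPy sketch; the function name lof and the representation of $Z$ as an $N \times K$ integer array are illustrative choices, and the sketch assumes $N$ is small enough for the histories to fit in a machine integer.

```python
import numpy as np

def lof(Z):
    """Left-ordered form of a binary matrix Z (rows = customers).

    The history of a column is the binary number read top to bottom,
    with the first row as the most significant bit; columns are then
    sorted from left to right in decreasing order of history.
    """
    N = Z.shape[0]
    powers = 2 ** np.arange(N - 1, -1, -1)         # row 0 = most significant bit
    histories = powers @ Z                         # decimal history per column
    order = np.argsort(-histories, kind="stable")  # decreasing; ties keep order
    return Z[:, order]
```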

The columns of a matrix with $N$ rows have $2^N - 1$ possible different histories (other than zero, which corresponds to the column of all zeros). Using the decimal equivalent of the binary histories, $K_h$ denotes the number of columns with history $h$. The total number of nonzero columns of $Z$ generated by the IBP can then be expressed as
$$K^+ = \sum_{h=1}^{2^N - 1} K_h.$$
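Continuing the sketch above under the same array conventions, the counts $K_h$ and their total $K^+$ can be tallied directly from the column histories; history_counts is a hypothetical helper name.

```python
from collections import Counter
import numpy as np

def history_counts(Z):
    """Tally K_h, the number of columns with nonzero history h, and K^+."""
    N = Z.shape[0]
    powers = 2 ** np.arange(N - 1, -1, -1)
    histories = powers @ Z
    Kh = Counter(int(h) for h in histories if h > 0)  # h = 0: all-zero column
    K_plus = sum(Kh.values())                         # K^+ = sum over h of K_h
    return Kh, K_plus
```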

Considering the $lof$-equivalence classes of the matrices, there are
$$\frac{\prod_{i=1}^{N} K_*^{(i)}!}{\prod_{h=1}^{2^N - 1} K_h!}$$
matrices $Z$ generated by the sequential process that map to the same left-ordered form $[Z]$, where $K_*^{(i)}$ denotes the number of new dishes sampled by the $i$th customer.
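As a small sanity check of this count, suppose $N = 2$, the first customer samples two new dishes, and the second customer samples the first of them plus one new dish, so that $K_*^{(1)} = 2$ and $K_*^{(2)} = 1$. The three columns have distinct histories (3, 2, and 1 in decimal), so each $K_h! = 1$, and the formula gives $2!\,1!/1 = 2$: the sequential process can produce the first customer's two dishes in either column order, and both orderings map to the same left-ordered form.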

Therefore, the distribution over the equivalence classes of feature matrices generated by the IBP becomes

$$P([Z]) = \frac{\alpha^{K^+}}{\prod_{h=1}^{2^N - 1} K_h!} \exp\left\{-\alpha H_N\right\} \prod_{k=1}^{K^+} \frac{(N - m_{\cdot,k})! \, (m_{\cdot,k} - 1)!}{N!}, \tag{4.2}$$

where $m_{\cdot,k}$ is the number of customers that sampled dish $k$ and $H_N = \sum_{j=1}^{N} 1/j$ is the $N$th harmonic number. This distribution reflects the fact that both the customers and the dishes are exchangeable.
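As a concrete reading of Eq. (4.2), the following sketch evaluates $\log P([Z])$ for a given binary matrix, using log-gamma functions for the factorials. It is a minimal illustration under the same array conventions as above, with log_prob_lof an illustrative name.

```python
import numpy as np
from math import lgamma, log

def log_prob_lof(Z, alpha):
    """Log of Eq. (4.2): the IBP probability of the lof-equivalence class [Z]."""
    N = Z.shape[0]
    powers = 2 ** np.arange(N - 1, -1, -1)
    histories = powers @ Z
    histories = histories[histories > 0]           # ignore all-zero columns
    _, Kh = np.unique(histories, return_counts=True)
    K_plus = histories.size                        # K^+ = sum over h of K_h
    m = Z.sum(axis=0)
    m = m[m > 0]                                   # m_{.,k} for sampled dishes
    H_N = np.sum(1.0 / np.arange(1, N + 1))        # N-th harmonic number

    logp = K_plus * log(alpha) - alpha * H_N       # alpha^{K^+} exp{-alpha H_N}
    logp -= sum(lgamma(k + 1) for k in Kh)         # divide by prod_h K_h!
    for mk in m:
        # log[(N - m_k)! (m_k - 1)! / N!]
        logp += lgamma(N - mk + 1) + lgamma(mk) - lgamma(N + 1)
    return logp
```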

Griffiths and Ghahramani (2005) also describe the "exchangeable Indian buffet process", which directly produces matrices in left-ordered form by having the customers attend to the histories of the dishes.

The process described above was inspired by the Chinese restaurant process. In the CRP, customers choose which table to sit at and, being sociable, tend to sit at the more crowded tables. Since each customer can sit at only one table, the CRP results in a partitioning of the customers. In the IBP, customers decide which dishes to sample. They taste the more popular dishes with higher probability, and may also taste some dishes that nobody has tasted before. Every customer is free to sample any number of the infinitely many dishes. Since customers can sample more than one dish, their choices do not result in a partitioning, but can be seen as binary features that are shared between data points.
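To make the contrast with the CRP concrete, here is a minimal sketch of the IBP generative process itself, assuming its standard formulation in which customer $i$ takes an existing dish $k$ with probability $m_k/i$ and then tries a Poisson$(\alpha/i)$ number of new dishes; sample_ibp is an illustrative name.

```python
import numpy as np

def sample_ibp(N, alpha, rng=None):
    """Simulate the Indian buffet process for N customers.

    Customer i takes each previously tried dish k with probability
    m_k / i, where m_k is the number of earlier customers who took
    dish k, and then samples Poisson(alpha / i) brand-new dishes.
    """
    rng = rng or np.random.default_rng()
    m = []                                   # popularity m_k of each dish
    rows = []
    for i in range(1, N + 1):
        row = [int(rng.random() < m_k / i) for m_k in m]
        for k, take in enumerate(row):
            m[k] += take                     # update dish popularities
        new = rng.poisson(alpha / i)         # dishes tasted for the first time
        row += [1] * new
        m += [1] * new
        rows.append(row)
    Z = np.zeros((N, len(m)), dtype=int)     # pad earlier rows with zeros
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z
```

Applying lof from the earlier sketch to a sampled matrix yields a draw whose equivalence class is distributed according to Eq. (4.2).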

Note that, in the CRP, although there are infinitely many tables available, the total number of occupied tables cannot exceed the number of customers.

