Nonparametric Bayesian Discrete Latent Variable Models for ...

3 Dirichlet Process Mixture Models

3.4 Dirichlet Process Mixtures of Factor Analyzers

[Figure 3.19: three stacked panels (top, middle, bottom); x-axis: iterations; y-axis: # of components.]

Figure 3.19: Plots showing the change in the number of clusters over the iterations using the 12 PCA components for the conjugate DPMoG (top), the conditionally conjugate DPMoG using the SampleMu scheme (middle) and the DPMFA model with 8 latent factors (bottom).

The models are initialized with all data points assigned to a single component, and all parameters and hyperparameters are updated. However, sampling was not efficient enough for any of the models to converge. Both the conjugate and the conditionally conjugate DPMoG models stayed in the initial one-component state. The DPMFA model, on the other hand, employed too many active components without converging to the stationary distribution.

A common technique when dealing with complex data using hierarchical models is to fix the hyperparameters at the beginning, updating only the parameters, and to start updating all unknown variables after some iterations. This stops the hyperparameters from taking values in a low-probability region at an early stage of the chain and prevents the sampler from getting stuck in that region. Following this approach, we tried initializing the component assignments with the manual clustering labels and, for the first 100 iterations, updating only the parameters while leaving the hyperparameters and the indicator variables fixed. The conjugate and the conditionally conjugate DPMoG models could not mix at all, staying in the initialized state in this case as well. The DPMFA model again employed too many components as soon as we started updating the indicator variables.

Since the concentration parameter α controls the number of active components in a DPM model, we decided to fix this parameter instead of learning it, to limit the prior probability of proposing new components. To check the effect of this, we also did runs with fixed α using the PCA projections.

[Figure 3.20: three stacked panels (top, middle, bottom); x-axis: iterations; y-axis: # of components.]

Figure 3.20: Change in the number of active components over the iterations for the DPMFA model using the peak-to-peak amplitudes (top), the 12 PCA components (middle) and the whole waveforms (bottom) with the parameter α fixed to 1.

Recall that for the run on the 12 dimensions where α was updated, the number of active components first increased to 400, converging to around 80 components after 2000 iterations, see Figure 3.19. Interestingly, we observe similar behavior even when α is fixed to 1, see Figure 3.20. In this case, the number of components increases to 250 and converges to around 40 after 1000 iterations for the 12-dimensional representation. For the 112-dimensional representation, the number of components initially goes up to 500 and settles around 300 after 100 iterations. Although the number of active components is as high as 300, there are generally only 10 components with more than 5 data points. Figure 3.18 shows that the clustering generally agrees with the results of the other experiments.

We show the manual clustering results in Figure 3.21 and the DPMFA results using the whole waveforms as inputs in Figure 3.22 for comparison. Note that the DPMFA finds a similar clustering for most of the data, except that it separates one of the big clusters into three. Superimposed spike waveforms from each of the big clusters found by the DPMFA model are depicted in Figure 3.23. The waveforms from the three clusters c1, c2 and c3 were assumed to belong to one big cluster in manual clustering since they have similar amplitude characteristics. Attending to the shape of the waveforms reveals that they
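The role of the concentration parameter α described above can be illustrated with a minimal Python sketch. It is not the sampler used in this work: it only draws component indicators from the Chinese restaurant process prior implied by a Dirichlet process, without any likelihood term, so it shows how α limits the prior probability of proposing new components rather than reproducing the posterior behavior in Figures 3.19 and 3.20. The function names (`expected_num_components`, `sample_crp_assignments`, `count_components`) and the sample size in the usage lines are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def expected_num_components(alpha, n):
    """Prior expected number of occupied components for n points.

    Under the Chinese restaurant process, point i (0-based) opens a new
    component with probability alpha / (alpha + i), so the expectation
    is the sum of these probabilities.
    """
    return sum(alpha / (alpha + i) for i in range(n))


def sample_crp_assignments(alpha, n, rng):
    """Draw component indicators for n points from the CRP prior.

    Illustrative only: no data likelihood is involved.
    """
    counts = []       # counts[k] = number of points currently in component k
    assignments = []
    for _ in range(n):
        # Existing component k is chosen with weight counts[k];
        # a new component is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new component
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments


def count_components(assignments, larger_than=0):
    """Count components holding more than `larger_than` data points."""
    sizes = Counter(assignments)
    return sum(1 for size in sizes.values() if size > larger_than)


if __name__ == "__main__":
    rng = random.Random(0)      # fixed seed, assumed for reproducibility
    n = 1000                    # hypothetical number of data points
    for alpha in (0.1, 1.0, 10.0):
        z = sample_crp_assignments(alpha, n, rng)
        print(
            f"alpha={alpha}: prior mean K={expected_num_components(alpha, n):.2f}, "
            f"sampled K={count_components(z)}, "
            f"K with more than 5 points={count_components(z, larger_than=5)}"
        )
```

The last column mirrors the distinction made in the text between the total number of active components and the number of components holding more than 5 data points: many sampled components can be near-empty singletons even when the overall count is large.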
