24.12.2012 Views

Online Model Selection Based on the Variational Bayes


Appendix D

The VB method can be easily extended to the hierarchical Bayes model. Let us consider the EFH model (see equation 2.1) with the prior distribution $P_\alpha(\theta \mid \alpha_0, \gamma_0)$. The evidence for the hierarchical Bayes model is given by the marginal likelihood with respect to the model parameter $\theta$ and the prior hyperparameter $\alpha_0$,

$$
P(X\{T\}) = \int d\mu(\theta)\, d\mu(\alpha_0)\, P(X\{T\} \mid \theta)\, P_\alpha(\theta \mid \alpha_0, \gamma_0)\, P_0(\alpha_0), \tag{D.1}
$$

where $P_0(\alpha_0)$ is the prior distribution for the prior hyperparameter $\alpha_0$. The free energy is defined by

$$
F(X\{T\}, Q) = \int d\mu(\theta)\, d\mu(\alpha_0)\, d\mu(Z\{T\})\, Q(\theta, \alpha_0, Z\{T\})
\times \log \left( \frac{P(X\{T\}, Z\{T\} \mid \theta)\, P_\alpha(\theta \mid \alpha_0, \gamma_0)\, P_0(\alpha_0)}{Q(\theta, \alpha_0, Z\{T\})} \right). \tag{D.2}
$$
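The free energy (D.2) is a lower bound on the log evidence from (D.1), and the bound is tight when the trial posterior equals the exact posterior. The following is a minimal numerical sketch of that property, not the paper's model: it assumes a toy discrete setting with no hidden variable, where sums replace the integrals. The grids `THETAS`, the tables `PRIOR_GIVEN_ALPHA0` and `HYPERPRIOR`, and the data are all made-up illustrative values.

```python
import math

# Toy discrete stand-in for (D.1)-(D.2): theta and alpha0 each take a few
# values, there is no hidden variable Z, and sums replace the integrals.
# All names and numbers below are illustrative assumptions.
THETAS = [0.2, 0.5, 0.8]            # candidate Bernoulli parameters theta
PRIOR_GIVEN_ALPHA0 = [              # P(theta | alpha0), one row per alpha0 value
    [0.6, 0.3, 0.1],
    [0.1, 0.3, 0.6],
]
HYPERPRIOR = [0.5, 0.5]             # P0(alpha0)
DATA = [1, 1, 0, 1, 1]              # observed coin flips X{T}


def likelihood(theta, data):
    """P(X{T} | theta) for independent Bernoulli observations."""
    p = 1.0
    for x in data:
        p *= theta if x == 1 else 1.0 - theta
    return p


def joint_table(data):
    """joint[a][i] = P(X | theta_i) * P(theta_i | alpha0=a) * P0(alpha0=a)."""
    return [
        [likelihood(th, data) * PRIOR_GIVEN_ALPHA0[a][i] * HYPERPRIOR[a]
         for i, th in enumerate(THETAS)]
        for a in range(len(HYPERPRIOR))
    ]


def log_evidence(joint):
    """Discrete analogue of (D.1): log of the fully marginalized joint."""
    return math.log(sum(sum(row) for row in joint))


def free_energy(joint, q):
    """Discrete analogue of (D.2): sum of Q * log(joint / Q)."""
    total = 0.0
    for a, row in enumerate(joint):
        for i, p in enumerate(row):
            if q[a][i] > 0.0:
                total += q[a][i] * math.log(p / q[a][i])
    return total


joint = joint_table(DATA)
log_ev = log_evidence(joint)

# A factorized trial posterior, Q(theta, alpha0) = Q_theta(theta) * Q_alpha(alpha0).
q_theta = [1 / 3, 1 / 3, 1 / 3]
q_alpha = [0.5, 0.5]
q_factorized = [[qa * qt for qt in q_theta] for qa in q_alpha]

# The exact posterior, which is generally not factorized.
evidence = math.exp(log_ev)
q_exact = [[p / evidence for p in row] for row in joint]

f_factorized = free_energy(joint, q_factorized)
f_exact = free_energy(joint, q_exact)
print(f"log evidence     = {log_ev:.6f}")
print(f"F (factorized Q) = {f_factorized:.6f}")
print(f"F (exact Q)      = {f_exact:.6f}")
```

In this sketch the factorized trial posterior gives a free energy strictly below the log evidence, while the exact posterior attains it up to rounding.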

The hierarchical VB method can be obtained by assuming the conjugate prior for $P_\alpha(\theta \mid \alpha_0, \gamma_0)$,

$$
P_0(\alpha_0) = \exp \left[ b_0 \left( a_0 \alpha_0 \gamma_0 - \Phi_\alpha(\alpha_0, \gamma_0) \right) - \Phi_a(a_0, b_0) \right], \tag{D.3}
$$

and the factorization for the trial posterior distribution,

$$
Q(\theta, \alpha_0, Z\{T\}) = Q_\theta(\theta)\, Q_\alpha(\alpha_0)\, Q_z(Z\{T\}). \tag{D.4}
$$
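For orientation, maximizing a free energy of this kind over one factor at a time, with the other factors held fixed, leads to the standard mean-field stationarity conditions. The block below is a sketch of that general result applied to the three factors, not a formula quoted from the source. The bracket notation $\langle \cdot \rangle_{Q}$ for an average over the named factors is an assumption introduced here.

$$
\begin{aligned}
\log Q_\theta(\theta) &= \left\langle \log P(X\{T\}, Z\{T\} \mid \theta) \right\rangle_{Q_z} + \left\langle \log P_\alpha(\theta \mid \alpha_0, \gamma_0) \right\rangle_{Q_\alpha} + \text{const}, \\
\log Q_\alpha(\alpha_0) &= \left\langle \log P_\alpha(\theta \mid \alpha_0, \gamma_0) \right\rangle_{Q_\theta} + \log P_0(\alpha_0) + \text{const}, \\
\log Q_z(Z\{T\}) &= \left\langle \log P(X\{T\}, Z\{T\} \mid \theta) \right\rangle_{Q_\theta} + \text{const}.
\end{aligned}
$$

Each condition depends on the other factors only through expectations, which is why the resulting algorithm alternates between a small number of update steps.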

The remaining calculations can be done in the same way as in the VB method. The VB algorithm in this case consists of three steps. The posterior probability for the hidden variable, $P(z(t) \mid x(t), \bar{\theta})$, is calculated in the VB E-step by using the ensemble average of the parameters,

$$
\bar{\theta} = \langle \theta \rangle_\alpha. \tag{D.5}
$$

The posterior hyperparameter $\alpha$ is calculated in the VB M-step,

$$
\gamma \alpha = T \langle r(x, z) \rangle_{\bar{\theta}} + \gamma_0 \langle \alpha_0 \rangle_a, \tag{D.6}
$$

$$
\gamma_0 \langle \alpha_0 \rangle_a = \gamma_0 \int d\mu(\alpha_0)\, Q_\alpha(\alpha_0)\, \alpha_0 = \frac{1}{b} \frac{\partial \Phi_a}{\partial a}(a, b), \tag{D.7}
$$

together with $\gamma = T + \gamma_0$. The posterior hyper-hyperparameter $(a, b)$ is then calculated:

$$
a = \langle \theta \rangle_\alpha + a_0, \tag{D.8}
$$
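The M-step update (D.6), combined with $\gamma = T + \gamma_0$, amounts to a weighted average of the expected sufficient statistic and the expected prior hyperparameter. The sketch below illustrates only that arithmetic. The function name `vb_m_step` and its arguments are hypothetical, and it assumes that $\langle r(x, z) \rangle_{\bar{\theta}}$ is a per-sample average over the $T$ data points, as the factor $T$ in (D.6) suggests.

```python
def vb_m_step(r_mean, T, gamma0, alpha0_mean):
    """Return (alpha, gamma) from the M-step relation (D.6).

    r_mean      : expected sufficient statistic <r(x, z)>, assumed here to be
                  a per-sample average, given as a list of floats
    T           : number of data points
    gamma0      : prior hyperparameter gamma_0 (acts like a pseudo-count)
    alpha0_mean : expected prior hyperparameter <alpha_0>, same length as r_mean
    """
    gamma = T + gamma0
    # gamma * alpha = T * <r> + gamma0 * <alpha0>, solved for alpha componentwise
    alpha = [(T * r + gamma0 * a0) / gamma for r, a0 in zip(r_mean, alpha0_mean)]
    return alpha, gamma


# Illustrative use: unit-variance Gaussian with unknown mean, where r(x) = x
# and there is no hidden variable, so <r> is simply the sample mean.
data = [1.0, 2.0, 3.0]
sample_mean = sum(data) / len(data)
alpha, gamma = vb_m_step([sample_mean], T=len(data), gamma0=1.0, alpha0_mean=[0.0])
print(alpha, gamma)
```

With these illustrative numbers the posterior hyperparameter is pulled from the sample mean toward the prior value, by an amount set by the ratio of $\gamma_0$ to $T + \gamma_0$.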
