
Online Model Selection Based on the Variational Bayes



appendix B)

$$
\begin{pmatrix}
\dfrac{1}{c}\,\dfrac{\partial F}{\partial \alpha} \\[8pt]
\dfrac{\partial F}{\partial c}
\end{pmatrix}
=
\begin{pmatrix}
V_{\alpha,\alpha} & V_{\alpha,c} \\[4pt]
V_{\alpha,c}^{\mathrm T} & V_{c,c}
\end{pmatrix}
\times
\begin{pmatrix}
T\,\langle r(x,z)\rangle_{\bar\theta} + c_0\,\alpha_0 - (T + c_0)\,\alpha \\[4pt]
T + c_0 - c
\end{pmatrix},
\tag{2.30}
$$

where the Fisher information matrix $V$ for the posterior parameter distribution $P_a(\theta \mid \alpha, c)$ is given by

$$
\begin{aligned}
V_{\alpha,\alpha} &= \frac{1}{c^{2}}
\left\langle
\left(\frac{\partial \log P_a}{\partial \alpha}\right)
\left(\frac{\partial \log P_a}{\partial \alpha}\right)^{\!\mathrm T}
\right\rangle_{\alpha}
= \Bigl\langle (\theta - \langle\theta\rangle_{\alpha})\,(\theta - \langle\theta\rangle_{\alpha})^{\mathrm T} \Bigr\rangle_{\alpha}, \\
V_{\alpha,c} &= \frac{1}{c}
\left\langle
\left(\frac{\partial \log P_a}{\partial \alpha}\right)
\left(\frac{\partial \log P_a}{\partial c}\right)
\right\rangle_{\alpha}
= \Bigl\langle (\theta - \langle\theta\rangle_{\alpha})\,(g(\theta) - \langle g(\theta)\rangle_{\alpha}) \Bigr\rangle_{\alpha}, \\
V_{c,c} &=
\left\langle
\left(\frac{\partial \log P_a}{\partial c}\right)
\left(\frac{\partial \log P_a}{\partial c}\right)
\right\rangle_{\alpha}
= \Bigl\langle (g(\theta) - \langle g(\theta)\rangle_{\alpha})\,(g(\theta) - \langle g(\theta)\rangle_{\alpha}) \Bigr\rangle_{\alpha}, \\
g(\theta) &= \alpha \cdot \theta - \psi(\theta).
\end{aligned}
\tag{2.31}
$$
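The second equalities in equation 2.31 follow from the score functions of $P_a$. A brief sketch, under the assumption (not restated in this excerpt) that the posterior parameter distribution has the conjugate exponential form $P_a(\theta \mid \alpha, c) \propto \exp\{c\,(\alpha \cdot \theta - \psi(\theta))\}$:

$$
\frac{\partial \log P_a}{\partial \alpha} = c\,(\theta - \langle\theta\rangle_{\alpha}),
\qquad
\frac{\partial \log P_a}{\partial c} = g(\theta) - \langle g(\theta)\rangle_{\alpha},
$$

so the prefactors $1/c^{2}$ and $1/c$ exactly cancel the factors of $c$ contributed by $\partial \log P_a / \partial \alpha$.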

Since the Fisher information matrix $V$ is positive definite, the free energy maximization condition, 2.29, leads to the VB M-step equations, 2.17 and 2.18. Equation 2.30 also shows that the VB M-step solution is a maximum of the free energy with respect to $(\alpha, c)$, as in the VB E-step.
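To make the first statement concrete: since $V$ is positive definite, the gradient in equation 2.30 vanishes exactly when the right-hand column vector vanishes. Equations 2.17 and 2.18 are not reproduced in this excerpt, but setting that vector to zero gives

$$
\alpha = \frac{T\,\langle r(x,z)\rangle_{\bar\theta} + c_0\,\alpha_0}{T + c_0},
\qquad
c = T + c_0,
$$

which is the stationary point the text identifies with the VB M-step solution.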

The VB algorithm is summarized as follows. First, $c$ is set to $(T + c_0)$. In the VB E-step, the ensemble average of the parameter, $\bar\theta$, is calculated by using equation 2.20. Subsequently, the expectation value of the sufficient statistics, $\langle r(x,z)\rangle_{\bar\theta}$, 2.19, is calculated by using the posterior distribution for the hidden variable, $P(z(t) \mid x(t), \bar\theta)$, equation 2.14. In the VB M-step, the posterior hyperparameter $\alpha$ is updated by using equation 2.18. As this process is repeated, the free energy function, equation 2.26, increases monotonically; the iteration continues until the free energy converges.
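For orientation only, the sketch below shows the loop structure just described, assuming generic callables `e_step`, `expected_stats`, `m_step`, and `free_energy` that stand in for equations 2.20, 2.19/2.14, 2.18, and 2.26; these names and signatures are illustrative, not taken from the paper.

```python
# Minimal sketch of the batch VB iteration summarized above.
# The callables are placeholders for the paper's model-dependent equations.

def run_vb(X, alpha0, c0, e_step, expected_stats, m_step, free_energy,
           tol=1e-6, max_iter=1000):
    """Alternate VB E- and M-steps until the free energy converges."""
    T = len(X)
    c = T + c0                                   # c is fixed to (T + c_0)
    alpha = alpha0                               # initial posterior hyperparameter
    F_prev = float("-inf")
    for _ in range(max_iter):
        theta_bar = e_step(alpha, c)             # eq. 2.20: ensemble average of theta
        r_bar = expected_stats(X, theta_bar)     # eqs. 2.19 / 2.14: <r(x, z)>_theta_bar
        alpha = m_step(r_bar, alpha0, c0, T)     # eq. 2.18: update the hyperparameter
        F = free_energy(X, theta_bar, alpha, c)  # eq. 2.26: increases monotonically
        if F - F_prev < tol:                     # stop once the increase is negligible
            break
        F_prev = F
    return alpha, c
```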

Using equations 2.28 and 2.30, VB equations 2.20 and 2.18 can be expressed as the gradient method:

$$
\Delta\bar\theta = \bar\theta_{\mathrm{new}} - \bar\theta
= \langle\theta\rangle_{\alpha} - \bar\theta
= U^{-1}(\bar\theta)\,\frac{\partial F}{\partial \bar\theta}\bigl(X\{T\}, \bar\theta, \alpha, c\bigr),
\tag{2.32}
$$
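One point worth making explicit (a direct reading of equation 2.32, assuming $U(\bar\theta)$ is invertible, e.g., positive definite):

$$
\Delta\bar\theta = 0
\;\Longleftrightarrow\;
\frac{\partial F}{\partial \bar\theta}\bigl(X\{T\}, \bar\theta, \alpha, c\bigr) = 0,
$$

so fixed points of the VB E-step update are stationary points of the free energy with respect to $\bar\theta$.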
