Online Model Selection Based on the Variational Bayes
$$+ \int d\mu(\theta)\, Q_\theta(\theta) \log\bigl(P_0(\theta)/Q_\theta(\theta)\bigr). \qquad (3.2)$$
The ratio ($\gamma_0 / T$) determines the relative reliability between the observed
data and the prior belief for the parameter distribution. The expected free
energy, equation 3.2, can be estimated by
$$F(X\{\tau\}, Q_z\{\tau\}, Q_\theta, T) = \frac{T}{\tau} \sum_{t=1}^{\tau} \int d\mu(\theta)\, Q_\theta(\theta) \int d\mu(z(t))\, Q_z(z(t)) \log\bigl(P(x(t), z(t) \mid \theta)/Q_z(z(t))\bigr) + \int d\mu(\theta)\, Q_\theta(\theta) \log\bigl(P_0(\theta)/Q_\theta(\theta)\bigr), \qquad (3.3)$$
where $Q_z\{\tau\} = \{Q_z(z(t)) \mid t = 1, \ldots, \tau\}$. Note that $\tau$ represents the actual
amount of observed data, and it increases over time while $T$ is fixed. The estimation
of the posterior distribution $Q_z(z(t))$ is inaccurate in the early stage
of the online learning and gradually becomes accurate as learning proceeds.
However, the early inaccurate estimations and the later accurate estimations
contribute to the free energy (see equation 3.3) with equal weight. This might
cause slow convergence of the learning process. Therefore, we introduce a
time-dependent discount factor $\lambda(t)$ ($0 \le \lambda(t) \le 1$, $t = 2, 3, \ldots$) for forgetting
the earlier inaccurate estimation effects. Accordingly, a discounted free
energy is defined by
$$F_\lambda(X\{\tau\}, Q_z\{\tau\}, Q_\theta, T) = T \eta(\tau) \sum_{t=1}^{\tau} \Bigl( \prod_{s=t+1}^{\tau} \lambda(s) \Bigr) \int d\mu(\theta)\, Q_\theta(\theta) \int d\mu(z(t))\, Q_z(z(t)) \log\bigl(P(x(t), z(t) \mid \theta)/Q_z(z(t))\bigr) + \int d\mu(\theta)\, Q_\theta(\theta) \log\bigl(P_0(\theta)/Q_\theta(\theta)\bigr), \qquad (3.4)$$
where $\eta(\tau)$ represents a normalization constant:
$$\eta(\tau) = \Bigl[ \sum_{t=1}^{\tau} \Bigl( \prod_{s=t+1}^{\tau} \lambda(s) \Bigr) \Bigr]^{-1}. \qquad (3.5)$$
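The discount weights in equation 3.4 and the normalization constant in equation 3.5 can be checked numerically. The following is a minimal sketch, not code from the paper: it assumes the discount factor $\lambda$ is supplied as a Python callable, and the function names `discount_weights`, `eta_direct`, and `eta_recursive` are hypothetical. The recursive form follows from factoring $\lambda(\tau)$ out of the sum in equation 3.5, which gives $1/\eta(\tau) = 1 + \lambda(\tau)/\eta(\tau-1)$ with $\eta(1) = 1$.

```python
from typing import Callable, List


def discount_weights(lam: Callable[[int], float], tau: int) -> List[float]:
    """Return w[t-1] = prod_{s=t+1}^{tau} lam(s) for t = 1..tau (weights in eq. 3.4)."""
    weights = [1.0] * tau  # the newest datum (t = tau) has an empty product, i.e. 1
    # Walk backwards: the weight for time t is lam(t+1) times the weight for t+1.
    for t in range(tau - 1, 0, -1):
        weights[t - 1] = lam(t + 1) * weights[t]
    return weights


def eta_direct(lam: Callable[[int], float], tau: int) -> float:
    """Normalization constant of eq. 3.5, computed directly from its definition."""
    return 1.0 / sum(discount_weights(lam, tau))


def eta_recursive(lam: Callable[[int], float], tau: int) -> float:
    """Same constant via 1/eta(tau) = 1 + lam(tau)/eta(tau-1), with eta(1) = 1."""
    inv_eta = 1.0
    for t in range(2, tau + 1):
        inv_eta = 1.0 + lam(t) * inv_eta
    return 1.0 / inv_eta


if __name__ == "__main__":
    no_discount = lambda s: 1.0   # assumed example: no forgetting
    mild_discount = lambda s: 0.9  # assumed example: constant discount factor
    print(eta_direct(no_discount, 5))           # equals 1/tau when lam is identically 1
    print(discount_weights(mild_discount, 3))   # older data receive smaller weights
    print(eta_recursive(mild_discount, 50))
```

With $\lambda(s) = 1$ for all $s$, every weight is 1 and $\eta(\tau) = 1/\tau$, so the prefactor $T\eta(\tau)$ in equation 3.4 reduces to the $T/\tau$ of equation 3.3. With $\lambda(s) < 1$, earlier terms are geometrically down-weighted, which is the forgetting effect described above.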
3.2 Online Variational Bayes Algorithm. The online VB algorithm can
be derived from the successive maximization of the discounted free energy
(see equation 3.4). Let us assume that $Q_z\{\tau-1\} = \{Q_z(z(t)) \mid t = 1, \ldots, \tau-1\}$
and $Q_\theta^{(\tau-1)}(\theta)$ have been determined for an observed data set $X\{\tau-1\} =$