24.12.2012 Views

Online Model Selection Based on the Variational Bayes



$$
{} + \int d\mu(\theta)\, Q_\theta(\theta)\, \log\frac{P_0(\theta)}{Q_\theta(\theta)}. \tag{3.2}
$$

The ratio $(\gamma_0 / T)$ determines the relative reliability between the observed data and the prior belief for the parameter distribution. The expected free energy, equation 3.2, can be estimated by

$$
F(X\{\tau\}, Q_z\{\tau\}, Q_\theta, T) = \frac{T}{\tau} \sum_{t=1}^{\tau} \int d\mu(\theta)\, Q_\theta(\theta) \int d\mu(z(t))\, Q_z(z(t))\, \log\frac{P(x(t), z(t) \mid \theta)}{Q_z(z(t))} + \int d\mu(\theta)\, Q_\theta(\theta)\, \log\frac{P_0(\theta)}{Q_\theta(\theta)}, \tag{3.3}
$$

where $Q_z\{\tau\} = \{Q_z(z(t)) \mid t = 1, \ldots, \tau\}$. Note that $\tau$ represents the actual amount of observed data, and it increases over time while $T$ is fixed. The estimation of the posterior distribution $Q_z(z(t))$ is inaccurate in the early stage of the online learning and gradually becomes accurate as learning proceeds. However, the early inaccurate estimations and the later accurate estimations contribute to the free energy (see equation 3.3) in equal weight. This might cause slow convergence of the learning process. Therefore, we introduce a time-dependent discount factor $\lambda(t)$ ($0 \le \lambda(t) \le 1$, $t = 2, 3, \ldots$) for forgetting the earlier inaccurate estimation effects. Accordingly, a discounted free energy is defined by

$$
F^{\lambda}(X\{\tau\}, Q_z\{\tau\}, Q_\theta, T) = T\,\eta(\tau) \sum_{t=1}^{\tau} \left( \prod_{s=t+1}^{\tau} \lambda(s) \right) \int d\mu(\theta)\, Q_\theta(\theta) \int d\mu(z(t))\, Q_z(z(t))\, \log\frac{P(x(t), z(t) \mid \theta)}{Q_z(z(t))} + \int d\mu(\theta)\, Q_\theta(\theta)\, \log\frac{P_0(\theta)}{Q_\theta(\theta)}, \tag{3.4}
$$

where $\eta(\tau)$ represents a normalization constant:

$$
\eta(\tau) = \left[ \sum_{t=1}^{\tau} \left( \prod_{s=t+1}^{\tau} \lambda(s) \right) \right]^{-1}. \tag{3.5}
$$
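The discount weights and the normalization constant of equation 3.5 can be illustrated numerically. The Python sketch below is not taken from the paper; the function names and the list-based indexing of the discount factor are assumptions made for illustration. It computes the weight of each past sample as the product of the discount factors from $s = t+1$ to $\tau$, evaluates the normalization constant directly from equation 3.5, and compares it with the one-step recursion $\eta(\tau) = [1 + \lambda(\tau)/\eta(\tau-1)]^{-1}$, $\eta(1) = 1$, which follows from the definition because the sum over $t$ satisfies $S(\tau) = 1 + \lambda(\tau) S(\tau-1)$. The per-sample values `ell` stand in for the expected log-ratio terms of equation 3.4 and are hypothetical placeholders.

```python
from typing import List, Sequence


def discount_weights(lam: Sequence[float], tau: int) -> List[float]:
    """Unnormalized weights w_t = prod_{s=t+1}^{tau} lam[s] for t = 1..tau.

    Assumed convention: lam[s] holds lambda(s). Entries lam[0] and lam[1]
    are unused because the discount factor is defined only for t >= 2.
    The returned list stores w_t at index t - 1.
    """
    weights = [1.0] * tau  # the empty product for t = tau equals 1
    for t in range(tau - 1, 0, -1):
        # w_t = lambda(t + 1) * w_{t+1}
        weights[t - 1] = lam[t + 1] * weights[t]
    return weights


def eta_direct(lam: Sequence[float], tau: int) -> float:
    """Normalization constant evaluated directly from equation 3.5."""
    return 1.0 / sum(discount_weights(lam, tau))


def eta_recursive(lam: Sequence[float], tau: int) -> float:
    """Same constant via eta(t) = 1 / (1 + lambda(t) / eta(t - 1))."""
    eta = 1.0  # eta(1) = 1: a single sample with an empty product
    for t in range(2, tau + 1):
        eta = 1.0 / (1.0 + lam[t] / eta)
    return eta


def discounted_data_term(ell: Sequence[float], lam: Sequence[float], T: float) -> float:
    """Data term of equation 3.4: T * eta(tau) * sum_t w_t * ell_t.

    ell[t - 1] is a placeholder for the expected log-ratio of sample t.
    """
    tau = len(ell)
    weights = discount_weights(lam, tau)
    eta = 1.0 / sum(weights)
    return T * eta * sum(w * l for w, l in zip(weights, ell))


if __name__ == "__main__":
    lam_one = [0.0, 0.0] + [1.0] * 9   # lambda(t) = 1: no forgetting
    lam_c = [0.0, 0.0] + [0.9] * 4     # constant lambda(t) = 0.9
    print(eta_direct(lam_one, 10), eta_recursive(lam_one, 10))
    print(discount_weights(lam_c, 3))
    print(eta_direct(lam_c, 5), eta_recursive(lam_c, 5))
```

With the discount factor fixed at 1, every weight equals 1 and the normalization constant becomes $1/\tau$, so the discounted free energy of equation 3.4 reduces to the estimate in equation 3.3. With a discount factor below 1, older samples receive geometrically smaller weights, which is the forgetting effect described in the text.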

3.2 Online Variational Bayes Algorithm. The online VB algorithm can be derived from the successive maximization of the discounted free energy (see equation 3.4). Let us assume that $Q_z\{\tau - 1\} = \{Q_z(z(t)) \mid t = 1, \ldots, \tau - 1\}$ and $Q_\theta^{(\tau - 1)}(\theta)$ have been determined for an observed data set $X\{\tau - 1\} =$
