24.12.2012 Views

Online Model Selection Based on the Variational Bayes



By interchanging the integration with respect to $\theta$ and $z$, one can get

$$
P(x \mid X\{T\}) = \int d\mu(z)\, \exp\left[ r_0(x, z) + \Phi(\hat{\alpha}(x, z), \gamma + 1) - \Phi(\alpha, \gamma) \right], \tag{2.37}
$$

$$
\hat{\alpha}(x, z) = \frac{\gamma \alpha + r(x, z)}{1 + \gamma}.
$$

For a finite $T$, this predictive distribution has a different functional form from the model distribution $P(x \mid \theta)$, equation 2.1.
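As a hedged illustration of this change in functional form, the sketch below evaluates equation 2.37 for a Poisson model with no hidden variables. The choices $r(x) = x$, $r_0(x) = -\log x!$, $\psi(\theta) = e^{\theta}$, and the flat measure $d\mu(\theta) = d\theta$ are assumptions made for this example only, not taken from the text. Under them, $\Phi(\alpha, \gamma) = \log \Gamma(\gamma\alpha) - \gamma\alpha \log \gamma$, and the predictive distribution is a negative binomial rather than a Poisson.

```python
import math


def phi(alpha, gamma):
    """Log-normalizer Phi(alpha, gamma) = log of the integral over theta of
    exp[gamma * (alpha * theta - exp(theta))], assuming d mu(theta) = d theta.

    Substituting lam = exp(theta) turns this into a Gamma integral:
    Gamma(gamma * alpha) / gamma ** (gamma * alpha).
    """
    return math.lgamma(gamma * alpha) - gamma * alpha * math.log(gamma)


def predictive(x, alpha, gamma):
    """Equation 2.37 without hidden variables for the assumed Poisson model."""
    r0 = -math.lgamma(x + 1)                         # r0(x) = -log x!
    alpha_hat = (gamma * alpha + x) / (1.0 + gamma)  # updated hyperparameter
    return math.exp(r0 + phi(alpha_hat, gamma + 1.0) - phi(alpha, gamma))


def negative_binomial(x, shape, p):
    """Negative binomial pmf with real-valued shape and success probability p."""
    log_pmf = (math.lgamma(shape + x) - math.lgamma(x + 1) - math.lgamma(shape)
               + shape * math.log(p) + x * math.log1p(-p))
    return math.exp(log_pmf)


if __name__ == "__main__":
    alpha, gamma = 2.5, 4.0  # illustrative hyperparameter values
    for x in range(5):
        print(x,
              predictive(x, alpha, gamma),
              negative_binomial(x, gamma * alpha, gamma / (gamma + 1.0)))
```

The two printed columns should agree to rounding error. The shape $\gamma\alpha$ and success probability $\gamma / (\gamma + 1)$ follow directly from the closed form of $\Phi$ assumed above.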

2.7 Large Sample Limit. When the amount of observed data becomes large ($T \gg 1$: $\gamma \gg 1$), the solution of the VB algorithm becomes the ML estimator (Attias, 1999). In this limit, the integration over the parameters with respect to the posterior parameter distribution can be approximated by using a stationary point approximation:

$$
\exp\left[\Phi(\alpha, \gamma)\right] = \int d\mu(\theta)\, \exp\left[\gamma\left(\alpha \cdot \theta - \psi(\theta)\right)\right]
\simeq \exp\left[ \gamma\left(\alpha \cdot \hat{\theta} - \psi(\hat{\theta})\right) - \frac{1}{2} \log \left| \gamma \frac{\partial^2 \psi}{\partial\theta\,\partial\theta}(\hat{\theta}) \right| + O(1/\gamma) \right], \tag{2.38}
$$

where $\hat{\theta}$ is the maximum of the exponent $\left(\alpha \cdot \theta - \psi(\theta)\right)$, that is,

$$
\frac{\partial \psi}{\partial \theta}(\hat{\theta}) = \alpha. \tag{2.39}
$$

Therefore, $\Phi$ can be approximated as

$$
\Phi(\alpha, \gamma) \simeq \gamma\left(\alpha \cdot \hat{\theta} - \psi(\hat{\theta})\right) - \frac{1}{2} \log \left| \gamma \frac{\partial^2 \psi}{\partial\theta\,\partial\theta}(\hat{\theta}) \right| + O(1/\gamma). \tag{2.40}
$$
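The quality of this approximation can be probed numerically in a case where $\Phi$ is known in closed form. The sketch below assumes a one-dimensional Poisson-type model with $\psi(\theta) = e^{\theta}$ and flat measure $d\mu(\theta) = d\theta$, neither of which is taken from the text, so that $\Phi(\alpha, \gamma) = \log \Gamma(\gamma\alpha) - \gamma\alpha \log \gamma$ exactly. In this particular example, the gap between the exact value and the two leading terms of the stationary point expression tends to the constant $\frac{1}{2}\log 2\pi$, which does not depend on $\alpha$, and the remainder shrinks like $1/\gamma$.

```python
import math


def phi_exact(alpha, gamma):
    """Exact Phi for psi(theta) = exp(theta), assuming d mu(theta) = d theta."""
    return math.lgamma(gamma * alpha) - gamma * alpha * math.log(gamma)


def phi_stationary(alpha, gamma):
    """Leading terms of the stationary point approximation for the same model."""
    theta_hat = math.log(alpha)    # solves d psi / d theta = exp(theta) = alpha
    psi_hat = math.exp(theta_hat)  # psi evaluated at theta_hat
    hessian = math.exp(theta_hat)  # second derivative of psi at theta_hat
    return gamma * (alpha * theta_hat - psi_hat) - 0.5 * math.log(gamma * hessian)


def residual(alpha, gamma):
    """Gap between exact and approximate Phi, minus the constant 0.5 * log(2 pi)."""
    gap = phi_exact(alpha, gamma) - phi_stationary(alpha, gamma)
    return gap - 0.5 * math.log(2.0 * math.pi)


if __name__ == "__main__":
    alpha = 2.5  # illustrative value
    for gamma in (1.0, 10.0, 100.0, 1000.0):
        print(gamma, residual(alpha, gamma))
```

Because the constant offset is independent of $\alpha$, it drops out of any derivative of $\Phi$ with respect to $\alpha$.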

Consequently, the ensemble average of the parameter $\bar{\theta}$ can be approximated as

$$
\bar{\theta} = \frac{1}{\gamma} \frac{\partial \Phi}{\partial \alpha}(\alpha, \gamma)
\simeq \frac{1}{\gamma} \frac{\partial}{\partial \alpha} \left( \gamma \left( \alpha \cdot \hat{\theta} - \psi(\hat{\theta}) \right) \right) = \hat{\theta}. \tag{2.41}
$$
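A small numerical check of equation 2.41 is sketched below under the same kind of assumption: a one-dimensional model with $\psi(\theta) = e^{\theta}$ and flat measure $d\mu(\theta) = d\theta$ (chosen for this example, not stated in the text), for which $\Phi(\alpha, \gamma) = \log \Gamma(\gamma\alpha) - \gamma\alpha \log \gamma$ and $\hat{\theta} = \log \alpha$. The derivative of $\Phi$ is taken by a central finite difference, and the function name `theta_bar` is hypothetical.

```python
import math


def phi(alpha, gamma):
    """Exact Phi for psi(theta) = exp(theta), assuming d mu(theta) = d theta."""
    return math.lgamma(gamma * alpha) - gamma * alpha * math.log(gamma)


def theta_bar(alpha, gamma, step=1e-4):
    """Ensemble average (1 / gamma) * dPhi/dalpha via a central finite difference."""
    d_phi = phi(alpha + step, gamma) - phi(alpha - step, gamma)
    return d_phi / (2.0 * step * gamma)


if __name__ == "__main__":
    alpha = 2.5                  # illustrative value
    theta_hat = math.log(alpha)  # stationary point from equation 2.39
    for gamma in (1.0, 10.0, 100.0, 1000.0):
        print(gamma, theta_bar(alpha, gamma) - theta_hat)
```

The printed difference between $\bar{\theta}$ and $\hat{\theta}$ should shrink roughly in proportion to $1/\gamma$, in line with the large sample limit.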

The relations 2.39 and 2.41 imply that the posterior hyperparameter $\alpha$ is equal to the expectation parameter of the EFH model, $\phi$ (see equation 2.3) in this limit. Furthermore, equations 2.18, 2.39, and 2.41 are equivalent to the
