
Online Model Selection Based on the Variational Bayes



This method is used in a model selection task for dynamic environments in section 5.

In the next section, we study the model selection problem for mixture of gaussian models. As a mechanism for structural change, we adopt the split-and-merge method proposed by Ueda et al. (1999) (see also Richardson & Green, 1997; Ghahramani & Beal, 2000; Ueda, 1999). For mixture models, the split-and-merge method provides a simple procedure for structural changes. We choose either to split a unit into two or to merge two units into one in the sequential model selection process. In the current implementation, the same process is applied if the previous attempt was successful. Otherwise, the other process is applied.
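The alternation rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `next_operation` and the string labels `"split"` and `"merge"` are hypothetical, and what counts as a "successful" attempt is left to the caller.

```python
def next_operation(previous_op: str, previous_succeeded: bool) -> str:
    """Choose the next structural change to attempt.

    Repeats the previous operation if it succeeded, otherwise switches
    to the other one. The labels "split" and "merge" are illustrative.
    """
    if previous_op not in ("split", "merge"):
        raise ValueError("previous_op must be 'split' or 'merge'")
    if previous_succeeded:
        return previous_op
    return "merge" if previous_op == "split" else "split"


# Example: replay a hypothetical sequence of attempt outcomes.
op = "split"
history = [op]
for succeeded in (True, False, False, True):
    op = next_operation(op, succeeded)
    history.append(op)
```

With the outcomes above, the sequence of attempted operations is split, split, merge, split, split.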

A criterion for splitting a unit is given by the unit's free energy, which is assigned to each unit (see appendix C). The split is applied to the unit with the lowest free energy among unattempted units. A criterion for merging units is given by the correlation between the two units' activities, which are represented by the posterior probability that the units will be selected for given data. The unit pair with the highest correlation among unattempted unit pairs is selected for merging. The deletion of units is also performed for units with very small activities, which indicate that the units have not been selected at all.
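The three selection criteria above can be illustrated as follows. This is a sketch under stated assumptions: the per-unit free energies are taken as given (their definition is in appendix C), the correlation is computed as a Pearson correlation over data points (the text does not specify the exact form), and the deletion threshold `min_activity` is a hypothetical parameter. All function names are illustrative.

```python
import math
from itertools import combinations


def pick_split_candidate(free_energy, attempted):
    """Index of the unattempted unit with the lowest free energy, or None."""
    candidates = [k for k in range(len(free_energy)) if k not in attempted]
    if not candidates:
        return None
    return min(candidates, key=lambda k: free_energy[k])


def correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def pick_merge_pair(resp, attempted_pairs):
    """Unattempted pair (i, j), i < j, with the highest activity correlation.

    resp[t][k] is the posterior probability of unit k for data point t.
    """
    n_units = len(resp[0])
    activity = [[row[k] for row in resp] for k in range(n_units)]
    pairs = [p for p in combinations(range(n_units), 2)
             if p not in attempted_pairs]
    if not pairs:
        return None
    return max(pairs, key=lambda p: correlation(activity[p[0]], activity[p[1]]))


def units_to_delete(resp, min_activity=1e-3):
    """Units whose mean activity is below min_activity (assumed threshold)."""
    n_points, n_units = len(resp), len(resp[0])
    return [k for k in range(n_units)
            if sum(row[k] for row in resp) / n_points < min_activity]


# Toy posterior probabilities for 4 data points and 4 units.
resp = [
    [0.50, 0.45, 0.05, 0.0],
    [0.45, 0.50, 0.05, 0.0],
    [0.05, 0.05, 0.90, 0.0],
    [0.10, 0.05, 0.85, 0.0],
]
split_unit = pick_split_candidate([-3.0, -7.5, -1.2, -9.0], attempted={3})
merge_pair = pick_merge_pair(resp, attempted_pairs=set())
dead_units = units_to_delete(resp)
```

In the toy data, units 0 and 1 respond to the same points and are chosen for merging, while unit 3 is never selected and is flagged for deletion.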

We adopted the above model selection procedure because of its simplicity. Other model selection procedures using the split-and-merge algorithm have also been proposed (Ghahramani & Beal, 2000; Ueda, 1999).

By combining the sequential model selection procedure with the online VB learning method, a fully online learning method with a model selection mechanism is obtained, and it can be applied to real-time applications.

5 Experiments

As a preliminary study on the performance of the online VB method, we considered model selection problems for two-dimensional mixture of gaussian (MG) models (see appendix C). We borrowed two tasks from Roberts et al. (1998). Data set A, consisting of 200 points, was generated from a mixture of four gaussians with the centers $(0, 0)$, $(2, \sqrt{12})$, $(4, 0)$, and $(-2, -\sqrt{12})$ (see Figure 1A). The gaussians had the same isotropic variance $\sigma^2 = (1.2)^2$. In addition, data set B, consisting of 1000 points, was generated from a mixture of four gaussians (see Figure 1B). In this case, they were paired such that each pair had a common center, $\mu_1 = \mu_2 = (2, \sqrt{12})$ and $\mu_3 = \mu_4 = (-2, -\sqrt{12})$, but different variances, $\sigma_1^2 = \sigma_3^2 = (1.0)^2$ and $\sigma_2^2 = \sigma_4^2 = (5.0)^2$. Although these models were simple, the model selection tasks for them were rather difficult because of the overlap between the gaussians (Roberts et al., 1998).
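Data sets with these centers and variances could be generated along the following lines. This is a sketch, not the procedure used in the experiments: the mixing proportions are not stated in the text, so equal weights are assumed here, and the function name `sample_isotropic_mixture` and the random seed are illustrative.

```python
import math
import random


def sample_isotropic_mixture(n_points, centers, std_devs, rng):
    """Draw points from a mixture of isotropic 2-D gaussians.

    Assumes equal mixing proportions (not stated in the text).
    centers[k] is the (x, y) center and std_devs[k] the standard
    deviation of component k.
    """
    points = []
    for _ in range(n_points):
        k = rng.randrange(len(centers))  # equal-weight component choice
        cx, cy = centers[k]
        s = std_devs[k]
        points.append((rng.gauss(cx, s), rng.gauss(cy, s)))
    return points


ROOT12 = math.sqrt(12.0)

# Data set A: four components, common variance (1.2)^2.
CENTERS_A = [(0.0, 0.0), (2.0, ROOT12), (4.0, 0.0), (-2.0, -ROOT12)]
STD_A = [1.2, 1.2, 1.2, 1.2]

# Data set B: two pairs with a shared center and variances (1.0)^2, (5.0)^2.
CENTERS_B = [(2.0, ROOT12), (2.0, ROOT12), (-2.0, -ROOT12), (-2.0, -ROOT12)]
STD_B = [1.0, 5.0, 1.0, 5.0]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
data_a = sample_isotropic_mixture(200, CENTERS_A, STD_A, rng)
data_b = sample_isotropic_mixture(1000, CENTERS_B, STD_B, rng)
```

Note that the standard deviations, not the variances, are passed to `gauss`, so the value 5.0 corresponds to the variance $(5.0)^2$.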

In the first experiment, we examined the usual Bayes model selection procedure. A set of models consisting of different numbers of units was
