
In a trained map, neurons spread in regions of the data space where a high sample density can be observed, resulting in large U-values at the borders between clusters. Consequently, the U-Matrix forms a 3D landscape on the lattice of neurons, with valleys corresponding to meaningful clusters and hills at the cluster borders. The U-Matrix of weight vectors can also be constructed for SOM-S. Based on this matrix, the sequence entries can be clustered into meaningful categories, from which the extraction of Markov models as described above is possible. Note that the U-Matrix is built using only the weights assigned to the neurons, while the context information of SOM-S is ignored.⁶ However, since context information is used for training, clusters emerge that are meaningful with respect to the temporal structure; in this way the context contributes implicitly to the topological ordering of the map and to the U-Matrix. Partially overlapping, noisy, and ambiguous input elements are separated during training, because their different temporal contexts contain enough information to produce characteristic clusters on the map. Thus, the temporal structure captured by the training allows a reliable reconstruction of the input sequences, which could not have been achieved by the standard SOM architecture.
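As a concrete illustration, the following minimal sketch computes a U-Matrix from the trained weight vectors alone. A rectangular 4-neighbor lattice and the names u_matrix and weights are assumptions made for illustration; the 6- and 7-neighbor lattices used below would only change the neighbor set.

import numpy as np

def u_matrix(weights):
    """U-Matrix of a trained map: for each neuron, the average distance
    between its weight vector and the weight vectors of its lattice
    neighbors. Large values mark cluster borders ("hills"), small
    values cluster interiors ("valleys").

    weights: array of shape (rows, cols, dim); a rectangular
    4-neighbor lattice is assumed here for simplicity.
    """
    rows, cols, _ = weights.shape
    u = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            dists = [np.linalg.norm(weights[r, c] - weights[rr, cc])
                     for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                     if 0 <= rr < rows and 0 <= cc < cols]
            u[r, c] = np.mean(dists)
    return u

Clustering the neurons into meaningful categories then amounts to cutting this landscape at the hills, e.g. by thresholding the U-values.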

5 Experiments

5.1 Mackey-Glass time series

The first task is to learn the dynamics of the real-valued chaotic Mackey-Glass time series

\[ \frac{dx}{d\tau} = b\,x(\tau) + \frac{a\,x(\tau-d)}{1 + x(\tau-d)^{10}} \]

using a = 0.2, b = −0.1, d = 17. This is the same setup as given in [41], making a comparison of the results possible.⁷ Three types

of maps with 100 neurons have been trained: a 6-neighbor map without context, i.e. the standard SOM; a map with 6 neighbors and with context (SOM-S); and a 7-neighbor map providing a hyperbolic grid with context utilization (H-SOM-S). Each run has been computed with 1.5·10⁵ presentations starting at random positions within the Mackey-Glass series, using a sample period of ∆t = 3; the neuron weights have been initialized with white noise within [0.6, 1.4]. The context has been taken into account by decreasing the weighting parameter η from 1 to 0.97. The learning rate is exponentially decreased from 0.1 to 0.005 for both the weight and the context update. The initial neighborhood cooperativity is 10, which is annealed to 1 during training.
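For concreteness, the following sketch generates the Mackey-Glass series by simple Euler integration of the delay differential equation above, subsampled with ∆t = 3. The integration step dt, the constant initial history, and the name mackey_glass are assumptions, as the text does not specify them.

import numpy as np

def mackey_glass(n_samples, a=0.2, b=-0.1, d=17.0, dt=0.1, sample_period=3.0):
    """Euler integration of dx/dtau = b*x(tau) + a*x(tau-d)/(1 + x(tau-d)**10),
    subsampled with the given sample period (Delta t = 3 in the text)."""
    delay = int(round(d / dt))              # delay d expressed in Euler steps
    steps = delay + int(n_samples * sample_period / dt)
    x = np.empty(steps)
    x[:delay] = 1.2                         # constant initial history (assumption)
    for t in range(delay, steps):
        x_d = x[t - delay]                  # delayed value x(tau - d)
        x[t] = x[t - 1] + dt * (b * x[t - 1] + a * x_d / (1.0 + x_d ** 10))
    stride = int(round(sample_period / dt))
    return x[delay::stride][:n_samples]

series = mackey_glass(10_000)               # e.g. source for random training windows

A training run would then draw its 1.5·10⁵ presentations from random start positions within such a series, annealing the learning rate, η, and the neighborhood range as described above.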

Figure 2 shows the temporal quantization error for the above setups; it is expressed by the average standard deviation between the given sequence and the mean unit receptive field for 29 time steps into the past. Similar

⁶ Preliminary experiments indicate that the context also orders topologically and yields meaningful clusters. However, the number of neurons in context clusters is small compared to the total number of neurons, and statistically significant results could not be obtained.

⁷ We would like to thank T. Voegtlin for providing data for comparison.

