
sequence $s$ is given, with $s_i$ denoting the current entry and $n_{j_0}$ denoting the best matching neuron for this time step. Then the weight correction term is

$$\Delta w_j = \epsilon \cdot h_\sigma(\mathrm{nhd}(n_{j_0}, n_j)) \cdot (s_i - w_j)$$
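To make the correction step concrete, the following is a minimal sketch, assuming a one-dimensional map so that $\mathrm{nhd}(n_{j_0}, n_j) = |j_0 - j|$, a Gaussian neighborhood $h_\sigma$, and that the winner index has already been determined from the leaky integrated TKM distance; the names `tkm_correction`, `eps`, and `sigma` are illustrative, not taken from the paper.

```python
import numpy as np

def tkm_correction(W: np.ndarray, s_i: np.ndarray, j0: int,
                   eps: float, sigma: float) -> np.ndarray:
    """Correction eps * h_sigma(nhd(n_j0, n_j)) * (s_i - w_j) for every neuron j.

    W  : (num_neurons, dim) codebook, s_i : (dim,) current sequence entry,
    j0 : index of the best matching neuron for this time step.
    """
    grid = np.arange(len(W))                               # 1-D lattice positions
    h = np.exp(-((grid - j0) ** 2) / (2.0 * sigma ** 2))   # Gaussian neighborhood
    return eps * h[:, None] * (s_i - W)                    # one correction per neuron

# usage: W += tkm_correction(W, s[i], j0, eps=0.1, sigma=2.0)
```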

As discussed in [23], the learning rule of TKM is unstable and leads to only suboptimal results. The more advanced Recurrent SOM (RSOM) uses leaky integration: it first sums up the weighted directions and afterwards computes the distance [39]

$$d_{\mathrm{RSOM}}(s, n_j) = \Bigl\| \sum_{i=1}^{t} \eta (1-\eta)^{i-1} (s_i - w_j) \Bigr\|^2 .$$
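For illustration, here is a minimal sketch of this distance, assuming a codebook `W` of shape `(num_neurons, dim)` and a sequence `s` of shape `(t, dim)`; the names `rsom_distance` and `eta` are illustrative and not taken from the paper.

```python
import numpy as np

def rsom_distance(s: np.ndarray, W: np.ndarray, eta: float) -> np.ndarray:
    """Squared norm of the leaky-integrated directions, one value per neuron."""
    coeffs = eta * (1.0 - eta) ** np.arange(len(s))   # eta * (1-eta)^(i-1)
    diffs = s[:, None, :] - W[None, :, :]             # (t, num_neurons, dim)
    y = np.einsum("i,ijd->jd", coeffs, diffs)         # integrated directions
    return np.sum(y ** 2, axis=1)                     # d_RSOM(s, n_j) for all j

# the winner is the neuron with the smallest integrated-direction norm, e.g.
# j0 = np.argmin(rsom_distance(s, W, eta=0.3))
```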

It represents the context in a larger space than TKM since the vectors of directions are stored instead of the scalar Euclidean distance. More importantly, the training rule is changed. RSOM derives its learning rule directly from the objective to minimize the distortion error on sequences and thus adapts the weights towards the vector of integrated directions:

$$\Delta w_j = \epsilon \cdot h_\sigma(\mathrm{nhd}(n_{j_0}, n_j)) \cdot y_j(t)$$

whereby

$$y_j(t) = \sum_{i=1}^{t} \eta (1-\eta)^{i-1} (s_i - w_j)\,.$$
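Putting the two formulas together, a minimal sketch of one RSOM adaptation step over the whole map might look as follows, again assuming a one-dimensional lattice and a Gaussian neighborhood; the names `rsom_update`, `eps`, and `sigma` are illustrative.

```python
import numpy as np

def rsom_update(W: np.ndarray, s: np.ndarray,
                eta: float, eps: float, sigma: float) -> np.ndarray:
    """Integrate directions y_j, pick the winner, adapt all weights towards y_j."""
    coeffs = eta * (1.0 - eta) ** np.arange(len(s))        # eta * (1-eta)^(i-1)
    y = np.einsum("i,ijd->jd", coeffs, s[:, None, :] - W[None, :, :])
    j0 = np.argmin(np.sum(y ** 2, axis=1))                 # best matching neuron
    grid = np.arange(len(W))
    h = np.exp(-((grid - j0) ** 2) / (2.0 * sigma ** 2))   # h_sigma(nhd(n_j0, n_j))
    return W + eps * h[:, None] * y                        # w_j += eps * h * y_j
```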

Again, the already processed part of the sequence produces a notion of context, and the neuron whose weight is most similar to the average entry over the past time steps becomes the winner for the current entry. The training rule of RSOM takes this fact into account by adapting the weights towards this averaged activation. We will not refer to this learning rule in the following. Instead, the way in which sequences are represented within these two models, and the ways to improve the representational capabilities of such maps, will be of interest.

Assuming vanishing neighborhood influence $\sigma$ for both TKM and RSOM, one can analytically compute the internal representation of sequences for these two models, i.e. the weights with response optimum to a given sequence $s = (s_1, \ldots, s_t)$: the weight $w$ is optimal for which

$$w = \sum_{i=1}^{t} (1-\eta)^{i-1} s_i \Bigm/ \sum_{i=1}^{t} (1-\eta)^{i-1}$$

holds [40]. This explains the encoding scheme of the winner-takes-all dynamics of TKM and RSOM.
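As a quick numeric sanity check of this formula: plugging the leaky average $w$ back into the integrated direction of the RSOM distance makes it vanish, so $w$ is indeed the response optimum. A minimal sketch, assuming NumPy; all names are illustrative.

```python
import numpy as np

eta = 0.3
s = np.random.randn(6, 2)                     # a sequence s_1, ..., s_t in R^2
coeffs = (1.0 - eta) ** np.arange(len(s))     # (1-eta)^(i-1)
w_opt = coeffs @ s / coeffs.sum()             # the analytic optimum weight

y = (eta * coeffs) @ (s - w_opt)              # integrated direction at w_opt
print(np.allclose(y, 0.0))                    # True: d_RSOM attains its minimum 0
```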

Sequences are encoded in the weight space by providing a

