Unsupervised Recursive Sequence Processing - Institute of ...
sequence $s$ is given, with $s_i$ denoting the current entry and $n_{j_0}$ denoting the best matching neuron for this time step. Then the weight correction term is
\[
\triangle w_j = \epsilon \cdot h_\sigma(\mathrm{nhd}(n_{j_0}, n_j)) \cdot (s_i - w_j) .
\]
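This update can be sketched in code. The following is a minimal, illustrative implementation that assumes a Gaussian neighborhood function $h_\sigma$ and a fixed grid of neuron positions; all function and variable names are ours, not from the original:

```python
import numpy as np

def tkm_update(W, s_i, winner, positions, eps, sigma):
    """One TKM weight update: every neuron j moves towards the
    current entry s_i, scaled by a Gaussian neighborhood h_sigma
    around the winner n_j0 (illustrative sketch, not the paper's code)."""
    # nhd: grid distance between each neuron and the winner
    nhd = np.linalg.norm(positions - positions[winner], axis=1)
    h = np.exp(-nhd ** 2 / (2 * sigma ** 2))      # h_sigma(nhd(n_j0, n_j))
    return W + eps * h[:, None] * (s_i - W)       # w_j + triangle(w_j)
```

With a very small $\sigma$ only the winner adapts; with a large $\sigma$ the whole map moves towards $s_i$, matching the usual SOM neighborhood cooperation.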
As discussed in [23], the learning rule of TKM is unstable and leads only to suboptimal results. A more advanced model, the Recurrent SOM (RSOM), uses leaky integration: it first sums up the weighted directions and afterwards computes the distance [39]
\[
d_{\mathrm{RSOM}}(s, n_j) = \Bigl\| \sum_{i=1}^{t} \eta (1-\eta)^{i-1} (s_i - w_j) \Bigr\|^2 .
\]
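The leaky integrated distance can be written out directly as a sum over the already processed entries. The following sketch assumes the unrolled form of the formula above; the function and parameter names are illustrative:

```python
import numpy as np

def rsom_distance(s, w, eta):
    """Leaky integrated RSOM distance: the eta-weighted directions
    (s_i - w) are summed first, and only then is the squared norm
    of the integrated vector taken (direct, unrolled form)."""
    y = sum(eta * (1.0 - eta) ** (i - 1) * (s_i - w)
            for i, s_i in enumerate(s, start=1))
    return float(np.dot(y, y))
```

Note that summing the directions before taking the norm is exactly what distinguishes RSOM from TKM, which integrates scalar distances instead.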
It represents the context in a larger space than TKM, since the vectors of directions are stored instead of the scalar Euclidean distance. More importantly, the training rule is changed: RSOM derives its learning rule directly from the objective of minimizing the distortion error on sequences and thus adapts the weights towards the vector of integrated directions
\[
\triangle w_j = \epsilon \cdot h_\sigma(\mathrm{nhd}(n_{j_0}, n_j)) \cdot y_j(t),
\]
where
\[
y_j(t) = \sum_{i=1}^{t} \eta (1-\eta)^{i-1} (s_i - w_j) .
\]
Again, the already processed part of the sequence produces a notion of context, and the neuron whose weight is most similar to the average of the past entries becomes the winner for the current entry. The training rule of RSOM takes this fact into account by adapting the weights towards this averaged activation. We will not refer to this learning rule in the following. Instead, the way in which sequences are represented within these two models, and the ways to improve the representational capabilities of such maps, will be of interest.
Assuming a vanishing neighborhood influence $\sigma$ for both TKM and RSOM, one can analytically compute the internal representation of sequences for these two models, i.e. the weights with response optimum to a given sequence $s = (s_1, \ldots, s_t)$: that weight $w$ is optimal for which
\[
w = \sum_{i=1}^{t} (1-\eta)^{i-1} s_i \Big/ \sum_{i=1}^{t} (1-\eta)^{i-1}
\]
holds [40]. This explains the encoding scheme of the winner-takes-all dynamics of TKM and RSOM. Sequences are encoded in the weight space by providing a
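The closed-form optimal weight can be checked numerically: it is the $(1-\eta)$-discounted average of the sequence entries, and at this weight the integrated direction vector of the RSOM distance vanishes. A minimal sketch, with illustrative names:

```python
import numpy as np

def optimal_weight(s, eta):
    """Weight with response optimum for sequence s: the
    (1 - eta)-discounted average of the entries, following
    the closed-form expression derived above."""
    s = np.asarray(s, dtype=float)
    coeffs = (1.0 - eta) ** np.arange(len(s))   # (1-eta)^(i-1), i = 1..t
    return coeffs @ s / coeffs.sum()
```

By construction, $\sum_i \eta(1-\eta)^{i-1}(s_i - w) = 0$ at this $w$, so the integrated direction, and hence the RSOM distance, is minimal there.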