Unsupervised Recursive Sequence Processing - Institute of ...


Training is carried out by presenting a pattern s = (s_1, ..., s_t), determining the winner n_{j0}, and updating the weight and the context. Adaptation affects all neurons on the breadth-first search graph around the winning neuron according to their grid distances, in a Hebbian style. Hence, for the sequence entry s_i, the weight w_j is updated by Δw_j = ε · h_σ(nhd(n_{j0}, n_j)) · (s_i − w_j). The learning rate ε is typically decreased exponentially during training; as above, h_σ(nhd(n_{j0}, n_j)) describes the influence of the winner n_{j0} on the current neuron n_j as a decreasing function of the grid distance. The context update is analogous: the current context, expressed in terms of neuron triangle corners and coordinates, is moved towards the previous winner along a shortest path. This adaptation yields positions on the grid only.
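The Hebbian-style weight update can be sketched as follows. This is an illustrative sketch only: the function names are ours, and the Gaussian choice for h_σ is an assumption (any decreasing function of grid distance fits the description above).

```python
import numpy as np

def som_update(weights, grid_dist, winner, s_i, eps, sigma):
    """One adaptation step of the weight update described in the text:
    every weight w_j moves towards the sequence entry s_i, scaled by the
    learning rate eps and a neighborhood function h_sigma of the grid
    distance nhd(n_j0, n_j) to the winner.

    weights:   (n_units, dim) array of weight vectors w_j
    grid_dist: (n_units, n_units) static matrix of grid distances
    Sketch only; a Gaussian h_sigma is assumed, not taken from the paper.
    """
    # h_sigma(nhd(n_j0, n_j)): decreasing in the grid distance to the winner
    h = np.exp(-grid_dist[winner] ** 2 / (2.0 * sigma ** 2))
    # Delta w_j = eps * h_sigma(nhd(n_j0, n_j)) * (s_i - w_j)
    weights += eps * h[:, None] * (s_i - weights)
    return weights
```

The winner itself (grid distance 0, so h = 1) receives the full step eps · (s_i − w_j); neighbors receive exponentially smaller steps with growing grid distance.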

Intermediate positions can be achieved by interpolation: if two neurons N_i and N_j exist in the triangle with the same distance, the midway point is taken for the flat grids obtained by our grid generator. This explains why the update path for the current context towards D_2, depicted as the dotted line in figure 1, passes via D_1. Since the grid distances are stored in a static matrix, shortest path lengths can be computed quickly. The parameter η in the recursive distance calculations controls the balance between pattern and context influence; since initially nothing is known about the temporal structure, this parameter starts at 1, indicating the absence of context and resulting in the standard SOM. During training it is decreased to an application-dependent value that mediates the balance between the externally presented pattern and the internally gained model of historic contexts.
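The role of η can be illustrated by a minimal sketch of the recursive distance used to rank neurons, assuming a convex combination of squared pattern and context mismatches (the precise combination is our assumption; only the blending role of η is stated above):

```python
import numpy as np

def recursive_distance(s_t, weights, ctx, contexts, eta):
    """Blend of pattern and context influence controlled by eta.

    s_t:      current sequence entry
    weights:  (n_units, dim) weight vectors (pattern part)
    ctx:      current context descriptor
    contexts: (n_units, dim_ctx) stored context vectors per neuron
    eta = 1 ignores the context entirely and recovers the standard SOM
    winner; sketch under assumptions, names are not from the paper.
    """
    d_pattern = np.sum((weights - s_t) ** 2, axis=1)   # pattern mismatch
    d_context = np.sum((contexts - ctx) ** 2, axis=1)  # context mismatch
    return eta * d_pattern + (1.0 - eta) * d_context
```

With η = 1 the winner is determined by the pattern alone; as η is lowered during training, neurons whose stored context matches the recent history gain an advantage even if their weight fits the current entry slightly worse.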

Thus, we can combine the flexibility of general triangular and possibly hyperbolic lattice structures with the efficient context representation proposed in [11].

4 Evaluation measures of SOM

Popular methods to evaluate the standard SOM are visual inspection, the identification of meaningful clusters, the quantization error, and measures for the topological ordering of the map. For recursive self-organizing maps, an additional dimension arises: the temporal dynamic stored in the context representations of the map.

4.1 Temporal quantization error<br />

Using ideas <strong>of</strong> Voegtlin [41] we introduce a method to assess the implicit representation<br />

<strong>of</strong> temporal dependencies in the map, and to evaluate to which amount<br />

faithful representation <strong>of</strong> the temporal data takes place. The general quantization<br />

error refers to the distortion <strong>of</strong> each map unit with respect to its receptive field,<br />

which measures the extent <strong>of</strong> data space coverage by the units. If temporal data are<br />

considered, the distortion needs to be assessed back in time. For a formal definition,<br />

assume that a time series (s 1 , s 2 , . . . , s t , . . .) is presented to the network, again<br />
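The formal definition is completed beyond this excerpt; the idea of a per-lag distortion can nevertheless be sketched. All details below (averaging within receptive fields, normalization) are our assumptions, reconstructed from the description above rather than taken from the paper:

```python
import numpy as np

def temporal_quantization_error(series, winners, n_units, max_lag):
    """Voegtlin-style temporal quantization error, sketched: for each lag,
    measure how much the entry 'lag' steps back varies within the receptive
    field of each winning unit, then average over all time steps.

    series:  list of scalar sequence entries s_1, s_2, ...
    winners: winners[t] = index of the winning unit at time step t
    Returns errors[lag] = distortion 'lag' steps back in time.
    Sketch under assumptions; not the paper's exact definition.
    """
    errors = []
    for lag in range(max_lag + 1):
        sq = []
        for j in range(n_units):
            # time steps mapped to unit j that reach back 'lag' entries
            idx = [t for t in range(lag, len(series)) if winners[t] == j]
            if not idx:
                continue
            past = np.array([series[t - lag] for t in idx])
            # squared deviation from the receptive field's mean at this lag
            sq.extend((past - past.mean()) ** 2)
        errors.append(float(np.mean(sq)))
    return errors
```

For lag 0 this reduces to an ordinary quantization-error-like distortion; growing error at larger lags indicates that the units do not separate sequences by their history.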

