Unsupervised Recursive Sequence Processing - Institute of ...


Training is carried out by presenting a pattern s = (s_1, ..., s_t), determining the winner n_{j0}, and updating the weight and the context. Adaptation affects all neurons on the breadth-first search graph around the winning neuron according to their grid distances, in a Hebbian style. Hence, for the sequence entry s_i, the weight w_j is updated by Δw_j = ε · h_σ(nhd(n_{j0}, n_j)) · (s_i − w_j). The learning rate ε is typically decreased exponentially during training; as above, h_σ(nhd(n_{j0}, n_j)) describes the influence of the winner n_{j0} on the current neuron n_j as a decreasing function of the grid distance. The context update is analogous: the current context, expressed in terms of neuron triangle corners and coordinates, is moved towards the previous winner along a shortest path. This adaptation yields positions on the grid only.
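The Hebbian-style weight update can be sketched as follows. This is an illustrative sketch only: the function names are ours, and the Gaussian choice for h_σ is an assumption (any decreasing function of grid distance fits the description above).

```python
import numpy as np

def som_update(weights, grid_dist, winner, s_i, eps, sigma):
    """One adaptation step of the weight update described in the text:
    every weight w_j moves towards the sequence entry s_i, scaled by the
    learning rate eps and a neighborhood function h_sigma of the grid
    distance nhd(n_j0, n_j) to the winner.

    weights:   (n_units, dim) array of weight vectors w_j
    grid_dist: (n_units, n_units) static matrix of grid distances
    Sketch only; a Gaussian h_sigma is assumed, not taken from the paper.
    """
    # h_sigma(nhd(n_j0, n_j)): decreasing in the grid distance to the winner
    h = np.exp(-grid_dist[winner] ** 2 / (2.0 * sigma ** 2))
    # Delta w_j = eps * h_sigma(nhd(n_j0, n_j)) * (s_i - w_j)
    weights += eps * h[:, None] * (s_i - weights)
    return weights
```

The winner itself (grid distance 0, so h = 1) receives the full step eps · (s_i − w_j); neighbors receive exponentially smaller steps with growing grid distance.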

Intermediate positions can be achieved by interpolation: if two neurons N_i and N_j exist in the triangle with the same distance, the midway point is taken for the flat grids obtained by our grid generator. This explains why the update path for the current context towards D_2, depicted as the dotted line in figure 1, passes via D_1. Since the grid distances are stored in a static matrix, shortest path lengths can be computed quickly. The parameter η in the recursive distance calculations controls the balance between pattern and context influence; since initially nothing is known about the temporal structure, this parameter starts at 1, indicating the absence of context and resulting in the standard SOM. During training it is decreased to an application-dependent value that mediates the balance between the externally presented pattern and the internally gained model of historic contexts.
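The role of η can be illustrated by a minimal sketch of the recursive distance used to rank neurons, assuming a convex combination of squared pattern and context mismatches (the precise combination is our assumption; only the blending role of η is stated above):

```python
import numpy as np

def recursive_distance(s_t, weights, ctx, contexts, eta):
    """Blend of pattern and context influence controlled by eta.

    s_t:      current sequence entry
    weights:  (n_units, dim) weight vectors (pattern part)
    ctx:      current context descriptor
    contexts: (n_units, dim_ctx) stored context vectors per neuron
    eta = 1 ignores the context entirely and recovers the standard SOM
    winner; sketch under assumptions, names are not from the paper.
    """
    d_pattern = np.sum((weights - s_t) ** 2, axis=1)   # pattern mismatch
    d_context = np.sum((contexts - ctx) ** 2, axis=1)  # context mismatch
    return eta * d_pattern + (1.0 - eta) * d_context
```

With η = 1 the winner is determined by the pattern alone; as η is lowered during training, neurons whose stored context matches the recent history gain an advantage even if their weight fits the current entry slightly worse.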

Thus, we can combine the flexibility of general triangular and possibly hyperbolic lattice structures with the efficient context representation proposed in [11].

4 Evaluation measures of SOM

Popular methods to evaluate the standard SOM are visual inspection, the identification of meaningful clusters, the quantization error, and measures for the topological ordering of the map. For recursive self-organizing maps, an additional dimension arises: the temporal dynamic stored in the context representations of the map.

4.1 Temporal quantization error<br />

Using ideas <strong>of</strong> Voegtlin [41] we introduce a method to assess the implicit representation<br />

<strong>of</strong> temporal dependencies in the map, and to evaluate to which amount<br />

faithful representation <strong>of</strong> the temporal data takes place. The general quantization<br />

error refers to the distortion <strong>of</strong> each map unit with respect to its receptive field,<br />

which measures the extent <strong>of</strong> data space coverage by the units. If temporal data are<br />

considered, the distortion needs to be assessed back in time. For a formal definition,<br />

assume that a time series (s 1 , s 2 , . . . , s t , . . .) is presented to the network, again<br />
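The formal definition is completed beyond this excerpt; the idea of a per-lag distortion can nevertheless be sketched. All details below (averaging within receptive fields, normalization) are our assumptions, reconstructed from the description above rather than taken from the paper:

```python
import numpy as np

def temporal_quantization_error(series, winners, n_units, max_lag):
    """Voegtlin-style temporal quantization error, sketched: for each lag,
    measure how much the entry 'lag' steps back varies within the receptive
    field of each winning unit, then average over all time steps.

    series:  list of scalar sequence entries s_1, s_2, ...
    winners: winners[t] = index of the winning unit at time step t
    Returns errors[lag] = distortion 'lag' steps back in time.
    Sketch under assumptions; not the paper's exact definition.
    """
    errors = []
    for lag in range(max_lag + 1):
        sq = []
        for j in range(n_units):
            # time steps mapped to unit j that reach back 'lag' entries
            idx = [t for t in range(lag, len(series)) if winners[t] == j]
            if not idx:
                continue
            past = np.array([series[t - lag] for t in idx])
            # squared deviation from the receptive field's mean at this lag
            sq.extend((past - past.mean()) ** 2)
        errors.append(float(np.mean(sq)))
    return errors
```

For lag 0 this reduces to an ordinary quantization-error-like distortion; growing error at larger lags indicates that the units do not separate sequences by their history.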

