Unsupervised Recursive Sequence Processing - Institute of ...
5.2 Binary automata

The second experiment is also inspired by Voegtlin. A discrete 0/1-sequence generated by a binary automaton with P(0|1) = 0.4 and P(1|0) = 0.3 shall be learned.
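For concreteness, the training data can be produced by a two-state Markov chain with the stated transition probabilities. The following is a minimal sketch (function and variable names are illustrative, not from the paper):

```python
import random

# Two-state binary automaton from the text:
# P(emit 0 | previous symbol 1) = 0.4, P(emit 1 | previous symbol 0) = 0.3.
P_0_GIVEN_1 = 0.4
P_1_GIVEN_0 = 0.3

def generate_sequence(length, seed=None):
    """Generate a 0/1 sequence from the binary Markov automaton."""
    rng = random.Random(seed)
    seq = [rng.randint(0, 1)]  # arbitrary start symbol
    for _ in range(length - 1):
        if seq[-1] == 1:
            seq.append(0 if rng.random() < P_0_GIVEN_1 else 1)
        else:
            seq.append(1 if rng.random() < P_1_GIVEN_0 else 0)
    return seq

seq = generate_sequence(10**5, seed=42)

# Sanity check: the empirical frequency of a 1 following a 0
# should approach P(1|0) = 0.3 for long sequences.
ones_after_zero = sum(1 for a, b in zip(seq, seq[1:]) if a == 0 and b == 1)
zeros = sum(1 for a, _ in zip(seq, seq[1:]) if a == 0)
print(ones_after_zero / zeros)
```

The self-transition probabilities follow implicitly: P(1|1) = 0.6 and P(0|0) = 0.7, so the chain tends to remain in its current state, which is what makes longer context strings informative.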
For discrete data, the specialization of a neuron can be defined as the longest sequence that still leads to unambiguous winner selection. A high percentage of specialized neurons indicates that the map has learned temporal context. In addition, one can compare the distribution of specializations with the original distribution of strings generated by the underlying probabilities. Figure 3 shows the specialization of a trained H-SOM-S. Training has been carried out with 3·10^6 presentations, increasing the context influence (1 − η) exponentially from 0 to 0.06. The remaining parameters have been chosen as in the first experiment. Finally, the receptive fields have been computed from an additional 10^6 test iterations. Putting more emphasis on the context results in a smaller number of active neurons representing rather long strings that cover only a small part of the total input space. If a Euclidean lattice is used instead of a hyperbolic neighborhood, the resulting quantizers differ only slightly, which indicates that the representation of binary symbols and their contexts in the 2-dimensional output space barely benefits from exponential branching. In the depicted run, 64 of the neurons express a clear profile, whereas the other neurons are located at sparse locations of the input data topology, between cluster boundaries, and thus do not win for the presented stimuli. The distribution corresponds nicely to the 100 most characteristic sequences of the probabilistic automaton, as indicated by the graph.
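The receptive-field computation mentioned above can be operationalized as follows: record the winner at each test step together with the recent input history, and take each neuron's receptive field to be the longest common suffix of all histories for which it won. This is a sketch under that assumption; the names and the maximum history length are illustrative:

```python
from collections import defaultdict

def receptive_fields(symbols, winners, max_len=12):
    """For each neuron, return the longest common suffix of all
    input histories for which that neuron was the winner."""
    by_winner = defaultdict(list)
    for t, w in enumerate(winners):
        start = max(0, t - max_len + 1)
        by_winner[w].append(tuple(symbols[start:t + 1]))
    fields = {}
    for w, histories in by_winner.items():
        # Grow the common suffix symbol by symbol, from the right.
        common = []
        for i in range(1, min(len(h) for h in histories) + 1):
            sym = histories[0][-i]
            if all(h[-i] == sym for h in histories):
                common.insert(0, sym)
            else:
                break
        fields[w] = tuple(common)
    return fields

# Toy run: neuron 1 always wins right after a 0, neuron 2 after "01".
fields = receptive_fields([0, 1, 0, 1], winners=[1, 2, 1, 2])
print(fields)
```

A neuron then counts as specialized when its field is non-empty and unambiguous; the percentage of such neurons is the statistic reported in the text.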
Unlike RecSOM (presented in [41]), H-SOM-S also expresses neurons at interior nodes of the tree. These nodes refer to transient states, which are represented by corresponding winners in the network. RecSOM, in contrast to SOM-S, does not rely on the winner index only, but uses a more complex representation: since the transient states are spared, longer sequences can be expressed by RecSOM. In addition to the examination of neuron specialization, the whole map
[Figure 3 here: tree plot, vertical axis 0–11; legend: 100 most likely sequences; H-SOM-S, 100 neurons; 64 specialized neurons.]
Fig. 3. Receptive fields of a H-SOM-S compared to the most probable sub-sequences of the binary automaton. Left-hand branches denote 0, right-hand branches denote 1.
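Under the encoding in the caption, every binary string corresponds to a unique path from the root of the tree. One convenient way to realize this (heap-style breadth-first node numbering is my assumption, not stated in the paper) is:

```python
def tree_node_index(sequence):
    """Map a 0/1 string to a node of a binary tree in heap order:
    root = 1, left child = 2*i (symbol 0), right child = 2*i + 1
    (symbol 1), matching the left-0 / right-1 convention of Fig. 3."""
    node = 1
    for sym in sequence:
        node = 2 * node + sym
    return node

print(tree_node_index([0, 1]))  # root -> left (0) -> right (1) -> node 5
```

With this mapping, the 100 most likely sub-sequences of the automaton and the specializations of the 100 neurons can be drawn in the same tree and compared node by node.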