5. The top 14 features are: F0 maximum, F0 standard deviation, F0 range, F0 mean, BW1 mean, BW2 mean, energy standard deviation, speaking rate, F0 slope, F1 maximum, energy maximum, energy range, F2 range, and F1 range.

6. The first set included the top 8 features (from F0 maximum to speaking rate), the second extended the first by the next 2 features (F0 slope and F1 maximum), and the third included all 14 top features.

7. An ensemble consists of an odd number of neural network classifiers trained on different subsets. The ensemble makes a decision by majority voting (a code sketch of this scheme follows these notes).

8. To train the experts, we used a two-layer backpropagation neural network architecture with an 8-element input vector, 10 or 20 nodes in the hidden sigmoid layer, and one node in the output linear layer (see the first sketch after these notes). We also used the same subsets of the s70 data set as training and test sets, but with only two classes (for example, angry vs. non-angry).

9. To explore this approach, we used a two-layer backpropagation neural network architecture with a 5-element input vector, 10 or 20 nodes in the hidden sigmoid layer, and five nodes in the output linear layer. We selected five of the best experts and generated several dozen neural network recognizers.

10. We created ensembles of 15 neural network recognizers for the 8-, 10-, and 14-feature inputs and the 10- and 20-node architectures. The average accuracy of the ensembles of recognizers lies in the range 73–77% and reaches its maximum of about 77% for the 8-feature input and the 10-node architecture.
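To make the expert architecture of note 8 concrete, here is a minimal sketch of a two-layer backpropagation network with an 8-element input, a 10-node sigmoid hidden layer, and one linear output node. The loss function (mean-squared error), the learning rate, and the training data are all assumptions: the chapter does not specify them, and the random matrices below are placeholders for the s70 feature subsets, which are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder stand-ins for the s70 feature subsets (not given in the text):
# 8 prosodic features per utterance, binary labels (e.g. angry vs. non-angry).
X = rng.normal(size=(70, 8))
y = rng.integers(0, 2, size=(70, 1)).astype(float)

n_hidden = 10                      # the notes use 10 or 20 hidden nodes
W1 = rng.normal(scale=0.1, size=(8, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, 1))
b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.05
for epoch in range(2000):
    # Forward pass: sigmoid hidden layer, linear output node.
    h = sigmoid(X @ W1 + b1)
    out = h @ W2 + b2
    # Backpropagation of the mean-squared error (an assumed loss).
    err = out - y                          # dL/dout for 0.5 * MSE
    grad_W2 = h.T @ err / len(X)
    grad_b2 = err.mean(axis=0)
    dh = (err @ W2.T) * h * (1.0 - h)      # chain rule through the sigmoid
    grad_W1 = X.T @ dh / len(X)
    grad_b1 = dh.mean(axis=0)
    for p, g in ((W1, grad_W1), (b1, grad_b1), (W2, grad_W2), (b2, grad_b2)):
        p -= lr * g

pred = (sigmoid(X @ W1 + b1) @ W2 + b2 > 0.5).astype(int)
print("training accuracy:", (pred == y).mean())
```

Setting `n_hidden = 20` gives the alternative 20-node architecture mentioned in the notes; the five-output recognizer of note 9 follows the same pattern with a 5-element input and five linear output nodes.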
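Notes 7 and 10 combine such experts into odd-sized ensembles decided by majority vote. The sketch below assumes a hypothetical interface in which each ensemble member is a function mapping a feature matrix to a vector of 0/1 labels; the 15-member example mirrors the ensembles of note 10, and the toy linear voters are placeholders rather than the trained recognizers.

```python
import numpy as np

def majority_vote(members, X):
    """Combine an odd number of binary classifiers by majority voting.

    `members` is a list of predict-functions mapping a feature matrix
    to 0/1 label vectors (an assumed interface; the chapter gives none).
    """
    assert len(members) % 2 == 1, "note 7 requires an odd number of voters"
    votes = np.stack([m(X) for m in members])   # shape (n_members, n_samples)
    # A sample is labeled 1 when more than half of the members vote 1.
    return (votes.sum(axis=0) > len(members) // 2).astype(int)

# Usage with 15 toy members, echoing the 15-recognizer ensembles of note 10.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 8))
members = [
    (lambda X, w=rng.normal(size=8): (X @ w > 0).astype(int))
    for _ in range(15)
]
print(majority_vote(members, X))
```

Keeping the member count odd, as note 7 stipulates, guarantees that a majority always exists for a binary decision, so no tie-breaking rule is needed.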
