and suggested theories (reviews of about 60 years of research can be found in [2, 11]). On the other hand, AI researchers have made contributions in the following areas: emotional speech synthesis [3, 9], recognition of emotions [5], and using agents for decoding and expressing emotions [12].

2. Motivation

The project is motivated by the question of how recognition of emotions in speech could be used for business. A potential application is the detection of the emotional state in telephone call center conversations, with feedback provided to an operator or a supervisor for monitoring purposes. Another application is sorting voice mail messages according to the emotions expressed by the caller.

Given this orientation, for this study we solicited data from people who are not professional actors or actresses. We have focused on negative emotions such as anger, sadness and fear. We have targeted telephone quality speech (less than 3.4 kHz) and relied on the voice signal only (a sketch of this band-limiting step appears at the end of this section). This means that we have excluded modern speech recognition techniques. There are several reasons for this. First, in speech recognition emotions are treated as noise that decreases recognition accuracy. Second, although some words and phrases are correlated with particular emotions, the situation is usually much more complex, and the same word or phrase can express a whole spectrum of emotions. Third, speech recognition techniques require a much higher-quality signal and more computational power.

To achieve our objectives we decided to proceed in two stages: research and development. The objectives of the first stage are to learn how well people recognize emotions in speech, to find out which features of the speech signal could be useful for emotion recognition, and to explore different mathematical models for creating reliable recognizers. The objective of the second stage is to create a real-time recognizer for call center applications.

3. Research

For the first stage we had to create and evaluate a corpus of emotional data, evaluate the performance of people, and select data for machine learning. We decided to use high quality speech data for this stage.

3.1 Corpus of Emotional Data

We asked thirty of our colleagues to record the following four short sentences: "This is not what I expected", "I'll be right there", "Tomorrow is my birthday", and "I'm getting married next week." Each sentence was recorded by every subject five times; each time, the subject portrayed one of the follow-
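The sketch below is not from the chapter; it is a minimal illustration, assuming Python with NumPy and SciPy, of how a high-quality recording could be band-limited to the telephone-quality range (below 3.4 kHz) mentioned in Section 2. The file name utterance.wav, the 8 kHz target sampling rate, and the mono-input assumption are hypothetical choices, not details given by the authors.

import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfiltfilt, resample_poly

TELEPHONE_CUTOFF_HZ = 3400  # band limit for telephone-quality speech (Section 2)
TELEPHONE_RATE_HZ = 8000    # assumed target sampling rate (hypothetical choice)

def to_telephone_quality(path="utterance.wav"):
    # Hypothetical helper: band-limit a mono, high-quality recording
    # to roughly telephone quality.
    rate, signal = wavfile.read(path)      # assumes a single-channel WAV file
    signal = signal.astype(np.float64)
    # Zero-phase low-pass filter below 3.4 kHz.
    sos = butter(8, TELEPHONE_CUTOFF_HZ, btype="low", fs=rate, output="sos")
    band_limited = sosfiltfilt(sos, signal)
    # Resample to the assumed 8 kHz telephone rate.
    return TELEPHONE_RATE_HZ, resample_poly(band_limited, TELEPHONE_RATE_HZ, rate)

Note that the corpus for the research stage is recorded at high quality, so a step like this would matter mainly for the telephone-based call center setting targeted by the development stage.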
