
extend this work by using functional data analysis (FDA) to classify facial movement functions into basic emotion categories. Several single and hybrid classification algorithms are tested. By incorporating action unit co-movement in a Lasso shrinkage method, we achieved a recognition rate of 89%, substantially outperforming competitor approaches. Application to real expressions, and the introduction of intensity and other temporal features of expressions, are discussed as examples of extensions of our method.
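
As a rough sketch of the kind of classifier named in this abstract (not the authors' implementation), the snippet below uses an L1-penalized (Lasso-style) multinomial logistic regression over functional basis coefficients of action-unit trajectories. The data shapes, emotion labels and penalty strength are placeholder assumptions.

```python
# Hedged sketch: Lasso-style (L1-penalized) classification of facial action-unit
# movement functions, assuming each expression is summarized by basis coefficients
# from a functional-data (e.g. B-spline) expansion of its AU trajectories.
# All data and dimensions below are placeholders, not from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_samples, n_aus, n_basis = 120, 10, 8              # expressions, action units, basis coefficients per AU
X = rng.normal(size=(n_samples, n_aus * n_basis))   # stacked AU coefficient vectors (placeholder data)
y = rng.integers(0, 6, size=n_samples)              # six basic-emotion labels (placeholder)

# The L1 penalty shrinks coefficients of uninformative AU/basis combinations to zero,
# which is the role a Lasso-type shrinkage plays in such a classifier.
clf = LogisticRegression(penalty="l1", solver="saga", C=0.5, max_iter=5000)
print(cross_val_score(clf, X, y, cv=5).mean())
```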

17:00-17:20, Paper ThCT6.5
Multi-Modal Emotion Recognition using Canonical Correlations and Acoustic Features
Gajsek, Rok, Univ. of Ljubljana
Struc, Vitomir, Univ. of Ljubljana
Mihelic, France, Univ. of Ljubljana

Information about the psycho-physical state of the subject is becoming a valuable addition to modern audio or video recognition systems. As well as enabling a better user experience, it can also help improve the recognition accuracy of the base system. In this article, we present our approach to a multi-modal (audio-video) emotion recognition system. For the audio sub-system, a feature set comprised of prosodic, spectral and cepstral features is selected, and a support vector classifier is used to produce the scores for each emotional category. For the video sub-system, a novel approach is presented which does not rely on tracking specific facial landmarks and thus eliminates the problems usually caused when the tracking algorithm fails to detect the correct area. The system is evaluated on the interface database, and the recognition accuracy of our audio-video fusion is compared to published results in the literature.
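
A minimal sketch of the audio sub-system idea described above, assuming pre-computed prosodic/spectral/cepstral feature vectors: a support vector classifier yields a per-category score that a later audio-video fusion stage could consume. The emotion labels and feature dimensionality are assumptions, not taken from the paper.

```python
# Hedged sketch: per-emotion scores from an SVM over audio feature vectors.
# Feature extraction is omitted; X holds placeholder per-utterance vectors
# (e.g. prosodic + spectral + cepstral statistics) and labels are illustrative.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

emotions = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 40))                   # placeholder audio feature vectors
y = rng.integers(0, len(emotions), size=200)     # placeholder emotion labels

svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
svm.fit(X, y)

# Per-category scores for one utterance, ready for later audio-video fusion.
scores = svm.predict_proba(X[:1])[0]
for label, score in zip(svm.classes_, scores):
    print(f"{emotions[label]}: {score:.2f}")
```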

ThCT7 Dolmabahçe Hall C
Multimedia and Document Analysis Applications Regular Session
Session chair: Duygulu Sahin, Pinar (Bilkent Univ.)

15:40-16:00, Paper ThCT7.1
Automatic Music Genre Classification using Bass Lines
Simsekli, Umut, Bogazici Univ.

A bass line is an instrumental melody that encapsulates rhythmic, melodic and harmonic features and arguably contains sufficient information for accurate genre classification. In this paper, a bass-line-based automatic music genre classification system is described. “Melodic Interval Histograms” are used as features, and k-nearest neighbor classifiers are utilized and compared with SVMs on a small standard MIDI database. Apart from standard distance metrics for k-nearest neighbor classification (Euclidean, symmetric Kullback-Leibler, earth mover’s and normalized compression distances), we propose a novel distance metric, the perceptually weighted Euclidean distance (PWED). The maximum classification accuracy (84%) is obtained with k-nearest neighbor classifiers, and the added utility of the novel metric is illustrated in our experiments.
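
The abstract does not give the weighting scheme of the proposed PWED, so the sketch below only illustrates the general idea: k-nearest-neighbor classification of melodic interval histograms under a weighted Euclidean distance. The weights, bin count and genre labels are placeholders.

```python
# Hedged sketch: k-NN genre classification of bass-line interval histograms
# with a weighted Euclidean distance. The actual perceptual weights of the
# proposed PWED are not given in the abstract; the weights below are placeholders.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

N_INTERVALS = 25                                  # histogram bins, e.g. 0..24 semitone intervals (assumed)
weights = np.linspace(1.0, 0.2, N_INTERVALS)      # placeholder per-interval weights

def pwed(h1, h2, w=weights):
    """Weighted Euclidean distance between two interval histograms."""
    return np.sqrt(np.sum(w * (h1 - h2) ** 2))

rng = np.random.default_rng(2)
X = rng.random((90, N_INTERVALS))                 # placeholder interval histograms
X /= X.sum(axis=1, keepdims=True)                 # normalize each histogram
y = rng.integers(0, 3, size=90)                   # placeholder genre labels

knn = KNeighborsClassifier(n_neighbors=5, metric=pwed)
knn.fit(X, y)
print(knn.predict(X[:5]))
```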

16:00-16:20, Paper ThCT7.2
Exploiting Combined Multi-Level Model for Document Sentiment Analysis
Li, Si, Beijing Univ. of Posts and Telecommunications
Zhang, Hao, Beijing Univ. of Posts and Telecommunications
Xu, Weiran, Beijing Univ. of Posts and Telecommunications
Guo, Jun, Beijing Univ. of Posts and Telecommunications

This paper focuses on the task of text sentiment analysis in hybrid online articles and web pages. Traditional approaches to text sentiment analysis typically work at a particular level, such as the phrase, sentence or document level, which might not be suitable for documents with too few or too many words. Considering that analysis at each level has its own advantages, we expect that a combination model may achieve better performance. In this paper, a novel combined model based on phrase- and sentence-level analyses is presented, together with a discussion of how the different levels’ analyses complement each other. For phrase-level sentiment analysis, a newly defined Left-Middle-Right template and Conditional Random Fields are used to extract the sentiment words. A Maximum Entropy model is used for sentence-level sentiment analysis. The experimental results verify that the combination model, with a specific combination of features, outperforms the single-level models.
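
A toy sketch of the multi-level combination idea, with heavy simplifications: the CRF phrase-level extraction is replaced by a placeholder polarity lexicon, the Maximum Entropy sentence-level model is a logistic regression over word counts, and the two levels’ scores are fused by a second logistic regression. None of this reflects the authors’ actual features or templates.

```python
# Hedged sketch: combining phrase-level and sentence-level sentiment scores.
# The lexicon, documents and fusion scheme are illustrative placeholders.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

docs = ["the plot is great and the acting is superb",
        "an enjoyable film with superb acting",
        "a dull poorly written and badly acted film",
        "dull characters and a badly paced story"]
labels = np.array([1, 1, 0, 0])                   # 1 = positive, 0 = negative (toy data)

# Sentence level: a maximum-entropy (logistic regression) classifier over word counts.
vec = CountVectorizer()
X_sent = vec.fit_transform(docs)
sent_clf = LogisticRegression(max_iter=1000).fit(X_sent, labels)

# Phrase level: placeholder polarity lexicon standing in for CRF-extracted sentiment words.
positive, negative = {"great", "superb", "enjoyable"}, {"dull", "poorly", "badly"}
def phrase_score(doc):
    words = doc.split()
    return sum(w in positive for w in words) - sum(w in negative for w in words)

# Combined model: stack both levels' scores and learn how to weight them.
features = np.column_stack([sent_clf.predict_proba(X_sent)[:, 1],
                            [phrase_score(d) for d in docs]])
combined = LogisticRegression().fit(features, labels)
print(combined.predict(features))
```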

