
1) Tempo: also called “rhythmic periodicity,” tempo is estimated both from a spectral analysis of each band of the spectrogram and from the autocorrelation of the amplitude envelope extracted from the audio.

2) Beat Histogram: a histogram representing the distribution of tempi in a musical excerpt. Usually the most prominent peak corresponds to the best tempo match. Experiments often use several properties of the beat histogram, such as the bpm (beats per minute) values of the two highest peaks and the sum of all histogram bins (a computational sketch of both features follows this list).
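As an illustration only, the following minimal sketch computes both features under stated assumptions: it uses the librosa library to load audio and extract an onset-strength envelope (a common stand-in for the amplitude envelope), and builds the autocorrelation and tempo histogram by hand with numpy. The bpm range, 1-bpm bins, and hop length are illustrative choices, not the exact procedure of the systems surveyed here.

```python
import numpy as np
import librosa

def beat_histogram_features(path, bpm_min=40, bpm_max=200):
    y, sr = librosa.load(path)
    # Onset-strength (amplitude) envelope: frame-wise measure of energy change.
    env = librosa.onset.onset_strength(y=y, sr=sr)
    env = env - env.mean()
    # Autocorrelation of the envelope; peaks mark rhythmic periodicities.
    ac = np.correlate(env, env, mode="full")[len(env) - 1:]
    ac = np.maximum(ac, 0.0)
    frames_per_sec = sr / 512.0  # librosa's default hop length is 512 samples
    # Convert each autocorrelation lag (in frames) to a tempo in bpm.
    lags = np.arange(1, len(ac))
    bpms = 60.0 * frames_per_sec / lags
    mask = (bpms >= bpm_min) & (bpms <= bpm_max)
    # Beat histogram: autocorrelation strength binned by tempo (1-bpm bins).
    edges = np.arange(bpm_min, bpm_max + 1)
    hist, _ = np.histogram(bpms[mask], bins=edges, weights=ac[1:][mask])
    # Features often taken from the histogram: bpm values of the two
    # highest peaks and the sum of all histogram bins.
    order = np.argsort(hist)[::-1]
    return {"peak1_bpm": int(edges[order[0]]),
            "peak2_bpm": int(edges[order[1]]),
            "bin_sum": float(hist.sum())}
```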

2.3.2 Text-based Music Mood Classification

Very recently, several studies on music mood classification have been conducted using only music lyrics (He et al., 2008; Hu, Chen, & Yang, 2009b). He et al. (2008) compared traditional bag-of-words features in unigrams, bigrams, trigrams and their combinations, as well as three feature representation models (i.e., Boolean, absolute term frequency and tfidf weighting). Their results showed that the combination of unigram, bigram and trigram tokens with tfidf weighting performed best, indicating that higher-order bag-of-words features captured more semantics useful for mood classification.
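As a sketch of this feature space, the snippet below uses scikit-learn's CountVectorizer and TfidfVectorizer as assumed stand-ins for the feature extraction in He et al. (2008): unigram-through-trigram tokens under Boolean, absolute term frequency, and tfidf weighting. The two-line lyric corpus is a toy example, not data from the study.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

lyrics = ["baby I love you so", "my love is gone and I am lovelorn"]  # toy corpus

# Boolean presence/absence of each n-gram.
boolean_bow = CountVectorizer(ngram_range=(1, 3), binary=True)
# Absolute term frequency of each n-gram.
tf_bow = CountVectorizer(ngram_range=(1, 3))
# tfidf weighting, the best-performing scheme in their experiments.
tfidf_bow = TfidfVectorizer(ngram_range=(1, 3))

X = tfidf_bow.fit_transform(lyrics)  # sparse document-term matrix
print(X.shape, len(tfidf_bow.get_feature_names_out()))
```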

Hu et al. (2009b) moved beyond bag-of-words lyric features and extracted features based on an affective lexicon translated from the Affective Norms for English Words (ANEW) (Bradley & Lang, 1999). The datasets used in both studies were relatively small: the dataset in He et al. (2008) contained 1,903 songs in only two mood categories, “love” and “lovelorn,” while Hu et al. (2009b) classified 500 Chinese songs into four mood categories derived from Russell’s arousal-valence model.
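Lexicon-based features of this kind can be sketched as follows, in the spirit of Hu et al. (2009b): average valence and arousal of the lyric words found in an ANEW-style lexicon. The anew dictionary below contains made-up toy values for illustration, not the actual norms from Bradley and Lang (1999), and the feature set is a simplification of what the study used.

```python
import re

# word -> (valence, arousal) on 1-9 scales; toy values for illustration only.
anew = {"love": (8.7, 6.4), "happy": (8.2, 6.5), "alone": (2.4, 4.8)}

def affect_features(lyric):
    words = re.findall(r"[a-z']+", lyric.lower())
    # Keep only words covered by the lexicon.
    hits = [anew[w] for w in words if w in anew]
    if not hits:
        return {"valence": None, "arousal": None}
    valence = sum(v for v, _ in hits) / len(hits)
    arousal = sum(a for _, a in hits) / len(hits)
    return {"valence": valence, "arousal": arousal}

print(affect_features("I am so happy when we are in love"))
```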

From a different angle, Bischoff, Firan, Nejdl, and Paiu (2009a) tried to use social tags to predict mood and theme labels of popular songs. The authors designed the experiments as a tag

