21.01.2014 Views

Improving Music Mood Classification Using Lyrics, Audio and Social Tags

Several operational systems categorize text according to affect. Subasic and Huettner (2001) manually constructed a word lexicon for each affect category considered in their study, and classified documents by comparing the average scores of terms in the affect categories. A more complex approach was taken by Liu, Lieberman, and Selker (2003), which was based on common sense knowledge, on the assumption that common sense is important for affect interpretation.
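
The idea of scoring a document against per-category word lexicons can be illustrated with a minimal sketch. It is not the system of Subasic and Huettner (2001): the lexicon entries, the scores, the category names, and the `classify` function are all hypothetical, chosen only to show how average term scores per affect category might be compared.

```python
from collections import defaultdict

# Hypothetical toy lexicon (an assumption for illustration):
# word -> {affect category: score in [0, 1]}
LEXICON = {
    "happy": {"joy": 0.9},
    "smile": {"joy": 0.6},
    "tears": {"sadness": 0.8},
    "alone": {"sadness": 0.7},
    "rage": {"anger": 0.9},
}


def classify(text):
    """Return the affect category with the highest average term score,
    or None when no lexicon word occurs in the text."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for token in text.lower().split():
        for category, score in LEXICON.get(token, {}).items():
            totals[category] += score
            counts[category] += 1
    if not totals:
        return None
    # Average the scores of the matched terms within each category.
    averages = {c: totals[c] / counts[c] for c in totals}
    return max(averages, key=averages.get)
```

Under these assumptions, `classify("tears fall when I am alone")` averages the two sadness scores and returns that category. A lexicon built for another genre would simply fail to match many lyric words.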

Because lyrics are a special genre quite different from everyday documents, a common sense knowledge base may not work for lyrics, and neither may word lexicons built for other genres of documents. Manually building a lexicon is very labor-intensive, so methods for automatic lexicon induction have been proposed. Pang and Lee (2008) summarized such methods and categorized them into two groups: unsupervised and supervised. The three feature selection methods applied to lyrics in Hu et al. (2009a) (i.e., language model comparison, F-score feature ranking, and SVM feature ranking) are examples of supervised lexicon induction.

Unsupervised methods start from a few seed words for which the affect is already known, and then propagate the labels of the seed words to words that co-occur with them in a text corpus, to synonyms, and/or to words that co-occur with them in other resources like WordNet. For instance, Turney (2002) proposed the joint use of mutual information and co-occurrence in a general corpus with a small set of seed words.
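
The seed-word propagation described above can be sketched as follows. This is a simplified illustration in the spirit of mutual-information approaches such as Turney (2002), not a reproduction of that method: counting co-occurrence at the document level, the `smoothing` constant, and the function name `pmi_orientation` are assumptions made for the example.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_orientation(corpus, positive_seeds, negative_seeds, smoothing=0.01):
    """Score each non-seed word by its summed PMI with the positive seeds
    minus its summed PMI with the negative seeds.

    Co-occurrence means "appears in the same document" (an assumption).
    """
    n_docs = len(corpus)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing each word pair
    for doc in corpus:
        tokens = set(doc.lower().split())
        word_df.update(tokens)
        pair_df.update(frozenset(p) for p in combinations(sorted(tokens), 2))

    def pmi(a, b):
        # Smoothing keeps the logarithm finite when a and b never co-occur.
        joint = (pair_df[frozenset((a, b))] + smoothing) / n_docs
        return math.log2(joint / ((word_df[a] / n_docs) * (word_df[b] / n_docs)))

    seeds = set(positive_seeds) | set(negative_seeds)
    scores = {}
    for word in word_df:
        if word in seeds:
            continue
        # Seeds absent from the corpus are skipped to avoid division by zero.
        pos = sum(pmi(word, s) for s in positive_seeds if word_df[s])
        neg = sum(pmi(word, s) for s in negative_seeds if word_df[s])
        scores[word] = pos - neg
    return scores
```

A word that mostly shares documents with positive seeds receives a positive score, one that mostly shares documents with negative seeds receives a negative score, and the induced lexicon is the set of words whose scores are far enough from zero.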

There have been interesting studies on the affective aspect of text in the context of weblogs (Nicolov, Salvetti, Liberman, & Martin, 2006). Most of them still used bag-of-words features of all words or of words in specific parts of speech (mostly adjectives and nouns). Among them, Mihalcea and
