21.01.2014 Views

Improving Music Mood Classification Using Lyrics, Audio and Social Tags


in a study on music genre classification (Mayer et al., 2008). Combinations of these feature types are also evaluated in this study and are described in this section.

6.2.1 Basic Lyric Features

As a starting point, this research evaluates bag-of-words features of the following types:

1) Content words (Content): all words except function words, without stemming;

2) Content words with stemming (Cont-stem): stemming means conflating words that share the same root;

3) Part-of-speech (POS) tags: such as noun, verb, proper noun, etc. In this research, the Stanford POS tagger¹¹ is used to tag each lyric word with one of the 36 unique POS tags in the Penn Treebank project¹²;

4) Function words (FW): as opposed to content words, also called "stopwords" in text information retrieval. The function word list used in this study is the one compiled by S. Argamon, a well-known scholar in the area of text stylistic analysis¹³.
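The split between content words, stemmed content words, and function words described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `FUNCTION_WORDS` is a tiny hypothetical stand-in for the Argamon list, `naive_stem` is a toy suffix stripper and not the stemmer used in the study, and the names `tokenize` and `basic_features` are invented here. POS features are left out because they require an external tagger.

```python
import re
from collections import Counter

# Hypothetical, tiny function-word list for illustration only;
# the study uses the much longer list compiled by S. Argamon.
FUNCTION_WORDS = {"i", "you", "the", "a", "and", "in", "of", "to", "is", "my"}


def tokenize(lyric):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", lyric.lower())


def naive_stem(word):
    """Toy suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def basic_features(lyric):
    """Return bag-of-words counts for three of the basic feature types."""
    tokens = tokenize(lyric)
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return {
        "Content": Counter(content),
        "Cont-stem": Counter(naive_stem(t) for t in content),
        "FW": Counter(t for t in tokens if t in FUNCTION_WORDS),
    }


features = basic_features("I walked and you walk in the rain, dreaming of rains")
```

In this toy input, "walked" and "walk" remain separate features under Content but fall into one feature under Cont-stem, which is the effect stemming is meant to have.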

For each of the feature types, four representation models are compared: 1) Boolean; 2) term frequency; 3) normalized frequency; and 4) tf-idf weighting. In a Boolean representation model, each feature value is term presence or absence (one or zero). The term frequency and normalized frequency models, as their names indicate, use term frequencies and normalized term frequencies

11 http://nlp.stanford.edu/software/tagger.shtml

12 http://www.cis.upenn.edu/~treebank/

13 The function word list can be accessed at http://www.ir.iit.edu/~argamon/function-words.txt
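The four representation models compared in Section 6.2.1 can be illustrated with a minimal sketch. The exact formulas are not given in this excerpt, so two of them are assumptions: normalized frequency is taken here as the term count divided by the document length, and tf-idf as the term count times log(N / df), which is one common variant. The function name `build_representations` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def build_representations(docs):
    """docs: list of token lists. Returns the vocabulary and four weightings."""
    vocab = sorted({t for d in docs for t in d})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs for t in set(d))
    out = {"boolean": [], "tf": [], "norm_tf": [], "tfidf": []}
    for d in docs:
        counts = Counter(d)
        total = len(d) or 1  # guard against empty documents
        out["boolean"].append([1 if counts[t] else 0 for t in vocab])
        out["tf"].append([counts[t] for t in vocab])
        # Assumed normalization: divide by document length.
        out["norm_tf"].append([counts[t] / total for t in vocab])
        # Assumed tf-idf variant: tf * log(N / df).
        out["tfidf"].append(
            [counts[t] * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, out


docs = [["love", "love", "rain"], ["rain", "night"]]
vocab, reps = build_representations(docs)
```

Under this tf-idf variant, a term that occurs in every document (here "rain") receives weight zero, while the Boolean and frequency models still record it.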
