21.01.2014 Views

improving music mood classification using lyrics, audio and social tags


6.3 IMPLEMENTATION

The Snowball stemmer 14 is used for experiments that require stemming. As this stemmer cannot handle irregular words, it is supplemented with irregular nouns and verbs 15.

The Stanford POS tagger implements two tagging models: one uses the preceding three tags as tagging context, the other considers both preceding and following tags (Toutanova, Klein, Manning, & Singer, 2003). The bidirectional model performs slightly better than the left side-only model, but is significantly slower. As the lyric dataset used in this research is large, the more efficient left side-only model is adopted. The Stanford tagger is trained on a corpus consisting of articles in the Wall Street Journal. News articles are in a different text genre from lyrics, but there is no available training corpus of lyrics with annotated POS tags. Nevertheless, the lyric data are also in modern English, and the combinations of POS tags in lyrics are not much different from news articles. The author has manually examined about 50 tagged lyrics and the results are generally correct.
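The idea of supplementing a rule-based stemmer with lists of irregular forms can be illustrated with a minimal sketch. This is not the implementation used in this research. The names `IRREGULAR_FORMS`, `naive_suffix_stemmer`, and `make_stemmer` are hypothetical, the lookup table holds only a handful of sample entries, and the suffix stripper is a deliberately crude stand-in for the Snowball algorithm. Checking the irregular list before falling back to the stemmer is likewise an assumption about how the two could be combined.

```python
from typing import Callable, Dict

# Hypothetical sample entries; a real table would be built from full
# irregular verb and noun lists.
IRREGULAR_FORMS: Dict[str, str] = {
    "went": "go",
    "gone": "go",
    "children": "child",
    "feet": "foot",
    "mice": "mouse",
}


def naive_suffix_stemmer(word: str) -> str:
    """Crude stand-in for a rule-based stemmer such as Snowball.

    Strips the first matching suffix, provided at least three
    characters remain. It is NOT the Snowball algorithm.
    """
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def make_stemmer(
    fallback: Callable[[str], str],
    irregular: Dict[str, str] = IRREGULAR_FORMS,
) -> Callable[[str], str]:
    """Build a stemmer that consults the irregular-form table first."""

    def stem(word: str) -> str:
        token = word.lower()
        # Irregular forms bypass the rule-based stemmer entirely.
        if token in irregular:
            return irregular[token]
        return fallback(token)

    return stem


stem = make_stemmer(naive_suffix_stemmer)
stemmed = [stem(w) for w in "The children went walking".split()]
print(stemmed)
```

If NLTK is available, `SnowballStemmer("english").stem` from `nltk.stem.snowball` could be passed as the `fallback` argument in place of the stand-in. Whether the experiments used that particular binding is not stated here.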

6.4 RESULTS AND ANALYSIS

6.4.1 Best Individual Lyric Feature Types

For the basic lyric features summarized in Table 6.1, the variations of uni+bi+trigrams in the Boolean representation worked best for all three feature types (content words, part-of-speech, and function words). Stemming did not make a significant difference on the performances of

14 http://snowball.tartarus.org/

15 The irregular verb list was obtained from http://www.englishpage.com/irregularverbs/irregularverbs.html, and the irregular noun list was obtained from http://www.esldesk.com/esl-quizzes/irregular-nouns/irregular-nouns.htm
