21.01.2014 Views

improving music mood classification using lyrics, audio and social tags


Again, these top-ranked features seem to have strong semantic connections to the categories, and they share common words with the top-ranked features listed in Table 7.5 and Table 7.7. Although both Affect-lex and GI-lex are domain-oriented lexicons built from psycholinguistic resources, they contain different words, and thus each of them identified some novel features that are not shared by the other. The category “hopeful” is positioned at the center of Figure 5.2, with small values in both the valence and arousal dimensions, and thus it is not surprising that the top ANEW features for “hopeful” involve both valence and arousal scores.

7.4.4 Top Text Stylistic Features

Text stylistic features performed the worst among all individual lyric feature types considered in this research (Table 6.4). In fact, the average accuracy of text stylistic features was significantly worse than each of the other feature types (p < 0.05). However, text stylistic features did outperform audio features in two categories: “hopeful” and “exciting.” Table 7.9 shows the top-ranked stylistic features (defined in Table 6.2) in these two categories.

Table 7.9 Top-ranked text stylistic features for categories where text stylistics significantly outperformed audio

hopeful               exciting
stdLineLength         uniqWordsPerLine
uniqWordsPerLine      avgRepeatWordRatioPerLine
avgWordLength         stdLineLength
repeatLineRatio       repeatWordRatio
avgLineLength         repeatLineRatio
repeatWordRatio       avgLineLength
numberOfUniqLines     numberOfBlankLines

Note how the top-ranked features in Table 7.9 are all text statistics without interjection words or punctuation marks. Also noteworthy is that these two categories both have relatively low positive valence values (but opposite arousal) as shown in Figure 5.2.
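To make the kind of text statistics listed in Table 7.9 more concrete, the following Python sketch computes plausible versions of them from a raw lyric string. The exact definitions are given in Table 6.2, which is not reproduced here, so every formula below is an assumption inferred from the feature names (for example, line length is assumed to be measured in words, and the repeat ratios are assumed to be the share of non-unique items). The function name `stylistic_features` and the tokenization rule are likewise hypothetical.

```python
import re
import statistics


def stylistic_features(lyrics: str) -> dict:
    """Compute illustrative text stylistic statistics for one lyric.

    All definitions are assumptions based on the feature names only;
    they are not taken from Table 6.2.
    """
    raw_lines = lyrics.splitlines()
    # Non-blank lines, normalized so repeated lines compare equal.
    lines = [ln.strip().lower() for ln in raw_lines if ln.strip()]
    # Simple tokenizer (assumption): runs of letters and apostrophes.
    tokens_per_line = [re.findall(r"[a-z']+", ln) for ln in lines]
    words = [w for toks in tokens_per_line for w in toks]
    if not words:
        raise ValueError("lyrics must contain at least one word")

    line_lengths = [len(toks) for toks in tokens_per_line]
    # Share of repeated words within each line that has any words.
    per_line_repeat = [
        (len(toks) - len(set(toks))) / len(toks)
        for toks in tokens_per_line
        if toks
    ]

    return {
        "numberOfBlankLines": sum(1 for ln in raw_lines if not ln.strip()),
        "numberOfUniqLines": len(set(lines)),
        "avgLineLength": statistics.mean(line_lengths),
        "stdLineLength": statistics.pstdev(line_lengths),
        "avgWordLength": statistics.mean(len(w) for w in words),
        "uniqWordsPerLine": len(set(words)) / len(lines),
        "repeatLineRatio": (len(lines) - len(set(lines))) / len(lines),
        "repeatWordRatio": (len(words) - len(set(words))) / len(words),
        "avgRepeatWordRatioPerLine": statistics.mean(per_line_repeat),
    }


if __name__ == "__main__":
    sample = "la la la\nla la la\n\nshine on"
    for name, value in stylistic_features(sample).items():
        print(f"{name}: {value:.3f}")
```

Under these assumed definitions, a lyric with a heavily repeated chorus would show high repeatLineRatio and repeatWordRatio values, which is the sort of signal a classifier could pick up without looking at word meaning at all.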
