improving music mood classification using lyrics, audio and social tags


compare the differences between lyrics and text in other genres, an initial examination of the lyric text suggests that the repetitions frequently used in lyrics indeed make a difference in stemming. For example, the lines, “bounce bounce bounce” and “but just bounce bounce bounce, yeah” were stemmed to “bounc bounc bounc” and “but just bounc bounc bounce, yeah.” The original bigram “bounce bounce” then expanded into two bigrams after stemming: “bounc bounc” and “bounc bounce” while the original trigram “bounce bounce bounce” also became two trigrams after stemming: “bounc bounc bounc” and “bounc bounc bounce.”
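The n-gram expansion described above can be reproduced with a minimal sketch. It does not run a stemmer: the stemmed lines are copied verbatim from the example, and the whitespace tokenization, punctuation stripping, and the helper names `tokenize`, `ngrams`, and `repeated_word_ngrams` are illustrative assumptions, not the procedure used in the original study.

```python
def tokenize(line):
    # Assumed tokenization: split on whitespace and strip trailing punctuation.
    return [token.strip(",.!?") for token in line.split()]


def ngrams(tokens, n):
    # All contiguous n-token windows, in order.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def repeated_word_ngrams(lines, n, prefix="bounc"):
    # Distinct n-grams made up only of forms of the repeated word.
    found = set()
    for line in lines:
        for gram in ngrams(tokenize(line), n):
            if all(token.startswith(prefix) for token in gram):
                found.add(gram)
    return found


# Lines as quoted in the text, before and after stemming.
original = ["bounce bounce bounce", "but just bounce bounce bounce, yeah"]
stemmed = ["bounc bounc bounc", "but just bounc bounc bounce, yeah"]

for n in (2, 3):
    before = repeated_word_ngrams(original, n)
    after = repeated_word_ngrams(stemmed, n)
    print(n, len(before), "->", len(after), sorted(after))
```

For both n = 2 and n = 3, one distinct n-gram before stemming becomes two afterwards, because the unstemmed form “bounce” survives in the second line.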

Table 6.1 Summary of basic lyric features

Feature Type                                n-grams            No. of dimensions
Content words without stemming (Content)    unigrams           7,227
                                            bigrams            34,133
                                            trigrams           42,795
                                            uni+bigrams        41,360
                                            uni+bi+trigrams    84,155
Content words with stemming (Cont-stem)     unigrams           6,098
                                            bigrams            33,008
                                            trigrams           42,707
                                            uni+bigrams        39,106
                                            uni+bi+trigrams    81,813
Part-of-speech (POS)                        unigrams           36
                                            bigrams            1,057
                                            trigrams           8,474
                                            uni+bigrams        1,093
                                            uni+bi+trigrams    9,567
Function words (FW)                         unigrams           467
                                            bigrams            6,474
                                            trigrams           8,289
                                            uni+bigrams        6,941
                                            uni+bi+trigrams    15,230
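The dimension counts in Table 6.1 are consistent with each combined feature set being a plain concatenation of its component n-gram sets. The short check below assumes that reading. The dictionary `TABLE_6_1` and the helpers `combined_sizes` and `check_table` are hypothetical names introduced only for this illustration, and the numbers are copied from the table.

```python
# Counts from Table 6.1, per feature type:
# (unigrams, bigrams, trigrams, uni+bigrams, uni+bi+trigrams)
TABLE_6_1 = {
    "Content": (7227, 34133, 42795, 41360, 84155),
    "Cont-stem": (6098, 33008, 42707, 39106, 81813),
    "POS": (36, 1057, 8474, 1093, 9567),
    "FW": (467, 6474, 8289, 6941, 15230),
}


def combined_sizes(uni, bi, tri):
    # Assumption: combining feature sets simply adds their dimensions.
    return uni + bi, uni + bi + tri


def check_table(table):
    # Map each feature type to True if its reported combined sizes
    # equal the sums of its component sizes.
    return {
        name: combined_sizes(uni, bi, tri) == (uni_bi, uni_bi_tri)
        for name, (uni, bi, tri, uni_bi, uni_bi_tri) in table.items()
    }


print(check_table(TABLE_6_1))
```

Under this assumption, all four feature types pass the check: for example, 7,227 + 34,133 = 41,360 for Content uni+bigrams.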
