21.01.2014 Views

improving music mood classification using lyrics, audio and social tags

Information science is an interdisciplinary field. It often involves topics that have traditionally been studied in other fields. Borrowing findings from the literature of other fields is a very important research method in information science, but researchers need to pay attention to connecting theories in the literature to the reality and social context of the problems under investigation.

This research also proposed a method of building a ground truth dataset using social tags. The method is efficient and flexible. It does not require recruiting human assessors and thus does not suffer from low cross-assessor consistency, which is precisely the bottleneck in building large ground truth datasets in MIR. The method is flexible in that it can be applied to any music data available to the researcher. To date, the ground truth dataset built in this research is the largest experimental dataset with audio, lyrics and social tags for music mood classification.
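As a rough illustration of how social tags can stand in for human assessors, the following sketch assigns mood categories to songs from tag counts. It is not the procedure used in this research: the tag-to-category mapping `TAG_TO_MOOD`, the function `label_songs`, the `min_count` threshold, and the example data are all hypothetical and chosen only to show the idea.

```python
from collections import Counter

# Hypothetical tag-to-mood mapping (an assumption for illustration only).
TAG_TO_MOOD = {
    "calm": "calm",
    "mellow": "calm",
    "sad": "sad",
    "melancholy": "sad",
    "upbeat": "cheerful",
    "happy": "cheerful",
}


def label_songs(song_tags, min_count=2):
    """Return {song_id: set of mood categories}.

    song_tags maps a song id to {tag: number of users who applied it}.
    A song receives a category when the tags mapped to that category
    were applied at least min_count times in total (assumed threshold).
    Songs with no qualifying category are left out of the result.
    """
    labels = {}
    for song_id, tags in song_tags.items():
        votes = Counter()
        for tag, count in tags.items():
            mood = TAG_TO_MOOD.get(tag.lower().strip())
            if mood is not None:  # ignore tags unrelated to mood, e.g. genre
                votes[mood] += count
        moods = {m for m, c in votes.items() if c >= min_count}
        if moods:
            labels[song_id] = moods
    return labels


# Made-up example data.
example = {
    "song_a": {"Calm": 3, "mellow": 1, "rock": 10},
    "song_b": {"sad": 1, "happy": 1},
    "song_c": {"Melancholy": 2, "upbeat": 4},
}
labels = label_songs(example)
```

In this toy run `song_b` receives no label because neither of its mood tags reaches the threshold, which mirrors the general idea of keeping only songs whose tags give reasonably consistent evidence.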

This research evaluated a number of lyric text features in the task of music mood classification, including the basic, commonly used bag-of-words features, features based on psycholinguistic lexicons, and text stylistic features. The results revealed that the most useful lyric features were combinations of content words, certain linguistic features, and text stylistic features. A surprising finding was that the combination of ANEW scores and text stylistic features achieved the second-best performance among all feature types and combinations (with no significant difference from the best one) with only 37 dimensions, compared to 107,360 in the top-performing feature set.
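To make the idea of a compact lexicon-plus-style feature set concrete, here is a minimal sketch that concatenates averaged affective scores with a few simple stylistic counts. It does not reproduce the 37-dimensional feature set evaluated in this research: `TOY_LEXICON` holds placeholder numbers rather than real ANEW ratings, and the four stylistic features as well as all function names are assumptions made for illustration.

```python
import re

# Toy stand-in for an affective lexicon: word -> (valence, arousal, dominance).
# These values are illustrative placeholders, NOT real ANEW ratings.
TOY_LEXICON = {
    "love": (8.0, 6.0, 6.0),
    "cry": (2.0, 6.0, 3.0),
    "sun": (7.0, 5.0, 5.0),
}


def tokenize(lyrics):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", lyrics.lower())


def lexicon_features(tokens):
    """Mean valence, arousal and dominance over tokens found in the lexicon."""
    hits = [TOY_LEXICON[t] for t in tokens if t in TOY_LEXICON]
    if not hits:
        return [0.0, 0.0, 0.0]
    return [sum(h[i] for h in hits) / len(hits) for i in range(3)]


def stylistic_features(lyrics):
    """A few assumed stylistic features: line count, words per line,
    type-token ratio, and number of exclamation marks."""
    lines = [ln for ln in lyrics.splitlines() if ln.strip()]
    tokens = tokenize(lyrics)
    n_tokens = len(tokens)
    return [
        float(len(lines)),
        n_tokens / len(lines) if lines else 0.0,
        len(set(tokens)) / n_tokens if n_tokens else 0.0,
        float(lyrics.count("!")),
    ]


def feature_vector(lyrics):
    """Concatenate lexicon-based and stylistic features into one vector."""
    return lexicon_features(tokenize(lyrics)) + stylistic_features(lyrics)


vec = feature_vector("I love the sun!\nI cry, I cry")
```

The resulting vector has seven dimensions regardless of vocabulary size, which is the property that makes such feature sets so much smaller than bag-of-words representations.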

In terms of averaged performance across categories, the lyric-only system outperformed a leading audio-only system on this task, although the performance difference fell just short of statistical significance. On individual categories, the two information sources show

