21.01.2014 Views

improving music mood classification using lyrics, audio and social tags



Its results reflect the best results for the tasks (Downie, 2008). The performance of a top-ranked system in the AMC tasks sets a difficult baseline against which comparisons must be made.

All audio-based classification systems in MIREX as well as systems in most other music classification studies applied standard supervised learning models such as K-Nearest Neighbor (KNN), Naïve Bayes, and Support Vector Machines (SVM). Among the learning models, SVM seems the most popular model with top performance. This dissertation uses SVM as the classification model for two reasons: 1) the selected audio-based system uses SVM; and 2) SVM achieved the best or close to the best results in both MIR and text categorization experiments in general (Mandel, Poliner, & Ellis, 2006; Hu, Downie, Laurier, Bay, & Ehmann, 2008a; Tzanetakis & Cook, 2002; Yu, 2008).

1.3.4 Combining Lyrics and Audio

In machine learning, it is acknowledged that multiple independent sources of features will likely compensate for one another, resulting in better performances than approaches using any one of the sources (Dietterich, 2000). Previous work in music classification has used such hybrid sources as audio and lyrics (e.g., Mayer et al., 2008), audio and symbolic scores (e.g., McKay & Fujinaga, 2008), etc., and has shown improved performance. Thus, one hypothesis in this dissertation is that hybrid systems combining audio and lyrics perform better than systems using either source.

Research Question 4: Are systems combining audio and lyrics significantly better than audio-only or lyric-only systems?
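As a rough illustration of what "combining audio and lyrics" can mean in practice, the sketch below shows two common fusion strategies. It is not taken from this dissertation: the excerpt does not state how the two sources are combined, so the function names (`early_fusion`, `late_fusion`, `predict_label`), the `audio_weight` parameter, and all numbers are hypothetical and chosen only for the example. Early fusion concatenates the two feature vectors before a single classifier is trained; late fusion averages the class probabilities produced by separate audio-only and lyric-only classifiers.

```python
def early_fusion(audio_features, lyric_features):
    """Concatenate per-song audio and lyric feature vectors (feature-level fusion).

    Both arguments are lists of feature lists, one entry per song, in the same order.
    """
    if len(audio_features) != len(lyric_features):
        raise ValueError("both sources must describe the same songs")
    return [audio + lyric for audio, lyric in zip(audio_features, lyric_features)]


def late_fusion(audio_probs, lyric_probs, audio_weight=0.5):
    """Weighted average of per-class probabilities from two separate classifiers.

    audio_weight is a hypothetical tuning parameter: 1.0 trusts only the
    audio-based classifier, 0.0 trusts only the lyric-based classifier.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    if audio_probs.keys() != lyric_probs.keys():
        raise ValueError("both classifiers must score the same classes")
    return {
        label: audio_weight * audio_probs[label]
        + (1.0 - audio_weight) * lyric_probs[label]
        for label in audio_probs
    }


def predict_label(probs):
    """Return the class with the highest fused probability."""
    return max(probs, key=probs.get)


# Made-up scores for one song: the two sources disagree.
audio_scores = {"happy": 0.6, "sad": 0.4}
lyric_scores = {"happy": 0.2, "sad": 0.8}

fused = late_fusion(audio_scores, lyric_scores, audio_weight=0.5)
label = predict_label(fused)
```

In this toy case the lyric-based scores are more confident than the audio-based ones, so the fused prediction follows the lyrics. Whether such a hybrid is actually better than either single-source system is the empirical matter that Research Question 4 addresses.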
