Improving Music Mood Classification Using Lyrics, Audio and Social Tags


Support Vector Machines (SVM), etc. (Sebastiani, 2002). In text categorization evaluation studies, Naïve Bayes and SVM are almost always considered. Naïve Bayes often serves as a baseline, while SVM tends to achieve the best performance (Yu, 2008). In MIR, music classification studies (mostly on genre classification) often choose KNN and/or decision trees (C4.5) as baselines to be compared to SVM. Results in both existing music classification experiments and MIREX classification tasks have shown that SVM generally, if not always, outperforms other algorithms (e.g., Hu et al., 2008a; Laurier et al., 2008). As this research needs to combine both audio and text sources, SVM is chosen as the classification algorithm.

By design, SVM is a binary classification algorithm. For multi-class classification problems, a number of SVMs have to be learned, each of which predicts the membership of examples in one class. To reduce the chance of overfitting, SVM finds the classification plane between the two classes that maximizes the margin to either class (Burges, 1998). The data instances on the margins are called support vectors, while other instances are considered not to contribute to the classification. SVM classifies a new instance by deciding on which side of the plane the vector of the instance falls.
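The decision rule and the one-versus-rest multi-class scheme described above can be sketched as follows. This is an illustrative toy, not code from the study: the weight vectors, biases, and mood labels are made up, and a real SVM would learn them from data.

```python
# Sketch of the SVM decision rule: a trained linear SVM is a hyperplane
# w.x + b = 0, and a new instance is classified by which side of the
# plane its feature vector falls on, i.e., the sign of w.x + b.

def decision_value(w, b, x):
    """Signed score: positive means one side of the plane, negative the other."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict_binary(w, b, x):
    return 1 if decision_value(w, b, x) >= 0 else -1

def predict_one_vs_rest(models, x):
    """One-vs-rest multi-class: one binary SVM per class; pick the class
    whose hyperplane assigns the largest decision value."""
    # models: {class_label: (w, b)}
    return max(models, key=lambda c: decision_value(*models[c], x))

# Toy 2-D feature space with hypothetical hyperplanes for three mood classes.
models = {
    "happy": ([1.0, 0.0], -0.5),
    "sad":   ([-1.0, 0.0], -0.5),
    "calm":  ([0.0, 1.0], -0.5),
}
print(predict_binary([1.0, 0.0], -0.5, [2.0, 0.0]))  # -> 1
print(predict_one_vs_rest(models, [0.1, 3.0]))       # -> calm
```

Note that only the support vectors determine `w` and `b` during training; at prediction time, as the sketch shows, classifying an instance reduces to a single dot product per class.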

An SVM with a linear kernel means that there exists a straight line in a two-dimensional space that separates one class from another (Figure 4.2). For datasets that are not linearly separable, higher-order kernels are used to project the data into a higher-dimensional space where they become linearly separable. Finding the classification plane involves a complicated quadratic programming computation, and thus SVMs are more computationally expensive than Naïve Bayes classifiers. However, SVMs are very robust to noisy examples, and they can

