
different strengths. Lyric-based systems seem to have an advantage in categories where the words in lyrics have a strong connection to the category, such as "angry" and "romantic." However, future work is needed to make a conclusive claim.

In combining lyrics and music audio, late fusion (linear interpolation with equal weights given to both classifiers) yielded the best performance, and its performance was more stable across mood categories than the other hybrid method, feature concatenation. Both hybrid systems significantly outperformed (at p < 0.05) the audio-only system, which was a top-ranked system on this task. The late fusion system improved the performance of the audio-only system by 9.6%, demonstrating the effectiveness of combining lyrics and audio.
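The late fusion scheme described above amounts to linearly interpolating the per-category probability estimates of the two single-source classifiers, while feature concatenation joins the two feature vectors before a single classifier is trained. The following is a minimal sketch of both strategies, assuming each single-source classifier can output per-category probabilities; all names and values are illustrative, not taken from the thesis.

```python
import numpy as np

def late_fusion(lyric_probs: np.ndarray,
                audio_probs: np.ndarray,
                w_lyrics: float = 0.5) -> np.ndarray:
    """Linear interpolation of the two classifiers' probability estimates.

    With w_lyrics = 0.5 both sources receive equal weight, the setting
    reported to perform best in this work.
    """
    return w_lyrics * lyric_probs + (1.0 - w_lyrics) * audio_probs

def feature_concatenation(lyric_features: np.ndarray,
                          audio_features: np.ndarray) -> np.ndarray:
    """Early fusion: join the two feature vectors and train one classifier."""
    return np.concatenate([lyric_features, audio_features], axis=-1)

# Example: three mood categories, probability estimates from each source.
lyric_probs = np.array([0.2, 0.5, 0.3])
audio_probs = np.array([0.4, 0.4, 0.2])
fused = late_fusion(lyric_probs, audio_probs)
predicted_category = int(np.argmax(fused))  # index of the winning mood category
```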

Experiments on learning curves showed that complementing audio with lyrics could reduce the number of training examples required to achieve the same performance level as single-source-based systems. The audio-only system appeared to have reached its potential, as its performance stopped improving once 80% of all training examples were given. In contrast, the hybrid systems could continue to improve performance if more training examples became available.
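A learning-curve comparison of this kind trains each system on increasing fractions of the training data and tracks accuracy on a held-out set. The sketch below illustrates the procedure with synthetic stand-in features and a generic classifier; the data, model, and fractions are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X_audio = rng.normal(size=(1000, 20))      # stand-in audio features
X_lyrics = rng.normal(size=(1000, 30))     # stand-in lyric features
y = rng.integers(0, 3, size=1000)          # stand-in mood categories

X_hybrid = np.hstack([X_audio, X_lyrics])  # feature concatenation

for name, X in [("audio-only", X_audio), ("hybrid", X_hybrid)]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                              random_state=0)
    for frac in (0.2, 0.4, 0.6, 0.8, 1.0):
        n = int(frac * len(X_tr))          # use only a fraction of training data
        clf = LogisticRegression(max_iter=1000).fit(X_tr[:n], y_tr[:n])
        print(name, frac, clf.score(X_te, y_te))
```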

Combining lyrics and audio can also reduce the demand on the length of audio used by the classifier. Very short audio clips (as short as 5 seconds), when combined with complete lyrics, outperformed single-source-based systems using all available audio or lyrics.

In summary, this research identified music mood categories that reflect the reality of the music listening environment, made advances in lyric affect analysis, improved the effectiveness and efficiency of automatic music mood classification, and thus helped make mood a practical metadata type and access point in music repositories.
