21.01.2014 Views

improving music mood classification using lyrics, audio and social tags


mixed with regard to its usefulness. This dissertation research strives to find out whether and
how the ANEW scores can help classify text sentiment in the lyrics domain.
Besides scores in the three dimensions, for each word ANEW also provides the standard
deviation of the scores in each dimension given by the human subjects. Therefore there are six
values associated with each word in ANEW. For the lyrics of each song, means and standard
deviations for each of these values are calculated for words included in ANEW, which results in
12 features.
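The 12-feature computation described above can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the lexicon `TOY_ANEW`, its scores, and the function name `anew_features` are hypothetical, and two details the text does not specify are assumed here (every matched token occurrence is counted, and the population standard deviation is used).

```python
from statistics import mean, pstdev

# Hypothetical ANEW-style entries with invented scores. Each word maps to six
# values: (valence, arousal, dominance) ratings followed by the standard
# deviation of each rating across human subjects.
TOY_ANEW = {
    "love": (8.0, 6.0, 7.0, 1.0, 3.0, 2.0),
    "pain": (2.0, 6.0, 3.0, 1.0, 2.0, 2.0),
}


def anew_features(tokens, lexicon):
    """Return 12 features: the mean of each of the six ANEW values over the
    matched tokens, followed by the standard deviation of each value."""
    rows = [lexicon[t] for t in tokens if t in lexicon]
    if not rows:
        # Assumption: a song with no matched word gets an all-zero vector.
        return [0.0] * 12
    columns = list(zip(*rows))  # six columns, one per ANEW value
    means = [mean(col) for col in columns]
    # Assumption: population standard deviation (well defined for one word).
    stds = [pstdev(col) for col in columns]
    return means + stds


features = anew_features(["love", "the", "pain"], TOY_ANEW)
```

Here `features[0:6]` holds the six means and `features[6:12]` the six standard deviations; tokens missing from the lexicon, such as "the", are ignored.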

As the original ANEW probably contains too few words to guarantee that every song in the
experiment dataset includes at least one of them, the ANEW word list is expanded using
WordNet. WordNet, as mentioned before, is an English lexicon with marked linguistic
relationships among word senses. It is organized by synsets such that word senses in one synset
are essentially synonyms. Hence, ANEW is expanded by including all words in WordNet that
share a synset with a word in ANEW and giving these words the same scores as
that ANEW word. Again, word senses are not differentiated since ANEW only presents word
forms without specifying which sense is used. After expansion, there are 6,732 words in the
expanded ANEW, which covers all songs in the experiment dataset. That is, every song has nonzero
values in the 12 dimensions. This feature type is denoted as “ANEW.”
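The synset-based expansion can be illustrated with a small sketch. It deliberately uses a hand-made list of synsets instead of a real WordNet interface, so `toy_synsets`, `toy_lexicon`, the scores, and `expand_lexicon` are all hypothetical. The text does not say what happens when a new word shares synsets with several ANEW words; this sketch assumes the first match wins.

```python
def expand_lexicon(lexicon, synsets):
    """Copy each ANEW word's scores to every word sharing a synset with it.

    lexicon: dict mapping word form -> tuple of six ANEW values
    synsets: list of sets of word forms (senses are not differentiated)
    """
    expanded = dict(lexicon)  # original entries are never overwritten
    for synset in synsets:
        # ANEW words found in this synset; sorted so the choice is repeatable.
        sources = sorted(w for w in synset if w in lexicon)
        if not sources:
            continue
        scores = lexicon[sources[0]]
        for word in synset:
            # Assumption: keep the first scores assigned to a new word.
            expanded.setdefault(word, scores)
    return expanded


# Invented scores and synsets, for illustration only.
toy_lexicon = {"happy": (8.0, 6.5, 6.5, 1.5, 2.5, 2.0)}
toy_synsets = [
    {"happy", "glad"},
    {"happy", "felicitous"},
    {"table", "board"},  # no ANEW word here, so nothing is added
]

expanded = expand_lexicon(toy_lexicon, toy_synsets)
```

After the call, "glad" and "felicitous" carry the scores of "happy", while "table" and "board" stay out of the lexicon.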

Like the words from General Inquirer, the 6,732 words in the expanded ANEW can be seen
as a lexicon of affect-related words. Together with the 1,586 unique words in the latest version of
WordNet-Affect, the expanded ANEW forms an affect lexicon of 7,756 unique words. This set
of words is used to build bag-of-words features under the aforementioned four representation
models. This feature type is denoted as “Affect-lex.”
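The reported sizes imply that the two word lists share 6,732 + 1,586 − 7,756 = 562 words. A rough sketch of merging the lists and building a lexicon-restricted bag-of-words vector follows. The function names and toy words are hypothetical, and raw term counts are used only as a stand-in, since the four representation models are defined elsewhere in the dissertation.

```python
from collections import Counter


def build_affect_lexicon(expanded_anew_words, wordnet_affect_words):
    """Union of the two word lists; shared words are counted once."""
    return set(expanded_anew_words) | set(wordnet_affect_words)


def bag_of_words(tokens, vocabulary):
    """Count only tokens in the vocabulary; one dimension per lexicon word,
    in alphabetical order. Raw counts are an assumption of this sketch."""
    ordered = sorted(vocabulary)
    counts = Counter(t for t in tokens if t in vocabulary)
    return [counts[w] for w in ordered]


# Toy word lists, for illustration only.
affect_lexicon = build_affect_lexicon({"happy", "glad"}, {"glad", "sad"})
vector = bag_of_words(["sad", "sad", "the", "happy"], affect_lexicon)
```

With these toy inputs the lexicon has three words (glad, happy, sad), and the out-of-lexicon token "the" contributes nothing to the vector.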
