23.03.2013 Views

Lexicon-Based Methods for Sentiment Analysis - Simon Fraser ...


Computational Linguistics Volume 37, Number 2

Figure 5
Distribution of responses by adjective SO value for Google PMI dictionary, single-word task.

Figure 6
Distribution of responses by adjective SO value for Google PMI dictionary, negative word-pair task.

As compared to the manually ranked SO dictionary, the Google PMI dictionary does not maximize as quickly, suggesting significant error at even fairly high SO values. Interestingly, the graph shows a striking similarity with the manually ranked dictionary in terms of the asymmetry between positive and negative words; negative words are almost never ranked as positive, although the reverse is not true. The neutral curve peaks well into the positive SO range, indicating that neutral and positive words are not well distinguished by the dictionary.[30] Overall, the SO-PMI dictionary correctly predicts 48.5% of the Mechanical Turk rankings in this task, which places it well below the manually ranked adjective dictionary (73.7%).
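To make the correspondence percentages concrete, here is a minimal sketch of how such a score could be computed for the single-word task. It is not the authors' procedure: this excerpt does not spell out the exact scoring, so the sketch assumes that the dictionary's prediction is simply the sign of a word's SO value (zero meaning neutral) and that correspondence is the share of individual annotator responses that match that prediction. The function names, the `neutral_band` parameter, and the toy data are all hypothetical.

```python
from typing import Dict, List


def predict_polarity(so_value: float, neutral_band: float = 0.0) -> str:
    """Map an SO value to a label. `neutral_band` is a hypothetical threshold."""
    if so_value > neutral_band:
        return "positive"
    if so_value < -neutral_band:
        return "negative"
    return "neutral"


def mt_correspondence(so_dict: Dict[str, float],
                      responses: Dict[str, List[str]]) -> float:
    """Fraction of annotator responses that match the dictionary's predicted label."""
    matches = 0
    total = 0
    for word, labels in responses.items():
        if word not in so_dict:
            continue  # skip words the dictionary does not cover
        predicted = predict_polarity(so_dict[word])
        matches += sum(1 for label in labels if label == predicted)
        total += len(labels)
    return matches / total if total else 0.0


# Made-up toy data, for illustration only (not from the paper).
so_dict = {"good": 2.0, "awful": -4.0, "average": 1.0}
responses = {
    "good": ["positive", "positive", "neutral"],
    "awful": ["negative", "negative", "negative"],
    "average": ["neutral", "neutral", "positive"],
}

score = mt_correspondence(so_dict, responses)
print(f"{score:.1%}")  # prints 66.7%
```

In the toy data, "average" carries a positive SO value but is mostly judged neutral by the annotators, which is the kind of neutral/positive confusion that lowers the score for an automatically built dictionary.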

Figure 6 shows the results for the negative adjective comparison task using the Google PMI dictionary. Here, the Google PMI dictionary performs fairly well, comparable to the manual rankings, though the overall MT correspondence is somewhat lower (47% versus 64%). This is partially due to bunching in the middle of the scale. Recall that the highest possible MT correspondence for this task is 76.8%. MT correspondence of

[30] Distinguishing neutral and polar terms, sentences, or texts is, in general, a hard problem (Wilson, Wiebe, and Hwa 2004; Pang and Lee 2005).
