Lexicon-Based Methods for Sentiment Analysis - Simon Fraser ...

Taboada et al., Lexicon-Based Methods for Sentiment Analysis. Computational Linguistics, Volume 37, Number 2.

Figure 5. Distribution of responses by adjective SO value for Google PMI dictionary, single-word task.

Figure 6. Distribution of responses by adjective SO value for Google PMI dictionary, negative word-pair task.

As compared to the manually ranked SO dictionary, the Google PMI dictionary does not maximize as quickly, suggesting significant error at even fairly high SO values. Interestingly, the graph shows a striking similarity with the manually ranked dictionary in terms of the asymmetry between positive and negative words; negative words are almost never ranked as positive, although the reverse is not true. The neutral curve peaks well into the positive SO range, indicating that neutral and positive words are not well distinguished by the dictionary.[30] Overall, the SO-PMI dictionary correctly predicts 48.5% of the Mechanical Turk rankings in this task, which places it well below the manually ranked adjective dictionary (73.7%).

[30] Distinguishing neutral and polar terms, sentences, or texts is, in general, a hard problem (Wilson, Wiebe, and Hwa 2004; Pang and Lee 2005).

Figure 6 shows the results for the negative adjective comparison task using the Google PMI dictionary. Here, the Google PMI dictionary performs fairly well, comparable to the manual rankings, though the overall MT correspondence is somewhat lower (47% versus 64%). This is partially due to bunching in the middle of the scale. Recall that the highest possible MT correspondence for this task is 76.8%. MT correspondence of nearly 55% is possible if the number of buckets is increased significantly, an effect which is due at least partially to the fact that the "same" designation is so underused that it is generally preferable to always guess that one of the adjectives is stronger than the other. Along with the results in the previous figure, this suggests that this method actually performs fairly well at distinguishing the strength of negative adjectives; the problem with automated methods in general seems to be that they have difficulty properly distinguishing neutral and positive terms.

Our next comparison is with the Subjectivity dictionary of Wilson, Wiebe, and Hoffmann (2005). Words are rated for polarity (positive or negative) and strength (weak or strong), meaning that their scale is much more coarse-grained than ours. The dictionary is derived from both manual and automatic sources. It is fairly comprehensive (over 8,000 entries), so we assume that any word not mentioned in the dictionary is neutral. Figure 7 shows the result for the single-word task. The curves are comparable to those in Figure 1; the neutral peak is significantly lower, however, and the positive and negative curves do not reach their maximum. This is exactly what we would expect if words of varying strength are being collapsed into a single category. The overall MT correspondence, however, is comparable (71.8%). The negative adjective pair comparison task (shown in Figure 8) provides further evidence for this (Strong/Weak means a weak negative word compared with a strong negative word). The MT correspondence is only 48.7% in this task. There is a clear preference for the predicted judgment in weak/strong comparisons, although the distinction is far from unequivocal, and the overall change in neutrality across the options is minimal.
This may be partially attributed to the fact that the strong/weak designation for this dictionary is defined in terms of whether the word strongly or weakly indicates subjectivity, not whether the term itself is strong or weak (a subtle distinction). However, the results suggest that the scale is too coarse to capture the full range of semantic orientation.

Another publicly available corpus is SentiWordNet (Esuli and Sebastiani 2006; Baccianella, Esuli, and Sebastiani 2010), an extension of WordNet (Fellbaum 1998) where each synset is annotated with labels indicating how objective, positive, and negative the terms in the synset are. We use the average across senses for each word given in version 3.0 (see discussion in the next section). Figure 9 gives the result for the

Figure 7. Distribution of responses by adjective SO value for Subjectivity dictionary, single-word task.
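The averaging across senses mentioned for SentiWordNet can be illustrated with a minimal Python sketch. It assumes the tab-separated layout of the SentiWordNet 3.0 distribution file (POS, ID, PosScore, NegScore, SynsetTerms, Gloss, with terms written as word#sense). Collapsing each synset to a single value as PosScore minus NegScore is an assumption made here for illustration, not something the text specifies. The function name and the sample rows are hypothetical; the rows are invented toy data, not real SentiWordNet entries.

```python
from collections import defaultdict


def average_sense_scores(lines, pos_tag="a"):
    """Return {word: mean score over all of its senses} for one part of speech.

    Assumption: a synset's score is PosScore - NegScore. The source only
    says that scores are averaged across senses.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines (which start with '#').
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            continue
        pos, _synset_id, pos_score, neg_score, terms = fields[:5]
        if pos != pos_tag:
            continue
        score = float(pos_score) - float(neg_score)
        # Each term looks like "good#1"; strip the sense number.
        for term in terms.split():
            word = term.rsplit("#", 1)[0]
            totals[word] += score
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}


# Invented toy rows in the assumed layout (not real SentiWordNet data).
SAMPLE = [
    "# POS\tID\tPosScore\tNegScore\tSynsetTerms\tGloss",
    "a\t00000001\t0.75\t0\tgood#1 fine#2\ttoy gloss",
    "a\t00000002\t0.25\t0.25\tgood#2\ttoy gloss",
    "a\t00000003\t0\t0.5\tpoor#1\ttoy gloss",
    "n\t00000004\t0.5\t0\tgood#1\ttoy gloss",
]

scores = average_sense_scores(SAMPLE)
```

In this toy input the adjective "good" has two senses, so its score is the mean of the two synset values, while the noun row is ignored because only adjectives are requested.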