Automatic Extraction of Examples for Word Sense Disambiguation

CHAPTER 4. EVALUATION OF WSD SYSTEMS

measure is precision. It is the ratio of the number of correct answers given by the system to the number of all answers given:

\[
\text{precision} = \frac{|\text{correct answers} \cap \text{given answers}|}{|\text{given answers}|} \tag{4.3}
\]

Recall is the ratio of the number of correct answers given by the system to the number of all correct answers:

\[
\text{recall} = \frac{|\text{correct answers} \cap \text{given answers}|}{|\text{correct answers}|} \tag{4.4}
\]

In order to capture precision and recall together, their harmonic mean, also called F-score (which represents the overall system performance), is used:

\[
F = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} \tag{4.5}
\]

4.2 International Evaluation Exercise Senseval

The main purpose of the open, community-based evaluation exercise Senseval is to assess the existing approaches to word sense disambiguation and to coordinate and conduct the various evaluation competitions and accompanying activities. Its aim is to measure the competence of WSD systems with regard to various target words, various languages and various aspects of language, as well as to advance the understanding of lexical semantics and polysemy.

4.3 Senseval-1

The first of the Senseval evaluation competitions, Senseval-1 (Kilgarriff, 1998; Kilgarriff and Palmer, 2000; Kilgarriff and Rosenzweig, 2000), was conducted in 1998, with an associated workshop held at Herstmonceux Castle, Sussex, England. As Palmer et al. (2007) report, it was designed only for English, although a parallel task (Romanseval (Segond, 2000; Calzolari and Corazzari, 2000)) was run for Italian and French. The Hector lexicon (see Section 2.3.2 or (Atkins, 1993)) was used as the sense inventory for the task, and thirty-four words were targeted, selected randomly under particular criteria (POS: noun, verb, adjective; number of senses; frequency). The data was manually annotated by professional lexicographers with a quite solid inter-annotator agreement (IAA) of about 80%. For Senseval-1 the lexical sample task was chosen over the all-words task.
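To make the three measures concrete, here is a minimal sketch in Python. The function name and the toy gold/predicted answer sets are illustrative, not taken from the thesis; the computation follows equations (4.3)-(4.5) directly:

```python
def wsd_scores(gold, predicted):
    """Precision, recall and F-score for a WSD system's answers.

    gold:      dict mapping instance id -> correct sense label
    predicted: dict mapping instance id -> sense given by the system
               (instances the system left unanswered are simply absent)
    """
    # |correct answers ∩ given answers|: given answers that match the gold sense
    correct = sum(1 for i, sense in predicted.items() if gold.get(i) == sense)
    precision = correct / len(predicted) if predicted else 0.0  # eq. (4.3)
    recall = correct / len(gold) if gold else 0.0               # eq. (4.4)
    # harmonic mean of precision and recall, eq. (4.5)
    f = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f


# Toy example: four gold-annotated instances, the system answers only three.
gold = {"art.1": "art%1", "art.2": "art%2", "art.3": "art%1", "art.4": "art%2"}
pred = {"art.1": "art%1", "art.2": "art%1", "art.3": "art%1"}  # one instance skipped
p, r, f = wsd_scores(gold, pred)
# p = 2/3, r = 2/4 = 0.5, F = 4/7 ≈ 0.571
```

Note that unanswered instances count against recall but not against precision, which is why a system can trade coverage for precision; the F-score penalizes such trade-offs.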
In the all-words task the competing systems would need to target all the words in the data set (i.e. all open-class words in the corpora), whereas in the lexical sample task a single sample of words is selected and corpus examples for the targeted words are extracted. Kilgarriff and Rosenzweig (2000) justify the choice of the lexical sample task for the competition with several arguments: first, more efficient human tagging can be achieved for it; second, no free full dictionary (necessary for the all-words task) was available; third, some of the systems could not take part in an all-words task, since they need either sense-tagged data or manual input for each dictionary entry.

In total there were 8,448 training instances distributed among 15 nouns, 13 verbs, 8 adjectives and 5 indeterminates. An overview of the figures can be seen in Table 4.1.

  Table 4.1: Summary of the Senseval-1 evaluation task.

  Date                    1998
  Task                    Lexical sample
  Languages               English, French, Italian
  Sense inventory         Hector (Atkins, 1993)
  Nouns                   15
  Verbs                   13
  Adjectives              8
  Indeterminates          5
  Training instances      8,448
  Participating systems   24

4.4 Senseval-2

The results from Senseval-1 proved that an evaluation exercise of this kind is extremely helpful for training WSD systems. Thus, the decision to continue the open, community-based competition with a new edition, Senseval-2, was realized in 2001; it was concluded by SENSEVAL-2: Second International Workshop on Evaluating Word Sense Disambiguation Systems, held in Toulouse, France, as well as by the workshop Word Sense Disambiguation: Recent Successes and Future Directions, held in 2002 in Philadelphia, USA. In this edition the systems could participate not only in the lexical sample task but also in the all-words task. For the sake of clarity we will look at those tasks separately further on, but since our interest in the Senseval-2 competition is mainly historical, we will not present any results from the tasks; these can be found directly on the web page provided by Senseval.

The all-words task was developed for four languages: Czech, Dutch, English and Estonian. It is specific in that the systems have to disambiguate all the different ambiguous words in the text.
