5 years ago

Automatic Extraction of Examples for Word Sense Disambiguation

Automatic Extraction of Examples for Word Sense Disambiguation


CHAPTER 2. BASIC APPROACHES TO WORD SENSE DISAMBIGUATION 23 Swedish corpus - 179 151 instances with the Gothenburg lexical database as inventory and the SUC Corpus as its corpus. Image captions - 2 304 words from image captions of an image collection and WordNet as source inventory. All this shows how immense the efforts to solve this exceptionally important problem (the knowledge acquisition bottleneck) are and how despite this considerable attempt the biggest obstacle for supervised WSD is still on the way. It will surely employ further on a great deal of commission but until then we could continue discussing the process of WSD with a supervised system and the next step which should be taken from deciding on a sense inventory and having semantically annotated corpus is the data preprocessing. 2.3.3 Data Preprocessing Having chosen the corpora and extracted the needed parts of it, however, does not already mean that we are ready to go ahead and train a classifier. There is another intermediate step which is not necessarily difficult but in most cases quite complex. It consists of data preprocessing and preparation for the extraction of the training and test examples. The former is believed to have a significant impact on the performance of supervised WSD systems. It conditions the quality of the discovered patterns and makes it possible to find useful information from initial data. The most basic preparation of the data is the filtering of any unwanted special or meta characters, which are normally existent in large corpora or the removal of exceptional punctuation marks. Such ”noise” in the data does not only lead to decrease in accuracy, but depending on the further processing and the software that is used it might even lead to other problems (e.g. part-of-speech (POS) tagger (usually used further in the process) not handling special punctuation characters). The second part in this process is POS tagging. Depending on the decision that is made about the information which needs to be contained in the examples, multi-words (a multi-word expression is a lexeme build up of a sequence of at least two lexemes that has properties that are not predictable from the properties of the individual lexemes), lexical dependencies and n-gram (sub-sequence of n items, in our case words, from a given sequence) detection might be necessary as well. When the above mentioned basic preprocessing and POS tagging are done, the data can be finally used for extraction of examples needed for training and testing of the classifier (also called word-expert). This step is further described in the following section. 2.3.4 Feature Vectors The examples in the training set are constructed as feature vectors (FV). Feature vectors are fixed-length n-dimensional vectors of features (where n is the number of selected attributes/features)

CHAPTER 2. BASIC APPROACHES TO WORD SENSE DISAMBIGUATION 24 that represent the relevant information about the example they refer to. Those attributes are predefined and picked up uniformly for each instance so that each vector has the same number and type of features. There are three types of features from which the selection can be made - local features, global features and syntactic dependencies. Local features describe the narrow context around the target word - bigrams, trigrams (other n-grams) that are generally consti- tuted by lemmas, POS tags, word-forms together with their positions with respect to the target word. Global features on the other hand consider not only the local context around the word that is being disambiguated, but the whole context (whole sentences, paragraphs, documents) aiming to describe best the semantic domain in which the target word is found. The third type of features that can be extracted are the syntactic dependencies. They are normally extracted at a sentence level using heuristic patterns and regular expressions in order to gain a better representation for the syntactic cues and argument-head relations such as subject, object, noun- modifier, preposition etc. Commonly, features are selected out of a predefined pool of features (PF), which consists out of the most relevant features for the approached task. We compiled such PF (see Appendix B) being mostly influenced for our choice by (Dinu and Kübler, 2007) and (Mihalcea, 2002). Let us consider a simple example. Given the sentences in (2) we will attempt to create FVs that represent the occurrences of the noun bank in those sentences. (2) 1. ”He sat on the bank of the river and watched the currents.” 2. ”He operated a bank of switches.” For the sake of simplicity at this point we chose a minimum number of features from the pre- defined pool of features. The corresponding set consists only of local features (cw−2, cw−1, cw0, cw+1, cw+2, cp−2, cp−1, cp0, cp+1, cp+2). The resulting vectors thus can be seen in Table 2.4 below. Since we do not explain in detail exactly how those vectors are put together, please refer to Chapter 6 where we follow stepwise the process of FV construction. # cw−2 cw−1 cw0 cw+1 cw+2 cp−2 cp−1 cp0 cp+1 cp+2 class 1. on the bank of the P DET N P DET bank%1:17:01:: 2. operated a bank of switches V DET N P N bank%1:14:01:: Table 2.4: Feature vectors for the sentences in (2). As we have already noted, supervised learning is based on the training data that has been manually annotated beforehand while unsupervised learning does not need such predefined sense inventory. Thus, a very important part of the FV is the class label that represents the classification of this vector (this is the value of the output space that is associated with a train- ing example) which in our case is to be found in the class column. As its value we have selected the correct sense of the noun bank in the sentence represented by the sense key that this sense

A Machine Learning Approach for Automatic Road Extraction - asprs
Selective Sampling for Example-based Word Sense Disambiguation
Word sense disambiguation with pattern learning and automatic ...
Word Sense Disambiguation Using Automatically Acquired Verbal ...
Using Machine Learning Algorithms for Word Sense Disambiguation ...
Word Sense Disambiguation The problem of WSD - PEOPLE
Performance Metrics for Word Sense Disambiguation
Word Sense Disambiguation - cs547pa1
word sense disambiguation and recognizing textual entailment with ...
MRD-based Word Sense Disambiguation - the Association for ...
Word Sense Disambiguation Using Selectional Restriction -
Using unsupervised word sense disambiguation to ... - INESC-ID
Using Lexicon Definitions and Internet to Disambiguate Word Senses
Word Sense Disambiguation is Fundamentally Multidimensional
A Comparative Evaluation of Word Sense Disambiguation Algorithms
KU: Word Sense Disambiguation by Substitution - Deniz Yuret's ...
Semi-supervised Word Sense Disambiguation ... - ResearchGate
Using Meaning Aspects for Word Sense Disambiguation
Word Sense Disambiguation: An Empirical Survey - International ...
Word-Sense Disambiguation for Machine Translation
Towards Word Sense Disambiguation of Polish - Proceedings of the ...
Word Sense Disambiguation Using Association Rules: A Survey
Unsupervised learning of word sense disambiguation rules ... - CLAIR
Similarity-based Word Sense Disambiguation
Word Sense Disambiguation with Pictures - CLAIR
Word Sense Disambiguation with Pictures - CiteSeerX