
Automatic Extraction of Examples for Word Sense Disambiguation



CHAPTER 2. BASIC APPROACHES TO WORD SENSE DISAMBIGUATION

table.", "This truck eats a lot of fuel." or "We ate our savings.", etc. Normally, however, the biggest problem with selectional preferences is their circular relation with WSD: to determine the correct semantic constraint one needs to have an idea about the senses of the target words, while at the same time WSD improves considerably if a sufficiently large set of selectional relations is available.

Apart from the knowledge-based methods discussed so far, there are also some heuristic approaches, for example the most frequent sense (MFS) heuristic (see Section 3.2.2). This method is built around the idea that the sense of the target word with the highest frequency among all of its senses is always chosen, regardless of the context in which the word occurs. Unlike the other knowledge-based methods, the heuristic methods are relatively easy to implement and fast to apply to data of a bigger size. It has often been noted (e.g. Mihalcea (2007)) that both the Lesk algorithm and the selectional preferences algorithm can become excessively computationally expensive if more than a few words are being disambiguated. Moreover, the MFS heuristic is also usually used as a baseline in most of the evaluation exercises for WSD systems (see Chapter 4).

One of the biggest advantages of knowledge-based methods compared with corpus-based methods is that, despite their poor performance in terms of accuracy (the percentage of labeled words correctly disambiguated), they can be applied to unrestricted (not domain-specific) texts and an unlimited number of target words, regardless of whether manually annotated sense-tagged corpora exist.

2.2 Unsupervised Corpus-Based

The nature of unsupervised WSD is closely related to the problem of density estimation in statistics.
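The MFS heuristic described above is simple enough to sketch directly. The following is a minimal illustration, not an implementation from the literature; the sense inventory and frequency counts are hypothetical toy values standing in for counts that would normally come from a sense-tagged corpus such as SemCor.

```python
# Most Frequent Sense (MFS) baseline: for every occurrence of a target
# word, pick the sense with the highest corpus frequency, ignoring the
# context entirely. The inventory below is a hypothetical toy example.

SENSE_FREQUENCIES = {
    "line": {"line%cord": 12, "line%telephone": 35, "line%queue": 8},
    "bank": {"bank%institution": 40, "bank%riverside": 5},
}

def mfs_baseline(word, sense_frequencies=SENSE_FREQUENCIES):
    """Return the most frequent sense of `word`, or None if unknown."""
    senses = sense_frequencies.get(word)
    if not senses:
        return None
    return max(senses, key=senses.get)

print(mfs_baseline("line"))  # line%telephone
print(mfs_baseline("bank"))  # bank%institution
```

Because the choice never depends on context, the baseline is trivially fast, which is exactly why it is the standard point of comparison in WSD evaluation exercises.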
In the process of disambiguation, the aim of an unsupervised method is to discover the patterns and structures in a predefined data source that has not been manually annotated beforehand. It is exactly this ability to work with data that does not require extremely expensive manual annotation that makes the approach so appealing, and there is already a considerable amount of research along these lines: (Schütze, 1998), (Ramakrishnan et al., 2004), (Niu et al., 2004a), (Preiss, 2004), (Litkowski, 2004a), (Buscaldi et al., 2004), (Seo et al., 2004), (Pedersen, 2004), etc.

Unsupervised corpus-based algorithms do not directly assign a sense to a target word but rather try to distinguish all possible senses of the given word based on the information they can gain from unannotated corpora, and then discriminate among those senses. Hence, the process does not depend on the existence of predefined sense inventories for the target words. As a result, unsupervised corpus-based methods can provide us with sense inventories that are much more "tuned" to particular domains, which is of great help for applications such as Machine Translation.

There are two fundamental approaches to unsupervised corpus-based word sense disambiguation: distributional and translational equivalence approaches. Both are considered knowledge-lean approaches because they depend only on existing, not necessarily annotated, monolingual corpora or on word-aligned parallel corpora.

2.2.1 Distributional Methods

Distributional methods rely more on statistics than on lexicology. They do not search through the predefined senses of the target word and choose the one that fits best, but rather cluster words that occur in similar environments, without taking into consideration any of the established sense inventories for those words. As a result, there is no limit to the number of created clusters, which in turn determines the granularity of the senses the target words acquire. Pedersen (2007) describes the distributional method as an automated and idealized view of the work lexicographers do when trying to find a good definition of a word. In this case, however, a good definition is relative to the corpus from which it has been extracted. This means that the clusters will only correspond to the senses that are present in the corpus itself and not to all the senses that the target word actually has.

Apart from discriminating among senses of different granularity, there is another issue of significant importance to distributional methods: the automatic labeling of the clusters. It has often been noted (e.g. Pedersen (2007)) that this task is as difficult as it is significant. One possible solution, as Pedersen (2007) shows, is to extract a separate set of words related to each created cluster and to use it in place of a normal definition. Such words, which capture the contextual similarity of the clusters, are acquired by type-based methods of discrimination.
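The idea of clustering occurrences by contextual similarity and then labelling each cluster with its characteristic words can be sketched as follows. This is a deliberately simplified illustration, not any particular published method: the toy "corpus" is a handful of hand-written context sets for the word line, the similarity measure is plain Jaccard overlap, and the clustering is a single greedy pass with an arbitrarily chosen threshold.

```python
# Toy sketch of distributional sense discrimination: occurrences of the
# target word "line" are grouped by the overlap of their context words,
# and each resulting cluster is labelled with its most frequent context
# words (a stand-in for type-based cluster labelling).
from collections import Counter

# Each item holds the context words of one occurrence of "line".
occurrences = [
    {"tied", "cord", "rope", "cable"},
    {"cable", "rope", "knot"},
    {"telephone", "call", "busy"},
    {"call", "telephone", "dial"},
]

def jaccard(a, b):
    """Overlap of two context-word sets, in [0, 1]."""
    return len(a & b) / len(a | b)

def cluster(contexts, threshold=0.2):
    """Greedy one-pass clustering: join the first cluster that contains
    a sufficiently similar context, else start a new cluster."""
    clusters = []
    for ctx in contexts:
        for cl in clusters:
            if any(jaccard(ctx, other) >= threshold for other in cl):
                cl.append(ctx)
                break
        else:
            clusters.append([ctx])
    return clusters

def label(cluster_, top=3):
    """Label a cluster with its most frequent context words
    (ties broken alphabetically, for determinism)."""
    counts = Counter(w for ctx in cluster_ for w in ctx)
    return sorted(counts, key=lambda w: (-counts[w], w))[:top]

for cl in cluster(occurrences):
    print(label(cl))
# ['cable', 'rope', 'cord']
# ['call', 'telephone', 'busy']
```

The two clusters that emerge correspond to the rope-like and communication senses of line, and the extracted label words play the role of the definition-substituting word sets described above.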
Since it is not our primary aim to describe the distributional methods for unsupervised WSD in depth, please refer to (Pedersen, 2007) for more information. However, to illustrate the process, let us consider the word line. If we have two separate clusters, one representing the rope-like sense and one connected with communication, the type-based methods will give us words that can be used as labels of the clusters, such as (cord, line, tie, cable) for the first, rope-like cluster and (telephone, line, call) for the communication cluster.

2.2.2 Translational Equivalence Methods

Translational equivalence methods, as their name suggests, involve the translation of the target word into another language. For this purpose, parallel corpora (collections of texts placed alongside their translations) come in very handy. Translational equivalence methods extract the translations of the target word from the parallel corpora and thus create its sense inventory. In other words, the sense inventory of a target word is built up of all the translations of this target word in the parallel corpora. This approach is believed to be very useful for many purposes, e.g. extraction of sense inventories that are more adequate to specific
