Views
3 years ago

BOOTSTRAPPING IN WORD SENSE DISAMBIGUATION

BOOTSTRAPPING IN WORD SENSE DISAMBIGUATION

BOOTSTRAPPING IN WORD SENSE

BOOTSTRAPPING IN WORD SENSE DISAMBIGUATION DOINA TĂTAR Abstract. The task of disambiguation is to determine which of the senses of an ambiguous word is invoked in a particular use of the word: Allen (1995), Manning, Schutze (1999), Tătar (2005). In this paper we present one algorithm (Tătar, Şerban 2001) which combines Yarowsky’s principles (Yarowsky 1999) (one sense per discourse and one sense per collocation) and the Naive Bayes Classifier method. 1. INTRODUCTION The word sense disambiguation (WSD) is probably one of the most important open problem and it has now already a long “history” in computational linguistics. WSD is the process of identifying the correct meanings of words in particular contexts. More exactly, the problem that arise in a natural language is that many words present the phenomenon of polysemy that means they have several meanings or senses. That sense depends on what context they occur. The task of disambiguation is to determine which of the senses of an ambiguous word is invoked in a particular use of the word. Whenever a system’s actions depend on the meaning of the text being processed, WSD is necessary. WSD has direct applications in some fields of text understanding as information retrieval, text summarization, machine translation (Orăşan et al. 2003). It is only an intermediate task in Natural Language Processing (NLP), like POS tagging or parsing. The algorithms used in WSD are classified to whether they involve supervised or unsupervised learning. Unsupervised learning can be viewed as clustering task while supervised learning is usually seen as a classification task. Dictionary based disambiguation, can be considered as intermediary between supervised and unsupervised disambiguation. The algorithm presented in this paper is a bootstrapping one, that means it repeatedly convert training corpus and test corpus, during some specific actions. We also remember in this paper the k-NN algorithm, as used in one international on-line contest on WSD (Şerban, Tătar 2004). Notational conventions used in the following are (Manning, Schutze 1999, Tătar, Şerban 2001, Tătar 2003): w-- the word to be disambiguated (target word); s1, … ,sN1 --- possible senses for w; c1, … ,cN2 --- contexts of w in corpus; v1, … ,vN3 --- words used as contextual features for disambiguation of w. RRL, LI, 3–4, p. 415–420, Bucureşti, 2006

Towards clustering-based word sense discrimination - Lugos
Word Sense Disambiguation - cs547pa1
Word sense disambiguation with pattern learning and automatic ...
Soft Word Sense Disambiguation
Subjectivity Word Sense Disambiguation - Department of Computer ...
Towards Word Sense Disambiguation of Polish - Proceedings of the ...
Word Sense Disambiguation The problem of WSD - PEOPLE
Similarity-based Word Sense Disambiguation
Lexical Semantics & Word Sense Disambiguation Lexical Semantics ...
Combining classifiers for word sense disambiguation
Knowledge-based biomedical word sense disambiguation ...
Using unsupervised word sense disambiguation to ... - INESC-ID
Using Machine Learning Algorithms for Word Sense Disambiguation ...
Word Sense Disambiguation Using Label Propagation ... - WING
Chapter 1 Word Sense Disambiguation: Literature Survey ... - cfilt
Word Sense Disambiguation with a Corpus-based Semantic Network
Word Sense Disambiguation Using Selectional Restriction - Ijsrp.org
Towards a hybrid approach to word-sense disambiguation in ...
Nominal Taxonomies and Word Sense Disambiguation - Machine ...
The Interaction of Knowledge Sources in Word Sense Disambiguation
Word Sense Disambiguation: An Empirical Survey - International ...
Word-Sense Disambiguation for Machine Translation
Class-based Collocations for Word-Sense Disambiguation
Unsupervised Word Sense Disambiguation - cfilt - Indian Institute of ...
Word Sense Disambiguation Using Association Rules: A Survey
Using WordNet for Word Sense Disambiguation to Support Concept ...
Using LazyBoosting for Word Sense Disambiguation - TALP - UPC
Performance Metrics for Word Sense Disambiguation