Automatic Extraction of Examples for Word Sense Disambiguation
CHAPTER 6. AUTOMATIC EXTRACTION OF EXAMPLES FOR WSD 69

word. Be that as it may, looking closely at Tables 7.1 and 7.4 we can note a variation of up to 4% as a result of those very few added examples. This means that even with a very small number of instances, carefully selected from a huge pool of examples, we obtain a considerable variation, and in some cases individual word experts improve by up to 4% in accuracy.

feature selection    coarse P   coarse R   fine P   fine R
all features         67.0       67.0       62.3     62.3
forward              76.9       76.9       72.4     72.4
backward             73.0       73.0       68.9     68.9
MFS                  64.5       64.5       55.2     55.2
best performance     78.5       78.5       73.9     73.9

Table 6.13: System performance on automatically annotated data without optimization of the k parameter and distance and distribution filtering mixed with the manually annotated training set.

feature selection    coarse P   coarse R   fine P   fine R
all features         70.5       70.5       64.9     64.9
forward              78.7       78.7       74.0     74.0
backward             77.6       77.6       73.0     73.0
MFS                  64.5       64.5       55.2     55.2
best performance     79.4       79.4       75.0     75.0

Table 6.14: System performance on automatically annotated data with optimization of the k parameter and distance and distribution filtering mixed with the manually annotated training set.

Of course, extending this training set further by manual annotation would require a great deal of expensive human work. Our approach, however, can be used for this purpose. As we saw, from our original automatically collected training set of about 170 000 instances we managed to extract only 141 examples at the shortest similarity distance, and these bring an improvement in accuracy. Moreover, those 141 examples, even though they are very similar to the ones we already have, always bring new information along. Consequently, this new information makes the classifier less sensitive to new examples.
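The distance-based filtering of automatically acquired examples can be sketched as follows. This is a minimal illustration only: the bag-of-words context vectors, the cosine distance, and the threshold value are assumptions for the sake of the example, not the exact feature representation, metric, or cut-off used in the experiments above.

```python
# Sketch: keep only those automatically annotated examples that lie at a
# short similarity distance from the manually annotated training set.
# All names, the metric and the threshold are illustrative assumptions.
import math

def cosine_distance(a, b):
    """Cosine distance between two bag-of-words dicts (0.0 = identical)."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    if na == 0 or nb == 0:
        return 1.0
    return 1.0 - dot / (na * nb)

def filter_by_distance(auto_examples, manual_examples, threshold=0.5):
    """Keep automatically annotated (vector, sense) pairs whose distance
    to the closest manually annotated example is below the threshold."""
    kept = []
    for vec, sense in auto_examples:
        nearest = min(cosine_distance(vec, m_vec)
                      for m_vec, _ in manual_examples)
        if nearest < threshold:
            kept.append((vec, sense))
    return kept

# Toy usage: two manual examples, three automatic candidates.
manual = [({"bank": 2, "money": 1}, "finance"),
          ({"river": 1, "bank": 1}, "shore")]
auto = [({"bank": 1, "money": 2}, "finance"),  # close to a manual example
        ({"bank": 1, "loan": 1}, "finance"),   # moderately close
        ({"pelican": 3}, "shore")]             # shares no features: dropped
selected = filter_by_distance(auto, manual, threshold=0.5)
```

The examples that survive this filter are then simply appended to the manually annotated training set before the word experts are retrained.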
As a result we can conclude that adding automatically acquired examples to the manually prepared data set helps, and that our approach reached such good results via an easy and computationally "cheap" method for gaining new training instances for WSD systems.

6.8.4 Discussion

The experiments we conducted and described in the previous few sections show that our system performs competitively with other state-of-the-art systems. The data that we used, however, has several
advantages but also some disadvantages, which can be clearly seen in the results of the experiments and which we will briefly discuss in this section. Since we used automatically prepared data together with the manually annotated one, we had the chance to observe the following issues:

The manually annotated data - disadvantages:

• The preparation of the data implies an extremely laborious, time-consuming and expensive process of human annotation, which can hardly be accomplished in little time.
• The amount of data is very small, which is a consequence of its costly preparation.
• As a result of the manual annotation, the nature of the training data is optimally adapted to the nature of the test data:
  – proportional distribution of the occurring senses
  – very similar context lengths
  – coverage of only a subset of the actual senses of the word
• The issue above leaves hardly any possibility for improvement, except through carefully and precisely filtered external data.
• The preparation of such data also implies predefined semantic resources, which:
  – are not available for all languages
  – differ considerably in quality
  – often provide a sense granularity different from the one needed for the task

The manually annotated data - advantages:

• As a result of its careful preparation, the manually annotated data leads to exceptionally good system performance, outperforming unsupervised and semi-supervised methods.
• It employs only well-selected senses that are predefined in the training data.
• It is extremely homogeneous, which ensures good performance of the system.

The automatically annotated data - disadvantages:

• Used without any additional processing, it can lead to poor performance of the WSD system.
• The derived senses for the target word are specific to the corpus they are extracted from.
• It is never as homogeneous as manually prepared data and can thus lead to poorer results.