CHAPTER 4. EVALUATION OF WSD SYSTEMS 43

designed as a successor to Senseval, namely Semeval. The name was changed in an attempt to extend the scope of the exercise to all aspects of computational semantic analysis of language. The scope of the first Semeval exercise, Semeval-1 (Agirre et al., 2007), was constrained to a particular application of semantic relation classification, namely relational search. Semeval-1 included 18 different tasks targeting the evaluation of systems for the semantic analysis of text. The relations considered held between nominals (e.g. nouns and base noun phrases, excluding named entities). The data used consisted of manually annotated sentences. Semeval-1 took place in 2007 and was followed by a workshop held in conjunction with ACL in Prague (Agirre et al., 2007).

4.7 Summary

Having discussed the problem of comparability of word sense disambiguation systems in Chapter 3, in the current chapter we gave an overview of the ways in which those systems can be evaluated uniformly, as well as of the variety and coverage of the approaches developed.

The evaluation exercises conducted within Senseval show that recent WSD systems achieve good accuracy levels, which for some tasks are even comparable with human performance. The diversity of tasks, which grew with each Senseval exercise, also showed that system performance remains relatively consistent over a variety of word types, frequencies and sense distributions.

As Palmer et al. (2007) discuss, many open problems remain in the evaluation of WSD systems. One such problem is the choice of sense inventory, which, as the results above showed, leads to noticeably inconsistent performance of both humans and automatic systems. Another important open question is the impact of more training data for highly polysemous verbs.
Chapter 5

TiMBL: Tilburg Memory-Based Learner

In Section 2.3.5 of our work we reviewed the supervised methods for word sense disambiguation and paid most attention to the similarity-based methods, also called memory-based methods. Since we have chosen to use a method of this family, we also needed software that facilitates the use of such methods. Our choice is the Tilburg Memory-Based Learner (TiMBL).

5.1 Overview

TiMBL¹ was the outcome of combining ideas from various MBL approaches. It is a fast, discrete, decision-tree-based implementation of the k-nearest neighbor classification algorithm, described in more detail in Daelemans et al. (2007). As a result, it has become an important and useful natural language processing tool in a range of domains. This comes from the fact that TiMBL, following the principles of memory-based learning, is built on the belief that intelligent behavior can be achieved by analogical reasoning rather than by the use of abstract mental rules. Extrapolating behavior from stored representations of earlier experience to new situations, based on the similarity between the old and the new situation, is therefore of great significance. This describes the heart of TiMBL: the ability to learn classification tasks from previously seen examples.

The software was jointly developed by the Induction of Linguistic Knowledge (ILK) Research Group of the Department of Communication and Information Sciences at Tilburg University, The Netherlands, and the CNTS - Language Technology Group of the Department of Linguistics at the University of Antwerp, Belgium. The complete C++ source code was released under the GNU General Public License² (GNU GPL) as published by the Free Software Foundation³. Its

¹ http://ilk.uvt.nl/timbl/
² http://www.gnu.org/copyleft/gpl.html
³ http://www.fsf.org/