A Wordnet from the Ground Up

A Wordnet from the Ground Up - School of Information Technology ...

5.3. Lessons Learned

In the end, we took the dictionary of Piotrowski and Saloni (1999) to complete the set of lemmas for the core plWordNet. This was necessary during the semi-automatic expansion of the nominal part of plWordNet (Section 4.5.4). We also briefly experimented with translating into Polish LUs from the upper levels of the nominal hypernymy structure in PWN. This resulted in many artificial LUs – unlexicalised meaning descriptions. We were more successful with a simple kind of “machine translation” applied to the verbal and adjectival parts of PWN. For each lemma occurring in the preselected part of PWN, we added to the list all its translations found in the electronic version of the dictionary of Piotrowski and Saloni (1999). We did not try to disambiguate translations, because we were interested in completing the list of lemmas, which was further processed manually. Moreover, the dictionary is a small pocket dictionary, so it should contain only the most general lemmas, and thus it acts as a kind of filter which eliminates translations for all infrequent LUs.

The weakness of the initial lemma list was problematic when expanding the core plWordNet via the WordNet Weaver (Sections 4.5.3 and 4.5.4): missing hypernyms – LUs from the upper parts of the hypernym structure – affected the accuracy of the algorithm.

The assumption that the LU is the centrepiece of plWordNet (Section 2.1) was not uncontroversial: synsets are key elements in most applications of a wordnet. The resulting structure, however, is not without advantages. The assumption is well motivated by the linguistic tradition and the lexicographic practice. The rules for adding new LUs, instances of semantic relations and, especially, LUs to synsets were systematically defined following the established linguistic tradition, and implemented in the plWordNetApp application (Section 2.4) as automatically filled substitution tests.

Synonymy is an elusive relation, not easily defined, yet it underlies the central notion of a synset. The construction of synsets in many wordnets is thus based on imprecise rules and on references to the extralinguistic properties of LUs. In plWordNet, a synset is defined through the lexico-semantic relations among its members; more precisely, it is the other way round – the similarity of several LUs due to the shared set of lexico-semantic relation targets⁶ makes them candidates for the same synset. In the identification of synsets, hypernymy and meronymy have been distinguished as defining the structure (Section 2.1). This means that plWordNet (and generally any wordnet designed according to our method of defining synsets) is a network of LUs connected by lexico-semantic relations. A synset is in this case just a “shortcut” for the fact that two or more LUs share the same relations. The structure of plWordNet is based on the lexico-semantic relations among LUs, which are well established in linguistics and for which substitution tests are well known, so the linguists are likely to make highly consistent decisions.

⁶ For a LU x, all LUs in some lexico-semantic relation with x are such targets.
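The idea that a synset is a "shortcut" for LUs sharing the same relation targets can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the procedure used in plWordNet or plWordNetApp: the triple format, the function names, the toy data and the use of exact set equality (where the text speaks more loosely of similarity, with linguists making the final decision) are all hypothetical.

```python
from collections import defaultdict

# Relations treated as structure-defining, following the text
# (hypernymy and meronymy). The string labels are an assumption.
STRUCTURAL_RELATIONS = frozenset({"hypernymy", "meronymy"})


def relation_targets(triples, relations=STRUCTURAL_RELATIONS):
    """Map each source LU to the frozenset of (relation, target LU) pairs it has.

    A triple (x, r, y) is read as: y is a target of relation r for LU x.
    Relations outside `relations` are ignored.
    """
    targets = defaultdict(set)
    for source, relation, target in triples:
        if relation in relations:
            targets[source].add((relation, target))
    return {lu: frozenset(pairs) for lu, pairs in targets.items()}


def synset_candidates(triples, relations=STRUCTURAL_RELATIONS):
    """Group LUs whose sets of relation targets are identical.

    Exact equality is a simplifying assumption made for this sketch.
    Only groups with at least two LUs are returned, sorted for stable output.
    """
    groups = defaultdict(list)
    for lu, signature in relation_targets(triples, relations).items():
        groups[signature].append(lu)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


# Hypothetical toy data, not taken from plWordNet.
TRIPLES = [
    ("car", "hypernymy", "vehicle"),
    ("car", "meronymy", "engine"),
    ("automobile", "hypernymy", "vehicle"),
    ("automobile", "meronymy", "engine"),
    ("bicycle", "hypernymy", "vehicle"),
    ("bicycle", "meronymy", "pedal"),
    ("car", "other_relation", "driver"),  # ignored: not structure-defining
]

if __name__ == "__main__":
    print(synset_candidates(TRIPLES))
```

On this toy data, "car" and "automobile" share both targets and are proposed as candidates for one synset, while "bicycle" shares only the hypernym and stays separate.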
