A Wordnet from the Ground Up

A Wordnet from the Ground Up - School of Information Technology ...


182 Chapter 5. Polish WordNet Today and Tomorrow

• evaluation of the concept of linguistically motivated wordnet structure,
• further development of the algorithms of automatic extraction of semantic relations and the methods of semi-automated wordnet construction.

It may not be enough to justify the assumed wordnet model by analytical consideration, for example in comparison to the psychologically oriented concept of the Princeton WordNet. Sections 2 and 5.3 offer some discursive support. What is needed, clearly, is practice. Several Polish universities have been granted free research licences. The plWordNet web pages (plwordnet.pwr.wroc.pl) have had about 12000 visitors (based on unique IP addresses, more than 180000 visits). The real test is yet to come: a range of experiments in various applications of plWordNet.

We made a first step ourselves: we ran several Word Sense Disambiguation algorithms on Polish using plWordNet (Baś, 2008, Baś et al., 2008). 13 lemmas representing 54 LUs altogether were selected in such a way that the subsequent lemmas pose different problems with respect to hyponymy and polysemy. A small training/test subcorpus was collected, including sentences which represent different senses of the lemmas. The results are very promising in spite of the fine-grained sense distinction observed for several lemmas. Much more is needed. We plan to work on plWordNet, and we will actively publicise the system. We offer free research licences to anyone who has a research plan that includes plWordNet.

Our main wordnet development tool, the WordNet Weaver [WNW], works only with the hypernymy structure. It allows for editing synsets and hypernymic links while adding new lemmas to plWordNet. The hypernymy structure is necessarily shallower for adjectival and verbal LUs, so one should leverage all types of links between synsets and LUs in order to collect evidence for the most appropriate attachment point. Lexico-semantic relations other than hypernymy can also be beneficial for expanding the nominal part of plWordNet.

In WNW, any change in the lexico-semantic relations other than hypernymy is possible but from the main plWordNetApp, not from the WNW graphical browser. The algorithm of Activation-area Attachment [AAA] very often selects holonyms as possible attachment points. All this is especially limiting for verbs, and makes adding adjectival LUs almost impossible. We plan to enable editing of all types of lexico-semantic relation via WNW graphs.

The present model behind the AAA is heuristic. We plan to investigate its possible generalisations on the basis of the statistical properties of the different evidence and relation graph properties.

Besides WNW, there are also open research questions concerning the work of its components. There is no visible threshold in the values produced by the proposed Measures of Semantic Relatedness [MSR] based on the Rank Weight Function which distinguishes closely related lemmas from other lemmas. We plan to explore the
