A Wordnet from the Ground Up

A Wordnet from the Ground Up - School of Information Technology ...


4.5. Hybrid Combinations

seem low but the number of possible attachments is set to a rather high 5 – to show the linguist more extracted senses – which impairs precision. The measure P≥1 also shows that at least one proper attachment area was identified for the majority of LUs. Even for the worst random sample, proposals for 59.12% of new lemmas were found worth examining, as they include helpful suggestions. The numbers do not show how the tool can inspire the user, draw her attention to less obvious or domain-dependent senses, reveal peculiarities in the wordnet state and so on.

L    All S   All W   All S+W   One S   One W   One S+W   Best P≥1
0    26.65    7.90    16.46    45.80   16.24    34.96     42.81
1    35.76   14.50    24.21    58.73   28.96    47.81     61.19
2    42.87   21.39    31.20    67.69   40.51    57.72     75.02
3    48.31   27.36    36.93    73.58   51.08    65.33     81.96
5    53.52   34.78    43.34    78.46   58.51    71.14     86.18
6    57.59   43.59    49.99    81.52   64.58    75.31     89.82
7    61.09   49.90    55.01    83.56   70.45    78.75     91.16
8    63.38   53.71    58.13    84.47   73.58    80.47     92.49
9    65.27   56.55    60.53    85.03   75.73    81.62     93.12
10   66.07   58.86    62.16    85.26   78.28    82.70     93.54

Table 4.6: The accuracy [%] of plWordNet reconstruction; L – the distance from the original synset, S and W mean strong and weak fitness, respectively

WordNet reconstruction

In the automatic evaluation, we wanted to check the ability of the AAA algorithm to reconstruct parts of plWordNet. The method is meant to expand the existing core structure of a wordnet, so we identified 1527 LUs in the lower parts of the hypernymy structure as a basis for the evaluation. In order to introduce as little bias as possible, 10 LUs were removed from the plWordNet structure in one step of the evaluation. The C_H classifier component was trained without the removed LUs and the AAA algorithm was run to attach the processed LUs.

There are many synsets in plWordNet with a single LU. This makes the evaluation of LUs in such synsets problematic. If we removed singleton synsets, we would artificially – and dramatically – alter the overall structure of plWordNet and so introduce an unwanted bias. That is why we decided to remove only the LUs and to leave empty synsets in the modified plWordNet.

We assumed three strategies for evaluating the AAA algorithm's proposals:

• All – all proposals are evaluated;
• One – only the single highest-scoring attachment site is evaluated; this strategy was
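The hold-out procedure described above (ten LUs removed per evaluation step, with emptied synsets left in place) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `holdout_batches` and `strip_lus`, the dictionary representation of synsets, and the toy data are all assumptions made for illustration. Retraining the C_H classifier and running the AAA algorithm on each batch are not shown.

```python
from typing import Dict, Iterator, List, Sequence, Set


def holdout_batches(lus: Sequence[str], size: int = 10) -> Iterator[List[str]]:
    """Split the evaluation LUs into consecutive batches of `size` items.

    Hypothetical helper: the text only says that 10 LUs are removed in one
    step; how the LUs are ordered or grouped is an assumption here.
    """
    for start in range(0, len(lus), size):
        yield list(lus[start:start + size])


def strip_lus(synsets: Dict[str, Set[str]], removed: Sequence[str]) -> Dict[str, Set[str]]:
    """Return a copy of the synset -> LUs mapping without the removed LUs.

    Synsets that lose all their LUs are kept as empty sets, so the overall
    structure is not altered by the removal.
    """
    gone = set(removed)
    return {synset_id: members - gone for synset_id, members in synsets.items()}


# Toy data (invented for illustration): one singleton synset, one with two LUs.
toy_synsets = {"s1": {"kot"}, "s2": {"pies", "kundel"}}
reduced = strip_lus(toy_synsets, ["kot", "pies"])

# 1527 evaluation LUs in batches of 10.
steps = list(holdout_batches([f"lu{i}" for i in range(1527)]))
```

With batches of ten, the 1527 evaluation LUs give 153 steps, the last holding the remaining seven LUs. In the toy data the singleton synset `s1` survives as an empty set rather than being deleted.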
