06.08.2015 Views

A Wordnet from the Ground Up

A Wordnet from the Ground Up - School of Information Technology ...


152 Chapter 4. Extracting Relation Instances

3. Remove from synAtt(y) all S such that maxScore(x, S) < minMSR
4. Return the top maxSens subgraphs from synAtt(x) according to their maxScore values; in each, mark the synset S with the highest score(x, S)

The radius r was set to 2 (Phase I), because we observed that none of the extraction methods used can distinguish between direct hypernyms and merely close hypernyms. The δ function is a means of non-linear quantisation from the strength of evidence to the decision. We require more yes votes for larger synsets and fewer votes for smaller synsets, but more than one ‘full vote’ must always be given, that is, more than one synset member must vote yes. The parameter h of the δ template relates the function to what is considered to be a ‘full vote’. For weak fit, h is set to the value which signals a very high relatedness for the MSR used.

In Phase II we identify continuous areas (connected subgraphs) in the hypernymy graph which fit the new lemma x. For each area we find the local maximum of the score function for x. We keep all subgraphs in which the synset with the maximum score is based on the strong fit (the detail omitted above). From those based on the weak fit, we only keep the subgraphs above some heuristic threshold of the reliable MSR result. We also save for the linguists only a limited number of the best-scoring subgraphs (maxSens = 5; it can be a parameter of the application).
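The Phase II selection can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`connected_areas`, `select_attachment_areas`), the data layout (a dictionary from synset identifier to a `(score, kind)` pair, and a list of `(hyponym, hypernym)` edges) and the example values are all hypothetical. The sketch also assumes one particular reading of the text: areas whose best synset is a strong fit are always kept, and weak-fit areas that pass the threshold fill the remaining places up to `max_sens`.

```python
from collections import defaultdict


def connected_areas(fitting, edges):
    """Group fitting synsets into connected subgraphs (edge direction ignored)."""
    neighbours = defaultdict(set)
    for hypo, hyper in edges:
        if hypo in fitting and hyper in fitting:
            neighbours[hypo].add(hyper)
            neighbours[hyper].add(hypo)
    seen, areas = set(), []
    for start in sorted(fitting):
        if start in seen:
            continue
        stack, area = [start], set()
        while stack:  # depth-first walk over one area
            node = stack.pop()
            if node in area:
                continue
            area.add(node)
            stack.extend(neighbours[node] - area)
        seen |= area
        areas.append(area)
    return areas


def select_attachment_areas(fit, edges, min_msr, max_sens=5):
    """fit maps synset id -> (score, kind), with kind in {"strong", "weak"}.

    Returns a list of (best_synset, area) pairs, best-scoring first.
    """
    strong, weak = [], []
    for area in connected_areas(set(fit), edges):
        # Local maximum of the score within this area (ties broken by id).
        best = max(sorted(area), key=lambda s: fit[s][0])
        score, kind = fit[best]
        if kind == "strong":
            strong.append((score, best, area))
        elif score >= min_msr:  # weak fit must pass the threshold
            weak.append((score, best, area))
    strong.sort(key=lambda c: c[0], reverse=True)
    weak.sort(key=lambda c: c[0], reverse=True)
    # Assumption: strong-fit areas are never cut; weak ones fill what is left.
    slots = max(0, max_sens - len(strong))
    return [(best, area) for _, best, area in strong + weak[:slots]]


# Hypothetical scores for a new lemma x against five synsets.
example_fit = {
    "a": (0.9, "strong"),
    "b": (0.5, "weak"),
    "c": (0.3, "weak"),
    "d": (0.8, "weak"),
    "e": (0.1, "weak"),
}
example_edges = [("b", "a"), ("d", "c")]  # (hyponym, hypernym)
result = select_attachment_areas(example_fit, example_edges, min_msr=0.4)
```

Here the isolated synset `e` forms its own area and is dropped because its weak-fit score is below the hypothetical threshold, while the two remaining areas are returned with `a` and `d` marked as their local maxima.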
We do present all subgraphs whose top synset fit is based on the strong fit.

The WordNet Weaver application

The WordNet Weaver [WNW] is an expansion of plWordNetApp (Piasecki and Koczan, 2007), a wordnet editor developed for the plWordNet project and used in its construction (Section 2.4). A separate screen groups most of the added user-perceived functionality; see Fig. 4.8. A linguist sees a list of new lemmas (not yet in the wordnet). A user-selected new lemma u is shown as a green oval. The existing LUs that u fits appear in yellow, orange, red and vivid purple, in increasing order of fitness score. Strong and weak fit are also distinguished by shape: octagon and rectangle, respectively. All fitting synsets, together with hypo/hypernymy links (arrows point to the hypernyms), are initially visible to the user: this presents the context of every attachment decision made by the system. Only one local maximum per connected hypernymy subgraph is marked by a blue border. Local maxima, the proposed attachment centres, are graphically linked with the green oval of the new lemma (by lines ending with small circles).

The linguist can select any synset present on the screen and then choose a type of lexico-semantic relation, including synonymy, by which it will be associated with the new lemma. A wrong proposal can be rejected, too; in that case, the linguist is asked to select a type and a possible cause of the error. Adding and rejecting removes the circle-
