06.08.2015 Views

A Wordnet from the Ground Up

A Wordnet from the Ground Up - School of Information Technology ...


58 Chapter 3. Discovering Semantic Relatedness

differentiate the LUs in a synset from all other LUs that are similar but not synonymous, among them co-hyponyms. Any such MSR must therefore distinguish closely related LUs, not only those with very different meanings.

In modifying the WBST+H test we assumed that we needed to construct the answer set A so that the non-synonyms are closer in meaning to the correct answer $a_i$ than is the case in WBST+H. Obviously, they cannot be synonyms of either $a_i$ or Q, but they ought to be related to both. We need to select the non-synonyms among LUs similar to s and to Q. To achieve this, we decided to leverage the structure of the wordnet in determining similarity, and to construct a semantic similarity function $SSF_{WN}$ based on the plWordNet hypernymy structure:

$$SSF_{WN} : S \times L \to R \qquad (3.2)$$

where S is a set of synsets, L is a set of lexical units, and R is the set of real numbers. $SSF_{WN}$ takes a synset S (e.g. one including Q and $a_i$) and a lexical unit x (e.g. a detractor), and returns the semantic similarity value.

During the generation of the modified Enhanced WBST [EWBST], non-synonyms are still selected at random, but only from the set of LUs broadly similar to Q and $a_i$. The acceptable values of $SSF_{WN}(S_Q, x)$ are lower than some threshold $sim_t$ if the synset $S_Q$ contains Q and $a_i$, and x is a detractor.
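The selection step described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `pick_detractors`, the parameter `k` (number of detractors per question) and the toy scores are hypothetical, and the similarity function is passed in as a plain callable standing in for SSF_WN.

```python
import random


def pick_detractors(candidates, synset, ssf, sim_t, k, rng=None):
    """Randomly pick k detractors from the LUs that pass the threshold.

    candidates: iterable of lexical units (hypothetical toy data here)
    synset:     set of LUs containing Q and the correct answer a_i
    ssf:        callable (synset, x) -> float, standing in for SSF_WN
    sim_t:      threshold; values below it are acceptable, as in the text
    """
    rng = rng or random.Random()
    # Members of the synset are synonyms of Q and a_i, so they are excluded.
    pool = [x for x in candidates if x not in synset and ssf(synset, x) < sim_t]
    if len(pool) < k:
        raise ValueError("not enough sufficiently similar LUs to choose from")
    return rng.sample(pool, k)


# Toy values for illustration only; they are not taken from plWordNet.
toy_scores = {"cat": 0.3, "wolf": 0.2, "fox": 0.25, "oak": 0.9, "car": 1.0}
toy_synset = {"dog", "hound"}
chosen = pick_detractors(
    list(toy_scores) + ["hound"],
    toy_synset,
    lambda s, x: toy_scores[x],
    sim_t=0.5,
    k=2,
    rng=random.Random(0),
)
```

The choice stays random, as in the original test, but the pool is restricted first, so every detractor is guaranteed to lie within the threshold.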
We tested several wordnet-based similarity functions (Agirre and Edmonds, 2006), here implemented using plWordNet’s hypernymy structure, and achieved the best result in a generated test with the following function:

$$SSF_{WN} = \frac{p_{min}}{2d} \qquad (3.3)$$

where $p_{min}$ is the length of a minimal path between two LUs in plWordNet, and d is the maximal depth of the hypernymy hierarchy in the current version of plWordNet. The similarity threshold $sim_t = 2$ for this function has been established experimentally. To achieve consistency between tests generated from different versions of plWordNet, we decided to set $sim_t$ to the value corresponding to four arcs in the hypernymy hierarchy.

The hypernymy structure of nouns in plWordNet does not have a single root, because in plWordNet we have not introduced any artificial common root nodes for all nominal LUs.⁴ Many methods of similarity computation require a root, however, so we have introduced a virtual one for the sake of the similarity computation, and linked to it all trees in the hypernymy forest.

We noticed that the random selection of LU detractors based on any similarity measure tends to favour LUs in the hypernymy subtrees other than Q, if Q is located near the root. The number of LUs linked by a short path across the root is much higher than

⁴ The same is the case for verbal and adjectival LUs, whose hypernymy structures are also partial and quite shallow.
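To make equation 3.3 and the virtual root concrete, here is a minimal sketch on a toy hypernymy forest. Several things are assumptions rather than facts from the text: the node names, the `VIRTUAL_ROOT` label, treating hypernymy arcs as undirected when measuring paths, and measuring d as the largest number of arcs from the virtual root. The last line shows one possible reading of "the value corresponding to four arcs".

```python
from collections import deque

VIRTUAL_ROOT = "<virtual-root>"  # hypothetical label, not a plWordNet identifier


def build_graph(hypernyms):
    """Return an undirected adjacency map with all tree tops linked to a virtual root.

    hypernyms maps each node to a list of its direct hypernyms
    (an empty list marks the top of a tree in the forest).
    """
    nodes = set(hypernyms)
    for parents in hypernyms.values():
        nodes.update(parents)
    adj = {node: set() for node in nodes}
    adj[VIRTUAL_ROOT] = set()
    for node in nodes:
        # A node without a hypernym is a tree top: attach it to the virtual root.
        parents = hypernyms.get(node) or [VIRTUAL_ROOT]
        for parent in parents:
            adj[node].add(parent)
            adj[parent].add(node)
    return adj


def distances_from(adj, start):
    """Breadth-first search: number of arcs from start to each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def ssf_wn(adj, a, b, d):
    """Equation 3.3: minimal path length between a and b, divided by 2d."""
    p_min = distances_from(adj, a)[b]
    return p_min / (2 * d)


# Toy forest with two trees (illustrative data only).
toy = {
    "dog": ["animal"], "cat": ["animal"], "animal": [],
    "oak": ["tree"], "tree": ["plant"], "plant": [],
}
adj = build_graph(toy)
# Assumption: depth is counted in arcs from the virtual root.
d = max(distances_from(adj, VIRTUAL_ROOT).values())
same_tree = ssf_wn(adj, "dog", "cat", d)    # path dog-animal-cat, 2 arcs
across_root = ssf_wn(adj, "dog", "oak", d)  # path crosses the virtual root, 5 arcs
four_arc_threshold = 4 / (2 * d)            # one reading of the four-arc setting
```

Because every tree top hangs off the virtual root, a path exists between any two nodes, which is what lets the function return a value for LUs from different trees.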
