A Wordnet from the Ground Up
58 Chapter 3. Discovering Semantic Relatedness

differentiate the LUs in a synset from all other LUs similar but not synonymous, among them co-hyponyms. Any such MSR must therefore distinguish closely related LUs, not only those with very different meaning.

In modifying the WBST+H test we assumed that we needed to construct the answer set $A$ so that non-synonyms are closer in meaning to the correct answer $a_i$ than is the case in WBST+H. Obviously, they cannot be synonyms of either $a_i$ or $Q$, but they ought to be related to both. We need to select the non-synonyms among LUs similar to $s$ and to $Q$. In order to achieve this, we have decided to leverage the structure of the wordnet in the determination of similarity and to construct a semantic similarity function $SSF_{WN}$ based on the plWordNet hypernymy structure:

$$SSF_{WN} : S \times L \to \mathbb{R} \qquad (3.2)$$

where $S$ is a set of synsets, $L$ is a set of lexical units, and $\mathbb{R}$ is the set of real numbers.

$SSF_{WN}$ takes a synset $S$ (e.g. including $Q$ and $a_i$) and a lexical unit $x$ (e.g. a detractor), and returns the semantic similarity value.

During the generation of the modified Enhanced WBST [EWBST], non-synonyms are still selected at random, but only from the set of LUs broadly similar to $Q$ and $a_i$. The acceptable values of $SSF_{WN}(S_Q, x)$ are lower than some threshold $sim_t$ if the synset $S_Q$ contains $Q$ and $a_i$, and $x$ is a detractor.
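The selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `select_detractors`, the number of detractors `k`, the toy scores, and the threshold value are all assumptions made for the example. The similarity function is passed in as a parameter, and lower values are treated as "closer", following the statement that acceptable values lie below $sim_t$.

```python
import random
from typing import Callable, FrozenSet, Iterable, List, Optional

# Hypothetical type for a function SSF: (synset, lexical unit) -> real number.
Ssf = Callable[[FrozenSet[str], str], float]


def select_detractors(
    synset_q: FrozenSet[str],
    candidates: Iterable[str],
    ssf: Ssf,
    sim_t: float,
    k: int = 3,  # assumed number of detractors per question
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Randomly pick k non-synonyms whose SSF value is below sim_t."""
    rng = rng or random.Random()
    # Keep only LUs outside the synset (non-synonyms) that are close enough.
    pool = sorted(
        x for x in candidates
        if x not in synset_q and ssf(synset_q, x) < sim_t
    )
    if len(pool) < k:
        raise ValueError("not enough sufficiently similar non-synonyms")
    return rng.sample(pool, k)


# Toy data (invented): S_Q holds the question word and the correct answer.
S_Q = frozenset({"car", "automobile"})
TOY_SCORES = {
    "automobile": 0.00,
    "truck": 0.10,
    "bus": 0.12,
    "bicycle": 0.20,
    "banana": 0.90,
}


def toy_ssf(synset: FrozenSet[str], x: str) -> float:
    """Stand-in for a wordnet-based function; it just looks up a toy score."""
    return TOY_SCORES[x]


picked = select_detractors(
    S_Q, TOY_SCORES, toy_ssf, sim_t=0.25, k=2, rng=random.Random(0)
)
print(picked)
```

With these toy values, "banana" is never picked because its score is above the threshold, and "automobile" is excluded because it belongs to the synset.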
We tested several wordnet-based similarity functions (Agirre and Edmonds, 2006), here implemented using plWordNet's hypernymy structure, and achieved the best result in a generated test with the following function:

$$SSF_{WN} = \frac{p_{min}}{2d} \qquad (3.3)$$

where $p_{min}$ is the length of a minimal path between two LUs in plWordNet, and $d$ is the maximal depth of the hypernymy hierarchy in the current version of plWordNet. The similarity threshold $sim_t = 2$ for this function has been established experimentally. To achieve consistency between tests generated from different versions of plWordNet, we decided to set $sim_t$ to the value corresponding to four arcs in the hypernymy hierarchy.

The hypernymy structure of nouns in plWordNet does not have a single root, because in plWordNet we have not introduced any artificial common root nodes for all nominal LUs.⁴ Many methods of similarity computation require a root, however, so we have introduced a virtual one for the sake of the similarity computation, and linked to it all trees in the hypernymy forest.

We noticed that the random selection of LU detractors based on any similarity measure tends to favour LUs in hypernymy subtrees other than that of $Q$, if $Q$ is located near the root. The number of LUs linked by a short path across the root is much higher than

⁴ The same is the case for verbal and adjectival LUs, whose hypernymy structures are also partial and quite shallow.
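The path-based function (3.3) and the virtual root can be illustrated with a small sketch. Everything concrete here is an assumption made for the example: the toy hypernymy forest, the names `VIRTUAL_ROOT`, `build_graph`, `distances_from` and `ssf_wn`, the simplification that each node stands for one LU, and the choice to measure the depth $d$ in arcs from the virtual root.

```python
from collections import deque
from typing import Dict, List, Set

VIRTUAL_ROOT = "<root>"  # hypothetical name for the artificial root node

# Toy hypernymy forest (invented): node -> list of direct hypernyms.
HYPERNYMS: Dict[str, List[str]] = {
    "animal": [],
    "dog": ["animal"],
    "cat": ["animal"],
    "poodle": ["dog"],
    "plant": [],
    "tree": ["plant"],
    "oak": ["tree"],
}


def build_graph(hypernyms: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Undirected graph in which every tree top is linked to the virtual root."""
    graph: Dict[str, Set[str]] = {VIRTUAL_ROOT: set()}
    for node, parents in hypernyms.items():
        graph.setdefault(node, set())
        # A node without hypernyms is a tree top: attach it to the virtual root.
        for parent in (parents or [VIRTUAL_ROOT]):
            graph.setdefault(parent, set())
            graph[node].add(parent)
            graph[parent].add(node)
    return graph


def distances_from(graph: Dict[str, Set[str]], source: str) -> Dict[str, int]:
    """Breadth-first search: shortest path length in arcs from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def ssf_wn(graph: Dict[str, Set[str]], a: str, b: str, depth: int) -> float:
    """Path length between a and b divided by 2 * depth, as in (3.3)."""
    return distances_from(graph, a)[b] / (2 * depth)


GRAPH = build_graph(HYPERNYMS)
# Assumption: d is the largest distance from the virtual root to any node.
DEPTH = max(distances_from(GRAPH, VIRTUAL_ROOT).values())
# Value of the function that corresponds to a path of four arcs.
FOUR_ARCS = 4 / (2 * DEPTH)

for pair in [("dog", "cat"), ("poodle", "oak"), ("animal", "plant")]:
    print(pair, ssf_wn(GRAPH, *pair, DEPTH))
```

In this toy forest, "animal" and "plant" sit in different trees yet are only two arcs apart through the virtual root, the same distance as the co-hyponyms "dog" and "cat". This is a small instance of the bias towards other subtrees for LUs located near the root.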