06.08.2015 Views

A Wordnet from the Ground Up

A Wordnet from the Ground Up - School of Information Technology ...


Chapter 3. Discovering Semantic Relatedness

During the experiments performed by Broda et al. (2008), a linear combination of separate matrices, that is, a linear combination of two MSRs, gave better results than the joint matrix ANmod+AAdv+AA. However, as the issue of extracting MSRs on the basis of the combination of separate matrices still requires more in-depth research, we do not present here a repeated experiment of this kind.

The results of the manual evaluation of the constraints for nominal LUs, presented in (Piasecki and Radziszewski, 2009), appear in Table 3.6. For each constraint template and the appropriate list of lexical elements, the total number of matches in IPIC was calculated, and on that basis a sample of matches was randomly drawn. Each match of the lexicalised morphosyntactic constraint in the sample was extracted as a triple: the sentence, the described LU and the lexical elements. The positions of both expressions in the sentence were marked. The task of the evaluator (one of the co-authors) was to analyse whether the relation described by the constraint holds for the given pair in the given sentence.
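The sampling and scoring procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the text here does not spell out the sizing formula, so Cochran's formula with a finite population correction is assumed, and the 5% margin of error, the function names and the example match counts are all hypothetical.

```python
import math
import random


def cochran_sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Sample size for estimating a proportion in a finite population.

    Assumptions (not from the source): Cochran's formula, z = 1.96 for a
    95% confidence level, a 5% margin of error, and the most conservative
    variability p = 0.5.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2   # size for an infinite population
    n = n0 / (1 + (n0 - 1) / population)        # finite population correction
    return min(population, math.ceil(n))


def draw_sample(matches, margin=0.05, seed=0):
    """Randomly draw a sample of constraint matches for manual evaluation."""
    rng = random.Random(seed)                   # fixed seed for reproducibility
    return rng.sample(matches, cochran_sample_size(len(matches), margin))


def precision(judgements):
    """Percentage of sampled matches judged correct by the evaluator."""
    return 100.0 * sum(judgements) / len(judgements)


if __name__ == "__main__":
    matches = list(range(10000))                # hypothetical match identifiers
    sample = draw_sample(matches)
    print(len(sample))
    print(precision([True, True, False, True]))
```

Each element of the drawn sample would correspond to one triple (sentence, described LU, lexical elements), and the evaluator's yes/no judgements would feed `precision`.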
The sample sizes were chosen according to the method described in (Israel, 1992), in such a way that the results of the sample evaluation can be ascribed to the whole set with a 95% confidence level.

Constraints     AdjC    NcC     NmgC    VsbC
Precision [%]   97.39   67.78   92.36   80.36

Table 3.6: The accuracy of the lexico-morphosyntactic constraints

As one could expect, the highest accuracy was achieved for the AdjC constraint, which is based strongly on agreement. The tagger caused the majority of the errors. In some cases an adjective located between two nouns with the same values of the analysed grammatical categories was mistakenly associated with the wrong noun. The good result of NmgC was to a large extent artificially increased by the aforementioned loose definition of the genitive nominal modifier assumed in NmgC and its evaluation. For example, we did not distinguish genitive arguments of a gerund which modifies the head from the proper genitive modifiers of the head. Still, it is worth noting that we have achieved relatively good results of subject identification using a fairly simple constraint mechanism, VsbC.

As the majority of constraints for verbal and adjectival LUs are symmetrical or very similar to those for nominal LUs, we expect similar accuracy.

3.4.4 Transformation based on rank weighting

In the co-incidence matrix constructed in step 2 (Section 3.4.2, p. 65) as a result of the general MSR extraction process, each LU is described by a vector of features that
