
A Wordnet from the Ground Up


164 Chapter 4. Extracting Relation Instances

introduced mainly for comparison with other approaches (but it is unnatural from the point of view of the linguists' work);
• Best P ≥1 – one closest attachment site is evaluated (similarly to the P ≥1 in Sec. 4.5.4).

Table 4.6 presents the results with a distinction between strong fitness (marked S) and weak fitness (W). As expected, the accuracy of suggestions based on strong fitness is significantly higher than for weak fitness. Because of the intended use, we assumed that not only direct hits are useful: if the proposal is close enough to the correct place in the plWordNet structure, then it is also a valuable suggestion. The same applies if there is meronymy or holonymy between the suggested and the correct synset.

The results are encouraging. Almost half of the suggestions based on strong fitness are in close proximity to the correct place in the wordnet structure. If making only one suggestion were required, the accuracy would be boosted to 73.58%. For our goal, this is an artificial constraint, but it shows how well the AAA algorithm would behave in a fully unsupervised setting. Our ultimate goal, though, is to create a tool that supports a linguist's work, so the result for the Best P ≥1 strategy shows more meaningful data: for how many words there is at least one useful suggestion. The AAA algorithms suggested at least one strictly correct attachment site for 42.81% of the words, or for 81.96% of the words if we also count close proposals as useful.

Comparison to other ways of automatically expanding a wordnet can be misleading, because our primary goal was to construct a tool that facilitates and streamlines the linguists' work. Still, even if we compare our automatic evaluation with the results in (Widdows, 2003) during comparable tests on the PWN, our results seem better. For example, we had 34.96% for the highest-scored proposal (One S+W in Table 4.6), while Widdows reports a best accuracy of 15% for "correct classifications in the top 4 places" (among the 4 highest-scored proposals). Our comparable result for the top 5 proposals is even higher, 42.81%. The best results reported by Alfonseca and Manandhar (2002) and Witschel (2005) are also at the level of about 15%, but were achieved in tests on a much smaller scale; Witschel also performed tests only in two selected domains. The algorithm of Snow et al. (2006), contrary to ours, can be applied only to probabilistic evidence.

We made two assumptions: attachment based on the activation area, and the simultaneous use of multiple knowledge sources. These assumptions appear to have been successful in boosting the accuracy above the level of the MSR-only decisions (which is roughly represented in our approach by weak fitness).

WNW seems to improve the linguist's efficiency considerably, but longer observation is necessary for a reliable justification.

The AAA algorithm is overburdened with parameters. Further research is required to find either a simplified form or an effective method of parameter optimization.
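The Best P ≥1 style of evaluation described above can be sketched in a few lines: for each test word we check whether at least one of its proposed attachment sites is strictly correct, or at least "close" to the correct synset in the wordnet structure. This is a minimal illustrative sketch, not the thesis implementation; the function name, data layout, and the closeness predicate are assumptions made for the example.

```python
def best_p_geq_1(suggestions, gold, is_close):
    """Hypothetical sketch of a Best P>=1-style evaluation.

    suggestions: dict mapping a word to a list of proposed synset ids
    gold:        dict mapping a word to its correct synset id
    is_close:    predicate (proposed, correct) -> bool, true when the
                 proposal is near the correct place in the structure
    Returns (strict accuracy, accuracy counting close hits as useful).
    """
    strict = close = 0
    for word, proposed in suggestions.items():
        correct = gold[word]
        if correct in proposed:
            strict += 1
        if any(p == correct or is_close(p, correct) for p in proposed):
            close += 1
    n = len(suggestions)
    return strict / n, close / n

# Toy example: one word gets a strictly correct proposal, the other
# only a proposal that we pretend is close in the wordnet graph.
sugg = {"kot": ["animal.n.01", "feline.n.01"], "pies": ["artifact.n.01"]}
gold = {"kot": "feline.n.01", "pies": "dog.n.01"}
near = {("artifact.n.01", "dog.n.01")}  # assumed closeness relation
strict_acc, close_acc = best_p_geq_1(sugg, gold, lambda a, b: (a, b) in near)
# strict_acc is 0.5, close_acc is 1.0
```

In a real setting the `is_close` predicate would be a graph-distance check over hypernymy (plus meronymy/holonymy) links, matching the relaxed notion of a "useful" suggestion used in the evaluation.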
