A Wordnet from the Ground Up


Chapter 4. Extracting Relation Instances

It seems helpful to allow merging such patterns, perhaps like this: (hypo:subst:case1) i inny (hyper:subst:case1). The results for ESP- and EST-, where there are no such strict constraints, suggest some increase in recall. Another way, much more complicated, is to enrich the pattern representation, so that additional syntactic information (at least about nominal LUs) can be used.

The list of acquired instances cannot be directly imported into plWordNet. First of all, the list is flat: there is no information on synsets. The percentage of erroneous LU pairs on the lists (such as 63% for EST+nm) is too high to trust the list as a source for automatic expansion of the plWordNet hypernymy structure. Also, many positive LU pairs in fact represent quite remote hypernymic links.

These observations show the drawbacks, but there are also pluses. EST+nm extracted 3700 hypernymic LU pairs (37% of the 10000 LU pairs). This information can be combined with MSR_GRWF, producing higher values for wordnet relation instances. The MSR alone does not say what kind of relation made two LUs closely semantically related. The information acquired by Estratto sheds light on this issue. Section 4.5.3 presents a fairly successful algorithm based on this reasoning. A manual comparison of the LU pairs extracted by Estratto and by the three manual patterns reveals that the two sets are disjoint to some extent. We noted earlier that manual patterns are more expressive and can find hypernymic instances in language constructions which are inaccessible to the present Estratto patterns.
This can be changed in future extensions of Estratto, but for now we used both types of patterns in the hybrid algorithm of plWordNet expansion in Section 4.5.3.

4.5 Hybrid Combinations: Patterns, Distributional Semantics and Classifiers

We noted at the end of Section 3.4.5 that Measures of Semantic Relatedness [MSRs] can recognize semantically related LUs with an accuracy approaching human performance. Still, MSRs produce lists of the k LUs most semantically related to the given LU x [MSRlist(x,k)] with few instances of wordnet relations, and they cannot distinguish the direction of a relation. We named two ways of compensating for these drawbacks: introduce a classifier operating on MSRlist(x,k), capable of differentiating relations, or combine an MSR with other sources of knowledge, including lexico-syntactic patterns or the existing wordnet structure. This subsection will examine both possibilities.
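The second possibility, combining an MSR with pattern evidence, can be illustrated with a minimal Python sketch. It is not the algorithm of Section 4.5.3: the function names (`msr_list`, `directed_candidates`), the toy vocabulary, the relatedness scores and the pattern-extracted pairs are all hypothetical, chosen only to show how an undirected MSRlist(x,k) might be filtered and given a direction by (hyponym, hypernym) pairs of the kind Estratto or manual patterns extract.

```python
from typing import Callable, List, Set, Tuple

Pair = Tuple[str, str]  # (hyponym, hypernym), as extracted by a pattern


def msr_list(x: str, k: int, vocabulary: List[str],
             msr: Callable[[str, str], float]) -> List[str]:
    """Return the k LUs most related to x according to msr, i.e. MSRlist(x,k)."""
    candidates = [y for y in vocabulary if y != x]
    return sorted(candidates, key=lambda y: msr(x, y), reverse=True)[:k]


def directed_candidates(x: str, k: int, vocabulary: List[str],
                        msr: Callable[[str, str], float],
                        pattern_pairs: Set[Pair]) -> List[Pair]:
    """Keep only LUs from MSRlist(x,k) that a pattern also linked to x.

    The MSR supplies relatedness but no relation type or direction;
    the pattern pair supplies both.
    """
    result = []
    for y in msr_list(x, k, vocabulary, msr):
        if (x, y) in pattern_pairs:
            result.append((x, y))   # x is the hyponym of y
        elif (y, x) in pattern_pairs:
            result.append((y, x))   # y is the hyponym of x
    return result


# Hypothetical toy data (assumptions, not values from the book).
VOCAB = ["dog", "cat", "animal", "dachshund", "car"]
SCORES = {
    ("dog", "cat"): 0.9,
    ("dog", "animal"): 0.8,
    ("dog", "dachshund"): 0.7,
    ("dog", "car"): 0.1,
}
PATTERN_PAIRS = {("dog", "animal"), ("dachshund", "dog"), ("car", "vehicle")}


def toy_msr(a: str, b: str) -> float:
    """Symmetric lookup in the toy score table; unknown pairs score 0."""
    return SCORES.get((a, b), SCORES.get((b, a), 0.0))


if __name__ == "__main__":
    print(msr_list("dog", 3, VOCAB, toy_msr))
    print(directed_candidates("dog", 3, VOCAB, toy_msr, PATTERN_PAIRS))
```

In this toy setting "cat" is the LU most related to "dog", yet it is dropped because no pattern links the two, while "animal" and "dachshund" are kept with opposite directions. A real combination would also have to cope with the high error rate of the extracted pairs noted above, which a simple set-membership test ignores.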
