06.08.2015 Views

A Wordnet from the Ground Up


Chapter 4. Extracting Relation Instances

Kennedy (2006) analysed several modified versions of this method. The modifications concerned the data in the training/test corpus (varying the positive-to-negative pair ratio and the method of undersampling negative examples) and small differences in the way of formatting dependency paths. An additional classifier based on a version of the Support Vector Machines algorithm (Joachims, 2002) was applied too, achieving the best F-score of 0.633 for a combination of a classifier and filtering based on Roget’s Thesaurus.

Zhang et al. (2006) explored different types of syntactic dependencies at different levels of granularity in the construction of classifiers to find occurrences of relationships between named entities. Five main kinds of relationships with 24 different subtypes were considered. This approach is broadly similar, but the different objective makes a comparison of the results difficult.

ML methods of extracting hypernymy pairs usually take lexico-syntactic features directly to build a classifier. Tens of thousands of features are typical, each carrying very sparse information. Most of this information “tells” the classifier about various aspects of semantic relatedness. Features that point to specific lexico-semantic relations are rare. Section 3.4.5 notes that near-synonyms and close hypernyms/hyponyms of an LU u would be expected close to the top of the list of LUs most semantically related to u, generated by a good MSR. An application of a syntactic analyser is also assumed: a deep parser in (Zhang et al., 2006) or a shallow dependency parser in (Snow et al., 2005; Kennedy, 2006). For many languages such tools are not available yet.

We propose to extract hypernymy pairs by relaxing both assumptions.
There are two phases (Piasecki et al., 2008):

1. extract the generic relation of semantic relatedness modelled by some MSR,
2. identify hypernymy instances (pairs of LUs) from the MSR’s results.

The first phase can use all kinds of information that describes the semantics of LUs, depending on the MSR extraction method. The second phase concentrates on groups of semantically related LUs and applies specialised tests that distinguish specific lexico-semantic relations as subtypes of semantic relatedness. The tasks of the first phase are preliminary filtering and problem complexity reduction, so during the second phase a broader variety of ML methods can be used. An MSR of good accuracy can (by way of its high values) associate LUs that extremely rarely occur close by in the corpus at hand. Note that such occurrences are the precondition for any pattern-based method. MSRs condense information otherwise distributed among many lexico-syntactic patterns; in phase 2 we can concentrate on the most promising pairs.

The only assumption is the availability of a highly accurate MSR. During experiments we used an MSR based on the Rank Weight Function transformation [MSR_RWF], an earlier version of the Generalised RWF presented in Section 3.4.4. MSR_RWF dif-
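The two-phase scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`top_k_related`, `extract_hypernymy`), the cut-off `k`, the toy relatedness scores and the hand-made hypernymy test are all assumptions introduced only for illustration. In practice the first phase would be served by a real MSR and the second by a trained classifier.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical type: an MSR maps a pair of lexical units (LUs) to a relatedness score.
MSR = Callable[[str, str], float]
# Hypothetical type: a phase-2 test deciding whether (hyponym, hypernym) holds.
PairTest = Callable[[str, str], bool]


def top_k_related(u: str, vocabulary: Iterable[str], msr: MSR, k: int) -> List[str]:
    """Phase 1: return the k LUs most related to u according to the MSR."""
    candidates = [v for v in vocabulary if v != u]
    # Stable sort, highest relatedness first.
    candidates.sort(key=lambda v: msr(u, v), reverse=True)
    return candidates[:k]


def extract_hypernymy(
    vocabulary: Iterable[str], msr: MSR, is_hypernymy: PairTest, k: int
) -> List[Tuple[str, str]]:
    """Phase 2: keep only those highly related pairs that pass the hypernymy test."""
    vocab = list(vocabulary)
    pairs = []
    for u in vocab:
        for v in top_k_related(u, vocab, msr, k):
            if is_hypernymy(u, v):
                pairs.append((u, v))
    return pairs


# Toy data (invented for this example): symmetric relatedness scores.
TOY_SCORES = {
    frozenset(("dog", "animal")): 0.80,
    frozenset(("cat", "animal")): 0.75,
    frozenset(("dog", "cat")): 0.70,
    frozenset(("dog", "car")): 0.10,
    frozenset(("cat", "car")): 0.05,
    frozenset(("animal", "car")): 0.02,
}


def toy_msr(u: str, v: str) -> float:
    return TOY_SCORES.get(frozenset((u, v)), 0.0)


# Stand-in for a trained classifier: a hand-made set of (hyponym, hypernym) pairs.
TOY_HYPERNYMS = {("dog", "animal"), ("cat", "animal")}


def toy_test(u: str, v: str) -> bool:
    return (u, v) in TOY_HYPERNYMS


VOCAB = ["dog", "cat", "animal", "car"]
result = extract_hypernymy(VOCAB, toy_msr, toy_test, k=2)
print(result)
```

The sketch shows the filtering role of the first phase: the second-phase test is applied only to the `k` most related LUs of each unit, not to every pair in the vocabulary.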
