06.08.2015 Views

A Wordnet from the Ground Up

A Wordnet from the Ground Up - School of Information Technology ...


4.5. Hybrid Combinations

[...] differs from MSR_GRWF discussed earlier (Section 3.4) only in the transformation applied and in a slightly lower accuracy. A detailed presentation of the applied MSR_RWF can be found in (Piasecki et al., 2007b; Broda et al., 2008).

The second phase begins with the extraction, for the given LU u, of a list S of k LUs most semantically related to u, denoted MSRlist(u, k). Any value of k will do, but we noticed that, for the MSR types we used, the percentage of LUs in a wordnet relation to u begins to deteriorate around k = 20. Next, we need a classifier to select a subset of S that includes near-synonyms and close hypernyms of u.

Instead of using frequencies of lexico-syntactic features collected from a corpus directly as attributes in learning the classifier, we want to identify a set of complex features that can give clues on the relation between two LUs. We intend to apply a kind of knowledge-based, partially linguistically-motivated, transformation of the initial feature space into a new space of reduced complexity: fewer features and maybe condensed information on the LU relations of interest. For a pair of LUs, the values of attributes are calculated prior to training or testing. This is done via co-incidence matrices constructed from large corpora.
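The extraction of MSRlist(u, k) can be illustrated with a minimal sketch. It is not the MSR_RWF measure itself: a plain cosine over raw feature counts stands in for the MSR, and the names `msr_list`, `toy_msr` and the toy matrix `ROWS` are hypothetical, introduced only for this example.

```python
import math
from typing import Callable, Dict, Iterable, List, Tuple

Vector = Dict[str, float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two sparse feature vectors (stand-in for an MSR)."""
    dot = sum(w * b[f] for f, w in a.items() if f in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def msr_list(
    u: str,
    candidates: Iterable[str],
    msr: Callable[[str, str], float],
    k: int = 20,
) -> List[Tuple[str, float]]:
    """Return the k LUs most related to u, as (LU, score) pairs, best first."""
    scored = [(v, msr(u, v)) for v in candidates if v != u]
    # Sort by descending score; break ties alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


# Toy co-incidence rows (assumed data): LU -> counts of lexico-syntactic features.
ROWS: Dict[str, Vector] = {
    "dog": {"bark": 3.0, "furry": 2.0, "loyal": 1.0},
    "puppy": {"bark": 2.0, "furry": 2.0, "small": 1.0},
    "cat": {"furry": 2.0, "meow": 3.0},
    "car": {"fast": 2.0, "red": 1.0},
}


def toy_msr(a: str, b: str) -> float:
    return cosine(ROWS[a], ROWS[b])


top = msr_list("dog", ROWS, toy_msr, k=2)
print([lu for lu, _ in top])
```

The default k = 20 mirrors the point at which the text reports that list quality begins to deteriorate. The resulting list S would then be passed to the classifier that selects near-synonyms and close hypernyms.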
We generally work with the same matrices as in the MSR_RWF construction.

In search for attributes, we drew on clues which can deliver information concerning the specificity of compared nouns, the extent to which they mutually share lexico-syntactic features, topic contexts in which they occur together and, last but not least, the value of their semantic relatedness. We now present the complete list of attributes used (a and b are noun LUs):

1. semantic relatedness MSR(a, b) – the value returned by an MSR_RWF,
2. co-ordination – the frequency of a's and b's co-occurrence in the same coordinate noun phrase,
3. modification by genitive – the frequency of a's modification by b in the genitive form,
4. genitive modifier – the frequency of b's modification by a in the genitive form,
5. precision of adjectival features – the precision of repeating b's adjectival features by the set of a's features (for the calculation method, see formula 4.10 below),
6. recall of adjectival features – the recall of repeating b's adjectival features by the set of a's features (for details, see formula 4.11),
7. precision of modification by genitive – the precision of repeating b's features, which express modification by a specific noun in genitive, by the similar features of a (the calculation method is similar to that in formula 4.10),
