A Wordnet from the Ground Up

A Wordnet from the Ground Up - School of Information Technology ...


Chapter 4. Extracting Relation Instances

for Machine Learning. We manually annotated randomly selected pairs of LUs which occurred on MSR lists (a, 20) for the LUs described by the MSR. From this selection, 1159 pairs classified as not relevant were collected into a set E. In some experiments, we added E to NH, see below.

We experimented with two training sets produced by combining our regular data sets. Test sets were excluded randomly from training sets during tenfold cross-validation. Training sets are named in Table 4.4 according to the following description scheme:

KH_1 + ... + KH_n, NH_1 + ... + NH_m

i.e. first the sets comprising KH are listed, next the sets from NH. The training set H+P2,P3+R includes only pairs extracted from plWordNet. It consists of 5027 KH pairs (H+P2) and 56531 NH pairs (P3+R). Tests on this set were done only on data already present in plWordNet. It is also more difficult than the sets used in (Snow et al., 2005), because the classifier is expected to distinguish between close hypernyms and more indirect hypernymic ancestors (P3 included in NH).

Because plWordNet (the June 2008 version) was still small, the second training set was extended with the set E of manually classified pairs. We added only negative pairs, assuming that positive examples are well represented by pairs from plWordNet, while more difficult negative examples are hidden in the huge number of negative examples automatically extracted from plWordNet.
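The composition scheme above can be sketched in code. This is a minimal illustration, not the authors' implementation: the function name, the toy LU pairs, and the `neg_ratio` parameter (which mimics the negative-example subsampling discussed later in the section) are all hypothetical.

```python
import random

def build_training_set(kh_sets, nh_sets, neg_ratio=None, seed=0):
    """Combine named KH (positive) and NH (negative) pair sets into one
    labelled training set, returning its scheme name "KH_1+...,NH_1+...".
    If neg_ratio is given, NH is randomly subsampled to len(KH) * neg_ratio,
    mimicking the subsampling experiments described in the text."""
    # Set names joined in insertion order, KH sets before NH sets.
    name = "+".join(kh_sets) + "," + "+".join(nh_sets)
    pos = [(pair, "KH") for pairs in kh_sets.values() for pair in pairs]
    neg = [(pair, "NH") for pairs in nh_sets.values() for pair in pairs]
    if neg_ratio is not None:
        random.Random(seed).shuffle(neg)
        neg = neg[: int(len(pos) * neg_ratio)]
    return name, pos + neg

# Toy usage with made-up LU pairs; the real H+P2,P3+R set has
# 5027 KH and 56531 NH pairs (a ratio of roughly 1:10).
name, data = build_training_set(
    {"H": [("pies", "zwierzę")], "P2": [("kot", "ssak")]},
    {"P3": [("kot", "pies")], "R": [("dom", "okno")]},
)
# name is "H+P2,P3+R"; data holds 2 KH and 2 NH labelled pairs.
```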
The second training set consists of 5027 KH (H+P2) and 57690 NH (P3+R+E) pairs.

In the experiments, we used Naïve Bayes (Mitchell, 1997) and two types of decision trees, C4.5 (Quinlan, 1986) and the Logistic Model Tree [LMT] (Landwehr et al., 2003), all in the versions implemented in the Weka system (Witten and Frank, 2005). Naïve Bayes classifiers are probabilistic, C4.5 is rule-based, and LMT combines the rule-based structure of a decision tree with logistic regression in the leaves. In order to facilitate a comparison of classifiers, we performed all experiments on the same training-test data set. Because we selected C4.5 as our primary classifier, and we generated examples from the same corpus (so the frequencies occurring as values of some attributes could be compared directly), we did not introduce any data normalisation or discretisation. The range of data variety was also limited by the corpus used. Applying the same data to the training of a Naïve Bayes classifier biased it towards more memory-based behaviour; given the clear distinctions in the main group of the applied data sets, however, the achieved result was positive, see Table 4.4.

All experiments were run in the Weka environment (Witten and Frank, 2005). In each case, we applied tenfold cross-validation; the average results appear in Table 4.4. Because some classifiers, for example C4.5, are known to be sensitive to a biased proportion of training examples for the different classes (here, only two), we also tested random subsampling of the negative examples (NH) in the training data.
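The evaluation procedure can be sketched as follows. This is a minimal stand-in for Weka's tenfold cross-validation, not the actual experimental code: a majority-class baseline replaces the real classifiers, and all names and the toy data are hypothetical. It also shows why the 1:10 class ratio matters: always predicting the negative class already scores about 0.91.

```python
import random
from collections import Counter

def tenfold_cv(examples, train_and_predict, folds=10, seed=1):
    """Shuffle the examples, split them into `folds` held-out folds,
    and return the accuracy averaged over the folds."""
    rng = random.Random(seed)
    data = examples[:]
    rng.shuffle(data)
    accuracies = []
    for i in range(folds):
        test = data[i::folds]  # every folds-th example is held out
        train = [e for j, e in enumerate(data) if j % folds != i]
        predict = train_and_predict(train)
        correct = sum(predict(x) == y for x, y in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / folds

def majority_baseline(train):
    """Trivial classifier: always predict the majority training label."""
    majority = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: majority

# Toy data with a 1:10 positive-to-negative ratio, as in KH:NH.
examples = [((i,), 1) for i in range(10)] + [((i,), 0) for i in range(100)]
acc = tenfold_cv(examples, majority_baseline)  # about 0.91
```

With such skewed classes, accuracy alone is uninformative, which motivates both the subsampling experiments and reporting per-class results.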
The ratio KH:NH in <strong>the</strong> original sets is around 1:10. In some experiments <strong>the</strong>
