    or(
      and(
        equal(cas[0],{nom}),
        rlook(1,end,$X,
          inter(flex[$X],{adjectival participles, noun, pronouns, verbal grammatical classes})
        ),
        equal(base[$X],{"być"}),
        rlook($+1X,end,$Y,
          or(
            inter(flex[$Y],{adjectival passive participle, noun, pronouns, verbal grammatical classes}),
            and(equal(flex[$Y],{prep}),equal(cas[$Y],{inst}))
          )
        ),
        inter(flex[$Y],{subst,ger,depr}),
        equal(base[$Y],{"NP2"}),
        equal(cas[$Y],{inst}),
        equal(nmb[$Y],nmb[0])
      ),
      a symmetrical condition for the right context
    )

Figure 4.1: The essentials of the JestInst pattern implementation in JOSKIPI

that they have very similar accuracy. That is why we decided to merge them into a complex pattern that combines the constraints using the or operator. We will refer to this pattern as mIInne (see, for example, Table 4.1).

4.2 Benefits of Handwritten Patterns for Wordnet Expansion

We ran experiments on the extraction of hypernymic pairs on the same three corpora as those used for MSR extraction (Section 3.4.5): the IPI PAN Corpus [IPIC] (≈ 254 million tokens) (Przepiórkowski, 2004), the Rzeczpospolita corpus [RzCorp] (≈ 113 million tokens) (Rzeczpospolita, 2008), and a corpus of large texts in Polish from the Internet (≈ 214 million tokens) [WebCorp]. Table 4.1 presents detailed results for three patterns, JestInst, NomToNom and mIInne.

We assessed the accuracy manually on randomly selected samples. Similarly to other manual evaluations (for example, Section 3.4.5), we determined sample sizes following the method discussed in (Israel, 1992), aiming for the 95% confidence level on the whole population.
We used a program named Sprawdzacz (Kurc, 2008) that facilitates manual evaluation of the extracted lexico-semantic relation instances.⁵

⁵ We thank Roman Kurc for his great help with the whole plWordNet project.
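The sample-size step can be illustrated with a short sketch. The text only says that the method follows Israel (1992) and targets a 95% confidence level; it does not state which formula, margin of error, or expected proportion was used. The code below therefore assumes Cochran's formula with a finite population correction, a margin of error of 0.05 and a proportion of 0.5. The function name and all default values are illustrative assumptions, not the authors' settings.

```python
import math


def cochran_sample_size(population: int,
                        z: float = 1.96,   # z-score for a 95% confidence level
                        p: float = 0.5,    # assumed proportion (most conservative)
                        e: float = 0.05):  # assumed margin of error
    """Sample size from Cochran's formula with finite population correction.

    Hypothetical helper: the source does not say which formula or
    parameter values were actually applied.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1.0 - p) / (e ** 2)
    # Shrink the sample for a finite population of extracted pairs.
    n = n0 / (1.0 + (n0 - 1.0) / population)
    return math.ceil(n)


if __name__ == "__main__":
    # Example: number of pairs to evaluate out of 10,000 extracted pairs.
    print(cochran_sample_size(10_000))
```

Under these assumptions the required sample grows slowly with the number of extracted pairs and levels off at a few hundred, which is why manual evaluation remains feasible even for large result sets.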
    and(
      in(cas[0],nom),
      llook(-1,begin,$T,equal(base[$T],{"taki"})),
      equal(base[$+1T],{"jak"}),
      only($+2T,-1,$AR,
        or(
          inter(flex[$AR],{adjective, adjectival participles, adverb, adverbial participles, noun, numeral}),
          in(orth[$AR],{"i","lub","czy","oraz","a",",",":","(",")"})
        )
      ),
      llook($-1T,begin,$N,
        and(
          inter(flex[$N],{noun}),
          equal(base[$N],{"base form of NLU2"}),
          in(cas[$N],{nom,acc,dat,inst,loc,voc})
        )
      ),
      only($+1N,$-1T,$AL,
        or(
          inter(flex[$AL],{adjective, adjectival participles, adverb, adverbial participles, numeral}),
          and(inter(flex[$AL],{noun, pronouns}),equal(cas[$AL],{gen}))
        )
      )
    )

Figure 4.2: The essentials of the TakichJak pattern implementation in JOSKIPI

During the evaluation, an extracted LU pair could be classified as a correct instance of hypernymy (possibly indirect, with longer paths accepted), or as one of two forms of nearly correct instances:

• not the expected hyponym/hypernym order; such pairs occurred more often among the results of the NomToNom pattern in which the direction is not marked by grammatical case;

• small inaccuracies in one of the LUs: it is part of a larger multiword LU, or it has a wrong number value, or it is represented by a wrong root (a tagger error).

All other pairs were classified as incorrect. The results in Table 4.1 have been calculated with the assumption that correct and nearly correct instances are positive. If we excluded the nearly correct class, the results would be about 20% lower. The results would be very low if we only sought direct hypernymy. This clearly suggests that the extracted pairs are not directly helpful in expanding the core plWordNet, but they are still a valuable source of knowledge.
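The two ways of counting described above (nearly correct pairs treated as positive, or excluded) can be sketched as follows. The label names, the helper function and the toy sample are all hypothetical and serve only to show the arithmetic; they are not data from Table 4.1.

```python
from collections import Counter

# Hypothetical label names for the three evaluation classes.
LENIENT_POSITIVE = {"correct", "nearly_correct"}  # counting used for Table 4.1
STRICT_POSITIVE = {"correct"}                     # nearly correct excluded


def accuracy(labels, positive):
    """Fraction of evaluated pairs whose label is in the positive set."""
    if not labels:
        raise ValueError("no evaluated pairs")
    counts = Counter(labels)
    return sum(counts[label] for label in positive) / len(labels)


if __name__ == "__main__":
    # Invented toy sample of ten manually judged pairs.
    sample = ["correct"] * 5 + ["nearly_correct"] * 2 + ["incorrect"] * 3
    print("lenient:", accuracy(sample, LENIENT_POSITIVE))
    print("strict: ", accuracy(sample, STRICT_POSITIVE))
```

The gap between the two figures in such a computation corresponds to the share of nearly correct pairs in the evaluated sample.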
The extracted pairs show not only semantic similarity of the LUs in a pair, but also the direction of the relation. Indirect hypernyms can be helpful