Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...

Chapter 6 Thesaurus Enrichment

General language dictionaries and language thesauri cover the same kind of knowledge, but represent it differently. While the former consist of lists of word senses and their natural language sense descriptions, the latter group synonymous words together, so that they can be seen as possible lexicalisations of concepts. WordNet (Fellbaum, 1998) can actually be seen as a resource that bridges the gap between both kinds of resources, because each synset contains a textual gloss.

However, in previous chapters, we have shown that, even though they intend to cover the same kind of knowledge, most of the information in public handcrafted Portuguese thesauri is complementary to the information extracted from dictionaries. Therefore, it should be more fruitful to integrate their information in Onto.PT instead of using them merely as a reference for comparison. Another aspect in favour of this option is that, besides its size, TeP was manually created by experts. This means that, more than integrating the information in TeP, we can take advantage of its structure to obtain more reliable synsets and more controlled sense granularity.

The work presented in this chapter can be seen either as an alternative or as a complement to the previous chapter, as we use the synsets of TeP as a starting point for the construction of a broader thesaurus. To this end, we follow a four-step approach for enriching an existing electronic thesaurus, structured in synsets, with information extracted from electronic dictionaries, represented as synonymy pairs (synpairs) [1]:

1. Extraction of synpairs from dictionary definitions;
2. Assignment of synpairs to suitable synsets of the thesaurus;
3. Discovery of new synsets after clustering the remaining synpairs;
4. Integration of the new synsets in the thesaurus.

In step 1, any approach for the automatic acquisition of synpairs from dictionaries, such as the one described in chapter 4, may be followed. Therefore, we will not go further into this step. We start this chapter by presenting its main contribution, which is the algorithm for the automatic assignment of synpairs to synsets. Then, we evaluate the algorithm against a gold standard and select the most adequate settings for using it in the enrichment of TeP. Any graph clustering procedure suits step 3 of our approach. We chose to follow an approach similar to the one introduced

[1] Synpairs are synonymy tb-triples. They can be extracted from several sources. However, as we are dealing with general language knowledge, dictionaries are the obvious targets.
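To make steps 2 to 4 more concrete, the following is a minimal Python sketch, assuming the synpairs of step 1 have already been extracted. It is not the assignment algorithm presented in this chapter: the overlap rule in `assign_synpairs`, the `min_overlap` parameter, the use of connected components as the clustering procedure, and the toy data are all assumptions made for illustration only.

```python
from collections import defaultdict


def build_graph(synpairs):
    """Return an adjacency map of the synonymy graph defined by the synpairs."""
    graph = defaultdict(set)
    for a, b in synpairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def assign_synpairs(synsets, synpairs, min_overlap=2):
    """Step 2 (hypothetical rule): attach a synpair to the synset that holds
    one of its words and shares the most words with the synpair neighbours
    of the other word. Pairs scoring below min_overlap are left unassigned."""
    graph = build_graph(synpairs)
    originals = [set(s) for s in synsets]   # scores use the original synsets
    enriched = [set(s) for s in synsets]    # copies that receive new words
    remaining = []
    for a, b in synpairs:
        if any(a in s and b in s for s in originals):
            continue  # the thesaurus already covers this pair
        best_index, best_score = None, 0
        for index, synset in enumerate(originals):
            for inside, outside in ((a, b), (b, a)):
                if inside in synset and outside not in synset:
                    score = len(graph[outside] & synset)
                    if score > best_score:
                        best_index, best_score = index, score
        if best_index is not None and best_score >= min_overlap:
            enriched[best_index].update((a, b))
        else:
            remaining.append((a, b))
    return enriched, remaining


def cluster_synpairs(synpairs):
    """Step 3 (assumed procedure): connected components of the graph built
    from the unassigned synpairs, used here as the simplest graph clustering."""
    graph = build_graph(synpairs)
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            word = stack.pop()
            if word not in component:
                component.add(word)
                stack.extend(graph[word] - component)
        seen |= component
        clusters.append(component)
    return clusters


def enrich(synsets, synpairs, min_overlap=2):
    """Steps 2 to 4: assign, cluster the rest, append the new synsets."""
    enriched, remaining = assign_synpairs(synsets, synpairs, min_overlap)
    return enriched + cluster_synpairs(remaining)


# Toy data, invented for this example.
thesaurus = [{"carro", "auto"}, {"banco", "assento"}]
pairs = [("carro", "viatura"), ("auto", "viatura"), ("banco", "caixa"),
         ("gato", "bichano"), ("bichano", "felino")]
enriched_thesaurus = enrich(thesaurus, pairs)
```

In this toy run, "viatura" joins the first synset because two of its members support it, whereas ("banco", "caixa") has only one supporting word and is therefore left for clustering, where it forms a new synset. Appending the new synsets unchanged is the simplest possible reading of step 4; a real integration step might also merge or filter them.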