24.07.2013 Views

Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...

Chapter 1. Introduction

Chapter 5 describes how synonymy networks extracted, for instance, from dictionaries may be exploited in the discovery of synsets. To this end, a clustering procedure is run on these networks and the discovered clusters are used as synsets. The clustering algorithm actually discovers fuzzy synsets, where a weight is associated with the membership of a word in a synset, giving a more realistic representation of words and meanings. This part of the work led to the creation of the Portuguese fuzzy thesaurus, CLIP, completely extracted from dictionaries.
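To make the notion of a fuzzy synset concrete, the following is a minimal sketch and not the clustering algorithm of Chapter 5. It assumes a synonymy network given as word pairs and a cluster that has already been discovered, and it weights each candidate word by the fraction of the cluster's other members it is linked to. The function names, the weighting scheme, and the English example words are all illustrative assumptions.

```python
def build_graph(pairs):
    """Build an undirected synonymy network: word -> set of synonyms."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def fuzzy_membership(graph, cluster):
    """Assumed weighting: share of the cluster's other members a word is linked to."""
    # Candidates are the cluster members plus their direct neighbours.
    candidates = set(cluster)
    for word in cluster:
        candidates |= graph.get(word, set())

    weights = {}
    for word in candidates:
        others = set(cluster) - {word}
        if others:
            linked = graph.get(word, set()) & others
            weights[word] = len(linked) / len(others)
    return weights


# Hypothetical synonymy pairs, as might be extracted from dictionary definitions.
pairs = [("car", "automobile"), ("car", "auto"),
         ("automobile", "auto"), ("auto", "bus")]
graph = build_graph(pairs)
weights = fuzzy_membership(graph, {"car", "automobile", "auto"})
```

In this toy network the three core words are fully connected to each other, while "bus" is linked to only one of the three members, so it receives a lower membership weight.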

Chapter 6 presents an approach for enriching the synsets of a thesaurus with synonymy relations extracted from dictionaries. Given that there are freely available synset-based resources for Portuguese, we decided to exploit them in the creation of Onto.PT. Therefore, we used a public handcrafted thesaurus as the starting point for the creation of a broader synset-base. First, synonymy instances are assigned to the most similar synset. Then, the relations not assigned to any synset are clustered to discover new synsets, which are later added to the synset-base. This part of the work gave rise to TRIP, a large synset-based thesaurus of Portuguese.
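The two-step enrichment described above can be sketched as follows. This is only an illustration under stated assumptions: the similarity measure (Jaccard overlap between the word pair and a synset), the threshold, and the function names are hypothetical choices, not the measure actually used in Chapter 6. Pairs that match no existing synset are returned separately, so they could later be clustered into new synsets.

```python
def jaccard(a, b):
    """Jaccard overlap between two sets of words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign_pairs(synsets, pairs, threshold=0.0):
    """Attach each synonymy pair to its most similar synset, if any.

    Returns the enriched synsets and the pairs left unassigned.
    """
    enriched = [set(s) for s in synsets]
    unassigned = []
    for a, b in pairs:
        pair = {a, b}
        # Similarity is computed against the original, unenriched synsets.
        scores = [jaccard(pair, s) for s in synsets]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is None or scores[best] <= threshold:
            unassigned.append((a, b))
        else:
            enriched[best] |= pair
    return enriched, unassigned


# Hypothetical handcrafted synsets and dictionary-extracted synonymy pairs.
synsets = [{"car", "automobile"}, {"bank", "shore"}]
pairs = [("car", "auto"), ("shore", "riverside"), ("happy", "glad")]
enriched, unassigned = assign_pairs(synsets, pairs)
```

Here "auto" and "riverside" are added to the synsets they overlap with, while the pair ("happy", "glad") shares no word with any synset and is left for the clustering step.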

Chapter 7 proposes several algorithms for moving from term-based semantic relations to relations held between the synsets of a wordnet. After establishing the synset-base of Onto.PT, we still had a lexical network with semantic relations held between terms, not synsets. Therefore, we developed and compared a set of algorithms that take advantage of the lexical network, and of the lexical items in the synsets, to select suitable synsets for attaching each argument of the lexical network's relations. Given a synset-base and a lexical network, the result of these algorithms is a wordnet.
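As a rough illustration of the task, and not one of the algorithms compared in Chapter 7, the sketch below attaches a term-based relation to a pair of synsets. It assumes a simple heuristic: among the synsets containing each argument, pick the pair that is supported by the most term-level relations of the same type in the lexical network. The relation name, the example data, and the function name are hypothetical.

```python
from itertools import product


def attach_relation(triple, synsets, network):
    """Pick the synset pair best supported by the lexical network.

    triple  -- (term_a, relation, term_b)
    synsets -- list of sets of words (the synset-base)
    network -- set of term-based triples (the lexical network)
    Returns a pair of synset indices, or None if an argument is in no synset.
    """
    a, rel, b = triple
    cands_a = [i for i, s in enumerate(synsets) if a in s]
    cands_b = [j for j, s in enumerate(synsets) if b in s]

    def support(i, j):
        # Count term-level relations of the same type between the two synsets.
        return sum((x, rel, y) in network
                   for x in synsets[i] for y in synsets[j])

    return max(product(cands_a, cands_b),
               key=lambda ij: support(*ij), default=None)


# Hypothetical synset-base: "bank" is ambiguous between two synsets.
synsets = [{"bank", "shore"},
           {"bank", "depository"},
           {"institution", "establishment"}]
network = {("bank", "hyponym_of", "institution"),
           ("depository", "hyponym_of", "establishment")}
best = attach_relation(("bank", "hyponym_of", "institution"), synsets, network)
```

In this example the financial sense of "bank" is selected, because the relation between "depository" and "establishment" gives that synset pair additional support.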

Chapter 8 summarises the work described in the previous chapters, which may be combined in ECO, the automatic approach for creating Onto.PT, a new lexical ontology for Portuguese. An overview of this resource is presented first, together with some details on its availability and evaluation. The last section suggests possible scenarios where Onto.PT might be useful.

Chapter 9 presents a final discussion of this research and highlights its main contributions. Finally, some suggestions are given for further improvements and additional work.
