
Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...



Chapter 8

Onto.PT: a lexical ontology for Portuguese

In the previous chapters, we have presented individual automatic steps towards the acquisition and integration of lexical-semantic knowledge in an LKB. Each step is implemented by a module and, when combined with the others as described by the diagram in figure 8.1, results in the three-step approach we propose for the automatic construction of a wordnet-like resource. This approach was named ECO, which stands for Extraction, Clustering and Ontologisation. Briefly, the ECO approach starts by extracting instances of semantic relations, represented as tb-triples, from textual sources. Then, synsets are discovered from the extracted synonymy tb-triples (synpairs). If there is an available synset-based thesaurus, its synsets are first augmented, and new synsets are only discovered from the remaining synpairs. Finally, the term arguments of the non-synonymy tb-triples are ontologised, which means that they are attached to the synsets in the thesaurus that convey suitable meanings and make the tb-triple true. This results in a wordnet, where synsets are connected by semantic relations (sb-triples).

Figure 8.1: Diagram of the ECO approach for creating wordnets from text.
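The data flow of the three steps can be illustrated with a minimal Python sketch. It is not the implementation used in this work. The extraction step is assumed to have already produced the tb-triples, the relation names and the functions `discover_synsets` and `ontologise` are hypothetical, clustering is replaced by plain connected components of the synonymy graph, and ontologisation naively picks the first synset that contains a term instead of selecting the most suitable sense.

```python
from collections import defaultdict

# Hypothetical toy output of the extraction step: (term, relation, term).
tb_triples = [
    ("casa", "synonym_of", "lar"),
    ("lar", "synonym_of", "moradia"),
    ("porta", "part_of", "casa"),
    ("animal", "hypernym_of", "gato"),
]


def discover_synsets(synpairs, thesaurus=()):
    """Augment existing synsets first, then cluster the remaining synpairs."""
    synsets = [set(s) for s in thesaurus]
    remaining = []
    for a, b in synpairs:
        # Simplification: a synpair augments the first synset sharing a term.
        target = next((s for s in synsets if a in s or b in s), None)
        if target is not None:
            target.update((a, b))
        else:
            remaining.append((a, b))
    # Stand-in for clustering: connected components of the synonymy graph.
    graph = defaultdict(set)
    for a, b in remaining:
        graph[a].add(b)
        graph[b].add(a)
    seen = set()
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            term = stack.pop()
            if term not in component:
                component.add(term)
                stack.extend(graph[term] - component)
        seen |= component
        synsets.append(component)
    return synsets


def ontologise(triples, synsets):
    """Turn non-synonymy tb-triples into sb-triples over synset indices."""
    def find(term):
        for index, synset in enumerate(synsets):
            if term in synset:
                return index  # naive choice: no sense disambiguation
        synsets.append({term})  # unseen term gets a single-term synset
        return len(synsets) - 1
    return [(find(a), rel, find(b)) for a, rel, b in triples]


synpairs = [(a, b) for a, rel, b in tb_triples if rel == "synonym_of"]
others = [t for t in tb_triples if t[1] != "synonym_of"]
synsets = discover_synsets(synpairs)
sb_triples = ontologise(others, synsets)
```

Passing a non-empty `thesaurus` to `discover_synsets` mimics the case where an existing synset-based thesaurus is augmented before new synsets are discovered from the remaining synpairs.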

Each module of ECO is completely independent of the others and can be used alone, in order to achieve its specific task. For instance, given a set of synpairs, an existing thesaurus may be enriched automatically and have its original synsets
