
Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...


Chapter 8. Onto.PT: a lexical ontology for Portuguese

Figure 8.4: OntoBusca, Onto.PT’s web interface.

Wordnet Association, with an Onto.PT synset.

8.3.1 Summary of evaluation so far

Among the possible strategies to evaluate an ontology, a survey by Brank et al. (2005) presents four, which are probably the most commonly followed when it comes to domain ontologies:

• Manual evaluation, performed by humans;
• Comparison with an existing gold standard, possibly another ontology;
• Coverage evaluation, based on a dataset on the same domain;
• Task-based evaluation, where the ontology is used by an application to achieve some task.
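To make the third strategy concrete, coverage evaluation can be pictured as measuring how many extracted items also appear in a reference dataset. The following is only a minimal sketch under stated assumptions: the function name `coverage`, the representation of relations as unordered word pairs, and the toy Portuguese data are all hypothetical and are not taken from the thesis.

```python
def coverage(extracted_pairs, reference_pairs):
    """Return the fraction of extracted pairs that also occur in the reference.

    Assumption: pairs are treated as unordered (as for a symmetric relation
    such as synonymy), so ("carro", "automóvel") matches ("automóvel", "carro").
    """
    extracted = {frozenset(pair) for pair in extracted_pairs}
    reference = {frozenset(pair) for pair in reference_pairs}
    if not extracted:
        return 0.0  # nothing was extracted, so nothing can be covered
    return len(extracted & reference) / len(extracted)


# Toy data, invented for illustration only.
extracted = [("carro", "automóvel"), ("casa", "lar"), ("gato", "cão")]
reference = [("automóvel", "carro"), ("lar", "casa")]

print(f"coverage = {coverage(extracted, reference):.2f}")
```

In this toy case, two of the three extracted pairs are found in the reference. A pair missing from the reference is not necessarily wrong, which is why coverage is a different measure from accuracy.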

Even though Onto.PT is not a domain ontology, throughout this research, and depending on what was being evaluated, we followed the first, the second and the third approaches.

First, in the extraction step (chapter 4), before performing the manual classification of some extractions, we evaluated how much of the extracted information is covered by handcrafted thesauri and by a newspaper corpus. The coverage evaluation deserves some caution because, as Onto.PT is broad-coverage, it is not possible to find something like a corpus of its own domain. A language thesaurus is probably the closest thing. Furthermore, the corpus was used to validate the coverage of the relations, and this validation was based on a limited set of discriminating patterns. When it comes to estimating the accuracy of the extracted relations, manual evaluation is probably
