Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...
Chapter 8. Onto.PT: a lexical ontology for Portuguese
Figure 8.4: OntoBusca, Onto.PT’s web interface.
Wordnet Association, with an Onto.PT synset.
8.3.1 Summary of evaluation so far
Among the possible strategies for evaluating an ontology, a survey by
Brank et al. (2005) presents four, which are probably the most commonly followed
when it comes to domain ontologies:
• Manual evaluation, performed by humans;
• Comparison with an existing gold standard, possibly another ontology;
• Coverage evaluation, based on a dataset from the same domain;
• Task-based evaluation, where the ontology is used by an application to achieve
some task.
Even though Onto.PT is not a domain ontology, throughout this
research, and depending on what we were evaluating, we have followed the first, the
second and the third approaches.
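As an illustration of the third strategy, the coverage of a set of extracted synonym pairs by a reference thesaurus can be computed as the fraction of distinct pairs whose two words share at least one thesaurus synset. The following is only a minimal sketch under that assumption, not the procedure used in this research; the function name `synonym_pair_coverage` and the toy data are hypothetical.

```python
from typing import Iterable, Set, Tuple


def synonym_pair_coverage(
    pairs: Iterable[Tuple[str, str]],
    thesaurus_synsets: Iterable[Set[str]],
) -> float:
    """Fraction of distinct word pairs found together in some thesaurus synset."""
    synsets = [frozenset(s) for s in thesaurus_synsets]
    # Treat pairs as unordered and drop duplicates and self-pairs.
    distinct = {frozenset(p) for p in pairs if p[0] != p[1]}
    if not distinct:
        return 0.0
    # A pair is covered when both of its words belong to the same synset.
    covered = sum(1 for pair in distinct if any(pair <= s for s in synsets))
    return covered / len(distinct)


# Hypothetical toy data, for illustration only.
thesaurus = [
    {"carro", "automóvel", "viatura"},
    {"casa", "lar", "moradia"},
]
extracted = [
    ("carro", "automóvel"),
    ("lar", "casa"),
    ("carro", "casa"),       # not covered: the words are in different synsets
    ("automóvel", "carro"),  # duplicate of the first pair, counted once
]

coverage = synonym_pair_coverage(extracted, thesaurus)
print(f"{coverage:.2f}")  # prints 0.67
```

A pair that is not covered is not necessarily wrong: the thesaurus may simply lack it, which is why coverage complements, rather than replaces, an accuracy estimate.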
First, in the extraction step (chapter 4), before manually classifying
some of the extractions, we evaluated the coverage of the extracted information
by handcrafted thesauri and by a newspaper corpus. The coverage evaluation deserves
particular attention because, as Onto.PT is broad-coverage, it is not possible to
find a corpus of its own domain. A language thesaurus is probably
the closest thing. Furthermore, the corpus was used to validate the coverage of the
relations, and this validation relied on a limited set of discriminating patterns. When it comes
to estimating the accuracy of the extracted relations, manual evaluation is probably