Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...
8.4. Using Onto.PT
In addition to the previous baseline, we had two official runs where Onto.PT
was used to perform an additional expansion on the 67 topics containing verb
phrases (VPs) with only one verb. There, the verbs were disambiguated, and their
synonyms were used as search alternatives. The main idea behind this expansion was
to improve the system's recall. Still, only alternatives with more than 20
occurrences in the corpora provided by AC/DC were used. The only difference
between these two runs was that, in run number 2, disambiguation was performed using
the Bag-of-Words algorithm, while run number 3 used Personalized PageRank.
Moreover, after the official evaluation, we submitted additional unofficial runs which,
among other experiments, included runs similar to 2 and 3, but where the category
of every topic was disambiguated and expanded as well.
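The filtering step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `expand_verb`, the choice to always keep the original lemma, and the frequency counts in `toy_freq` are all hypothetical and are not taken from Rapportágico or from the AC/DC corpora. Only the threshold (more than 20 occurrences) comes from the text.

```python
from typing import Dict, List

MIN_OCCURRENCES = 20  # from the text: alternatives need MORE than 20 occurrences


def expand_verb(lemma: str, synonyms: List[str], corpus_freq: Dict[str, int],
                min_occurrences: int = MIN_OCCURRENCES) -> str:
    """Join a verb lemma and its sufficiently frequent synonyms with OR."""
    # Assumption: the original lemma is always kept as a search alternative.
    alternatives = [lemma]
    for synonym in synonyms:
        frequent = corpus_freq.get(synonym, 0) > min_occurrences
        if frequent and synonym not in alternatives:
            alternatives.append(synonym)
    return " OR ".join(alternatives)


# Hypothetical counts, made up for illustration only.
toy_freq = {"colonizar": 150, "povoar": 75, "ocupar": 900, "morar": 20}
query = expand_verb("habitar",
                    ["habitar", "colonizar", "povoar", "ocupar", "morar"],
                    toy_freq)
print(query)
```

With these toy counts, "morar" is dropped because it occurs exactly 20 times, which does not exceed the threshold, while the remaining synonyms are kept as alternatives.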
In order to illustrate how expansion worked, Figure 8.6 presents the expansions
of the category and the VP of the previously shown topics, obtained with
Personalized PageRank. For the sake of simplicity, we omitted the hypernymy pattern
from the category expansion.
Topic | Original category | Expanded category | Original VP | Expanded VP
5 | tribo | grupo OR tribo | habitavam | habitar OR colonizar OR povoar OR ocupar
6 | viajantes ou exploradores | viajante OR peregrino OR viageiro OR passageiro OR caminhante OR viandante OR explorador | escreveram | redigir OR escrever OR grafar
7 | sambistas | sambador OR sambista | abordam | tratar OR apalavrar OR abordar OR versar
Figure 8.6: Category and VP expansions in Rapportágico, using Onto.PT.
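Before a term can be expanded, one of its candidate synsets has to be selected. As a rough illustration of the simpler of the two disambiguation methods mentioned, the sketch below picks the candidate whose bag of words shares the most words with the topic context. It is one common reading of bag-of-words disambiguation, not necessarily the exact algorithm used with Onto.PT. The function name, the synset identifiers, and the toy bags and context are all hypothetical.

```python
from typing import Dict, List, Optional, Set


def bag_of_words_disambiguate(context: List[str],
                              candidates: Dict[str, Set[str]]) -> Optional[str]:
    """Return the id of the candidate synset that overlaps most with the context.

    `candidates` maps a synset id to its bag of words (for instance, the synset
    members plus words from related synsets). Ties go to the first candidate.
    """
    context_bag = set(context)
    best_id: Optional[str] = None
    best_overlap = -1
    for synset_id, bag in candidates.items():
        overlap = len(context_bag & bag)  # number of shared words
        if overlap > best_overlap:
            best_id, best_overlap = synset_id, overlap
    return best_id


# Hypothetical candidate synsets for the verb "abordar" (illustration only).
toy_candidates = {
    "abordar#1": {"tratar", "versar", "tema", "assunto"},
    "abordar#2": {"atracar", "navio", "bordo"},
}
toy_context = ["sambistas", "abordam", "tema", "política"]
chosen = bag_of_words_disambiguate(toy_context, toy_candidates)
print(chosen)
```

In this toy case only the first candidate shares a word ("tema") with the context, so it is the one selected, and its members would then be used as search alternatives.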
Given the simplistic approach followed by Rapportágico and the high complexity
of Págico, we can say that the obtained results were interesting. Rapportágico's
performance was below that of most human participants, but it was better than
that of RENOIR (Cardoso, 2012), the other automatic participant. Nevertheless, RENOIR
also followed a simplistic approach, and was heavily penalised by the large number
of given answers per topic (100). The most relevant conclusion for our research was
that the runs where VPs were expanded into their synonyms performed better than
the baseline approach. Of these two runs, the one using Personalized PageRank performed
better than the one using the Bag-of-Words method.
The results of the official participation of Rapportágico in Págico are shown in
Table 8.7, for each run. In the same table, we present the results of the best human
participation (actually, a group of participants), ludIT (Veiga et al., 2012), which
show that we are still very far from human performance on this task, as well as the
results of the best run of RENOIR. Performance is given by the following measures:
• Answered topics: number of topics with at least one given answer.
• Given answers: total number of given answers.