24.07.2013 Views

Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...


8.4. Using Onto.PT

In addition to the previous baseline, we had two official runs where Onto.PT was used to perform an additional expansion on the 67 topics containing verb phrases (VPs) with only one verb. There, the verbs were disambiguated, and their synonyms were used as search alternatives. The main idea behind this expansion was to improve the system’s recall. Still, only alternatives with more than 20 occurrences in the corpora provided by AC/DC were used. The only difference between these two runs was that, in run number 2, disambiguation was performed using the Bag-of-Words algorithm, while run number 3 used Personalized PageRank. Moreover, after the official evaluation, we sent additional unofficial runs, where, besides other experiments, we had runs similar to 2 and 3, but this time the category of all the topics was disambiguated and expanded as well.
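The expansion step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in Rapportágico: the function names `pick_synset` and `expand_verb` are hypothetical, the word-overlap count is a simple stand-in for the Bag-of-Words disambiguation (whose exact definition is not given in this section), and the candidate synsets and corpus frequencies are invented for the example. Only the threshold of more than 20 occurrences and the OR-joined alternatives come from the text.

```python
from typing import Dict, List, Set


def pick_synset(context: Set[str], candidates: List[Set[str]]) -> Set[str]:
    """Pick the candidate synset that shares the most words with the context.

    Assumption: a plain overlap count stands in for the Bag-of-Words
    disambiguation; ties are resolved in favour of the first candidate.
    """
    return max(candidates, key=lambda synset: len(synset & context))


def expand_verb(
    verb: str,
    synset: Set[str],
    corpus_freq: Dict[str, int],
    min_occurrences: int = 20,
) -> str:
    """Build an OR query from the verb and its sufficiently frequent synonyms.

    Only synonyms with MORE than `min_occurrences` corpus occurrences are kept.
    """
    alternatives = sorted(
        word
        for word in synset
        if word != verb and corpus_freq.get(word, 0) > min_occurrences
    )
    return " OR ".join([verb] + alternatives)


# Hypothetical data, for illustration only.
context = {"tribo", "território", "ocupar"}
candidates = [
    {"habitar", "morar", "residir"},
    {"habitar", "colonizar", "povoar", "ocupar"},
]
corpus_freq = {"colonizar": 150, "povoar": 12, "ocupar": 900}

chosen = pick_synset(context, candidates)
print(expand_verb("habitar", chosen, corpus_freq))
```

With these invented frequencies, "povoar" falls below the threshold and is left out of the query, which shows how the frequency filter trades some of the recall gained by expansion for fewer rare or noisy alternatives.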

In order to illustrate how expansion worked, figure 8.6 presents the expansions of the category and the VP of the previously shown topics, obtained with Personalized PageRank. For the sake of simplicity, we omitted the hypernymy pattern from the category expansion.

| Topic | Category (original) | Category (expanded) | VP (original) | VP (expanded) |
|---|---|---|---|---|
| 5 | tribo | grupo OR tribo | habitavam | habitar OR colonizar OR povoar OR ocupar |
| 6 | viajantes ou exploradores | viajante OR peregrino OR viageiro OR passageiro OR caminhante OR viandante OR explorador | escreveram | redigir OR escrever OR grafar |
| 7 | sambistas | sambador OR sambista | abordam | tratar OR apalavrar OR abordar OR versar |

Figure 8.6: Category and VP expansions in Rapportágico, using Onto.PT.

Given the simplistic approach followed by Rapportágico and the high complexity of Págico, we can say that the results obtained were interesting. Rapportágico’s performance was below that of most of the human participants, but it was better than that of RENOIR (Cardoso, 2012), the other automatic participant. Nevertheless, RENOIR also followed a simplistic approach, and was heavily penalised by the large number of given answers per topic (100). The most relevant conclusion for our research was that the runs where VPs were expanded into their synonyms performed better than the baseline approach. Among these two runs, Personalized PageRank performed better than the Bag-of-Words method.

The results of the official participation of Rapportágico in Págico are shown in table 8.7, for each run. In the same table, we present the results of the best human participation (actually, a group of participants), ludIT (Veiga et al., 2012), which show that we are still very far from a human approach to this task, and the results of the best run of RENOIR. Performance is given by the following measures:

• Answered topics: number of topics with at least one given answer.

• Given answers: total number of given answers.
