24.07.2013 Views

Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...


Chapter 6. Thesaurus Enrichment

6.2 Evaluation of the assignment procedure

The evaluation of the assignment procedure has two main goals. First, it quantifies the performance of the assignment algorithm. Second, it enables the selection of the most adequate settings, including the similarity measure and the best threshold σ to use in the integration of the synpairs of PAPEL/CARTÃO in TeP.

6.2.1 The gold resource<br />

To compare the performance of the assignment algorithm using different settings, we randomly selected 355 noun synpairs of PAPEL 2.0 (Gonçalo Oliveira et al., 2010b) and had them assigned, by two human annotators, to the synsets of TeP 2.0 (Maziero et al., 2008). Before their selection, we made sure that all 355 synpairs had at least one candidate synset in TeP. The manually assigned synpairs constitute a small gold collection, used to evaluate the procedure with different settings. Even though the creation of this resource was a time-consuming task, we now have a reference that helps us understand the behavior of the algorithm. Furthermore, it is now possible to repeat this kind of evaluation as many times as needed.

Lexical-semantic knowledge is typically subjective and thus hard to evaluate. Judgements depend heavily on the vocabulary range and intuition of the human annotator. Moreover, when it comes to the division of words into senses, there is no consensus even among expert lexicographers, because word senses are fuzzy most of the time and because language evolves every day (see section 4.3).

In order to minimise this problem, both annotators manually selected the assignments for the same 355 synpairs. On average, there were 4.31 candidate synsets for each synpair, with a standard deviation of 3.27. Also on average, the first annotator assigned each synpair to 2.03±1.37 synsets, while, for the second, this number was 2.64±2.30. Their assignments matched in 70% of the cases and their kappa agreement was 0.43, which means fair/moderate agreement (Landis and Koch, 1977; Green, 1997) and shows, once again, how subjective it is to evaluate this kind of knowledge.
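The relation between raw matching and kappa can be illustrated with a minimal sketch. The text does not say how the kappa value was computed, so the code below assumes Cohen's kappa over binary decisions, one per (synpair, candidate synset) pair, where 1 means the annotator assigned the synpair to that synset. The function name `cohen_kappa` and the ten toy decisions are hypothetical and are not taken from the gold resource.

```python
def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items on which both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: probability of agreeing if each annotator labelled
    # at random according to their own label distribution.
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # degenerate case: both annotators always use one label
    return (observed - expected) / (1.0 - expected)


# Toy data (hypothetical): 1 = "synpair belongs to this candidate synset".
annotator_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
annotator_b = [1, 0, 0, 0, 1, 1, 1, 0, 1, 0]

matching = sum(a == b for a, b in zip(annotator_a, annotator_b)) / len(annotator_a)
kappa = cohen_kappa(annotator_a, annotator_b)
print(f"matching = {matching:.2f}, kappa = {kappa:.2f}")
```

In this toy case the annotators agree on 7 of 10 decisions, yet kappa is only about 0.4, because half of that agreement would be expected by chance. This is why a 70% match can coexist with an agreement that is only fair to moderate.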

6.2.2 Scoring the assignments

In order to select the best assignment settings, we performed an extensive comparison of the assignment performance, using different similarity measures (introduced in section 6.1.2) and different thresholds σ. In all the experimentation runs, we used all the noun synpairs in CARTÃO, which includes PAPEL 3.0 and the synpairs extracted from Wiktionary.PT and DA, to establish the synonymy network for computing similarities. More on the size of this network and its coverage by TeP can be found in section 6.4.1.

The evaluation score of each setting was obtained using typical information retrieval measures, namely precision, recall and F-score. For a synpair in the set of assigned synpairs, $p_i \in P$, these measures are computed as follows:

$$\mathit{Precision}_i = \frac{|\mathit{Selected}_i \cap \mathit{Correct}_i|}{|\mathit{Selected}_i|}$$

$$\mathit{Precision} = \frac{1}{|P|} \sum_{i=1}^{|P|} \mathit{Precision}_i$$
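As a minimal sketch of the precision computation, the code below averages the per-synpair precision over all assigned synpairs. It assumes that, for each synpair, the synsets selected by the algorithm and the synsets marked correct in the gold resource are available as sets of identifiers. The function names and the synset identifiers (`s1`, `s2`, ...) are hypothetical. Recall and F-score are left out because their formulas are not shown in this excerpt.

```python
def precision_i(selected, correct):
    """Precision for one synpair: |Selected_i ∩ Correct_i| / |Selected_i|."""
    selected, correct = set(selected), set(correct)
    if not selected:
        # P contains only assigned synpairs, so Selected_i is never empty here.
        raise ValueError("a synpair in P must have at least one selected synset")
    return len(selected & correct) / len(selected)


def average_precision_over_synpairs(assignments):
    """Mean of precision_i over all (selected, correct) pairs in P."""
    if not assignments:
        raise ValueError("P must not be empty")
    return sum(precision_i(s, c) for s, c in assignments) / len(assignments)


# Toy data (hypothetical): one (selected, correct) pair per assigned synpair.
toy_assignments = [
    ({"s1", "s2"}, {"s1"}),                          # 1 of 2 selected is correct
    ({"s3"}, {"s3", "s4"}),                          # 1 of 1 selected is correct
    ({"s5", "s6", "s7", "s8"}, {"s5", "s6", "s9"}),  # 2 of 4 selected are correct
]

overall = average_precision_over_synpairs(toy_assignments)
print(f"Precision = {overall:.3f}")
```

Because the average is taken per synpair, every synpair weighs the same, regardless of how many candidate synsets it was assigned to.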
