24.07.2013 Views

Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...


4.3. Discussion

Figure 4.5: Lexical network where ambiguity arises.

Given that they share the same structure, the utility of a resource such as CARTÃO is supported by the number of works using PAPEL. So far, PAPEL has been used as a gold standard for computing similarity between lexical items (Sarmento, 2010), in the adaptation of textual contents for low-literacy readers (Amancio et al., 2010), in the automatic generation of distractors for cloze questions (Correia et al., 2010), as a knowledge base for QA (Saias, 2010; Rodrigues et al., 2011) and question generation (Marques, 2011) systems, to validate terms describing places (Oliveira Santos et al., 2012), and in the enrichment (Silva et al., 2012b) and creation (Paulo-Santos et al., 2012) of sentiment lexicons. CARTÃO has already been used in the automatic generation of poetry (Gonçalo Oliveira, 2012).

On the other hand, lexical resources based on words, identified by their orthographical form, are not practical for several computational applications. This happens because words have different senses that range from tightly related, as in polysemy (e.g. bank, institution and building) or metonymy (e.g. bank, the building or its employees), to completely different, as in homonymy (e.g. bank, institution or slope). Moreover, there are words with completely different orthographical forms denoting the same concept (e.g. car and automobile).

Ambiguities may lead to serious inconsistencies in tasks where handling word senses is critical, as in inference. In Figure 4.5, we present an example of a term-based lexical network with several ambiguous Portuguese words, namely:

• pasta, which might refer to a briefcase, paste, pasta or money (figuratively);<br />

• massa, which might refer to pasta, to people or money (both figuratively);<br />

• pastel, which might refer to a cake or to money (figuratively);

• cacau might refer to cocoa (a fruit) or to money (also figuratively).<br />

It is not hard to imagine that, if these ambiguities are not handled, erroneous inferences can be made, such as:

• massa synonym-of povo ∧ massa hypernym-of tortellini
→ povo hypernym-of tortellini (people hypernym-of tortellini)

• dinheiro synonym-of cacau ∧ fruto hypernym-of cacau
→ fruto hypernym-of dinheiro (fruit hypernym-of money)
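The two erroneous inferences above can be reproduced with a minimal sketch. It assumes a term-based network stored as plain (word, relation, word) triples and a naive rule that propagates hypernymy across synonymy. The names `TRIPLES` and `infer_hypernyms` are hypothetical and serve only as illustration; they are not taken from PAPEL or CARTÃO.

```python
# Term-based triples: arguments are plain word forms, with no sense identifiers.
# The triples are the ones used in the two examples above.
TRIPLES = {
    ("massa", "synonym-of", "povo"),
    ("massa", "hypernym-of", "tortellini"),
    ("dinheiro", "synonym-of", "cacau"),
    ("fruto", "hypernym-of", "cacau"),
}


def infer_hypernyms(triples):
    """Naively propagate hypernymy across synonymy (assumed rule, for illustration)."""
    # Treat synonymy as symmetric.
    synonyms = set()
    for a, rel, b in triples:
        if rel == "synonym-of":
            synonyms.add((a, b))
            synonyms.add((b, a))
    hypernyms = {(a, b) for a, rel, b in triples if rel == "hypernym-of"}

    inferred = set()
    for a, b in synonyms:
        for h, x in hypernyms:
            if h == a:
                # a hypernym-of x and a synonym-of b => b hypernym-of x
                inferred.add((b, "hypernym-of", x))
            if x == a:
                # h hypernym-of a and a synonym-of b => h hypernym-of b
                inferred.add((h, "hypernym-of", b))
    # Keep only triples that were not already in the network.
    return inferred - set(triples)


INFERRED = infer_hypernyms(TRIPLES)
for triple in sorted(INFERRED):
    print(" ".join(triple))
```

Because each word form stands for all of its senses at once, the rule cannot tell that massa is a synonym of povo in one sense and a hypernym of tortellini in another, so both wrong triples are derived.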

A real example of these problems is presented in Gonçalo Oliveira et al. (2010b),
