24.07.2013 Views

Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...


Chapter 4. Acquisition of Semantic Relations

1. Part of a grammar, with rules for extracting hypernymy (HIPERONIMO DE), part-of/has-part (PARTE DE/TEM PARTE), and purpose-of (FAZ SE COM) relations, and the definition of an empty head (CABECA VAZIA):

RAIZ ::= HIPERONIMO DE ...

...

RAIZ ::= CABECA VAZIA

CABECA VAZIA ::= parte

...

RAIZ ::= ... usado para FAZ SE COM

RAIZ ::= parte de TEM PARTE

RAIZ ::= ... que contém DET PARTE DE

2. Dictionary entries (definiendum, POS, definition) and relations extracted using the previous rules:

candeia nome utensílio doméstico rústico usado para iluminação, com pavio abastecido a óleo

→ utensílio HIPERONIMO DE candeia

→ com FAZ SE COM candeia

→ iluminação FAZ SE COM candeia

espiga nome parte das gramíneas que contém os grãos

→ espiga PARTE DE gramíneas

→ grãos PARTE DE espiga

3. POS-tagging, cleaning and lemmatisation:

candeia nome utensílio#n doméstico#adj rústico#adj usado#v-pcp para#prp iluminação#n ,#punc com#prp pavio#n abastecido#v-pcp a#prp óleo#n

→ utensílio HIPERONIMO DE candeia

→ iluminação FAZ SE COM candeia

espiga nome parte#n de#prp as#art gramíneas#n que#pron-indp contém#v-fin os#art grãos#n

→ espiga PARTE DE gramínea

→ grão PARTE DE espiga

Figure 4.2: Extraction of semantic relations from dictionary definitions.
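The steps illustrated in Figure 4.2 can be approximated in a few lines of Python. The following is a minimal sketch, not the grammars or parser actually used: it replaces the grammar with hand-written token checks for the three rule fragments shown, assumes definitions are already POS-tagged in the `word#tag` format of step 3, and uses a hypothetical lookup table (`TOY_LEMMAS`) in place of a real lemmatiser. All function and variable names are illustrative.

```python
# Toy resources; both are assumptions made for this sketch.
EMPTY_HEADS = {"parte"}  # mirrors CABECA VAZIA ::= parte
TOY_LEMMAS = {"gramíneas": "gramínea", "grãos": "grão"}  # stand-in for a lemmatiser


def parse_tagged(definition):
    """Split 'word#tag word#tag ...' into (word, tag) pairs."""
    return [tuple(token.rsplit("#", 1)) for token in definition.split()]


def lemma(word):
    """Look the word up in the toy table; fall back to the word itself."""
    return TOY_LEMMAS.get(word, word)


def next_noun(tokens, start):
    """Return the lemma of the first noun at or after `start`, or None."""
    for word, tag in tokens[start:]:
        if tag == "n":
            return lemma(word)
    return None


def extract(definiendum, tagged_definition):
    """Return (arg1, RELATION, arg2) triples found in one tagged definition."""
    tokens = parse_tagged(tagged_definition)
    words = [word for word, _ in tokens]
    triples = []

    # Genus rule: the first noun is a hypernym unless it is an empty head.
    head = next_noun(tokens, 0)
    if head and head not in EMPTY_HEADS:
        triples.append((head, "HIPERONIMO DE", definiendum))

    for i, word in enumerate(words):
        if word == "usado" and words[i + 1:i + 2] == ["para"]:
            # "... usado para X": X FAZ SE COM definiendum
            arg = next_noun(tokens, i + 2)
            if arg:
                triples.append((arg, "FAZ SE COM", definiendum))
        elif i == 0 and word == "parte" and words[1:2] == ["de"]:
            # "parte de X": definiendum PARTE DE X
            arg = next_noun(tokens, 2)
            if arg:
                triples.append((definiendum, "PARTE DE", arg))
        elif word == "que" and words[i + 1:i + 2] == ["contém"]:
            # "... que contém X": X PARTE DE definiendum
            arg = next_noun(tokens, i + 2)
            if arg:
                triples.append((arg, "PARTE DE", definiendum))
    return triples


candeia = ("utensílio#n doméstico#adj rústico#adj usado#v-pcp para#prp "
           "iluminação#n ,#punc com#prp pavio#n abastecido#v-pcp a#prp óleo#n")
espiga = ("parte#n de#prp as#art gramíneas#n que#pron-indp "
          "contém#v-fin os#art grãos#n")

for entry, definition in [("candeia", candeia), ("espiga", espiga)]:
    for triple in extract(entry, definition):
        print(" ".join(triple))
```

Because arguments are restricted to tokens tagged as nouns, the preposition "com" is never proposed as an argument of FAZ SE COM, which is the difference between steps 2 and 3 of the figure.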

4.2 A large lexical network for Portuguese

The relation acquisition procedure was used to create CARTÃO (Gonçalo Oliveira et al., 2011), a large term-based lexical-semantic network for Portuguese, extracted from dictionaries. Given the incompleteness of dictionaries (Ide and Véronis, 1995), we exploited not one but three electronic dictionaries of Portuguese, namely:

• Dicionário PRO da Língua Portuguesa (DLP, 2005), indirectly, through the results of the PAPEL project;

• Dicionário Aberto (DA) (Simões and Farinha, 2011; Simões et al., 2012);

• Wiktionary.PT⁶.

6 Available from http://pt.wiktionary.org/ (September 2012)
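One simple way to picture a term-based network built from several dictionaries is as the union of the triples extracted from each of them, with every triple remembering where it was found. The sketch below is only an illustration under that assumption; it is not a description of how CARTÃO was actually assembled. The source labels (`dict_A`, `dict_B`) and the assignment of triples to them are invented for the example.

```python
from collections import defaultdict


def merge_triples(triples_by_source):
    """Map each distinct triple to the set of sources in which it was found."""
    network = defaultdict(set)
    for source, triples in triples_by_source.items():
        for triple in triples:
            network[triple].add(source)
    return dict(network)


# Hypothetical input: labels and triple-to-source assignment are made up.
extracted = {
    "dict_A": [
        ("utensílio", "HIPERONIMO DE", "candeia"),
        ("grão", "PARTE DE", "espiga"),
    ],
    "dict_B": [
        ("utensílio", "HIPERONIMO DE", "candeia"),
    ],
}

network = merge_triples(extracted)
for triple, sources in sorted(network.items()):
    print(" ".join(triple), "<-", ", ".join(sorted(sources)))
```

Keeping the set of sources per triple makes it easy to see which relations are supported by more than one dictionary and which come from a single one.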
