Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...
Chapter 3. Related Work
• part, n: a piece of something;
• piece, n: a portion of some material;
Amsler (1981) believes that these loops are usually evidence of a truly primitive
concept, such as the set containing the words class, group, type, kind, set,
division, category, species, individual, grouping, part and section. These primitives
are often related to “covert categories” (Ide and Véronis, 1995), which are concepts
that do not correspond to any particular word and are introduced to represent
a specific category or group of concepts. For instance, there is no word to describe
the hypernym of the concepts described by tool, utensil, implement and instrument,
so a new “covert” hypernym, instrumental-object, is artificially created.
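Once each headword is linked to the genus term of its definition, loops of this kind can be found mechanically. The sketch below is only an illustration and not Amsler's procedure: the `genus_of` mapping is a hypothetical toy dictionary (the entries for portion and hammer are invented), and the hypothetical helper `find_loop` simply follows genus links until a word repeats.

```python
def find_loop(word, genus_of):
    """Follow genus links from `word`; return the loop reached, or None."""
    path = []   # words visited, in order
    seen = {}   # word -> position in `path`
    while word in genus_of:
        if word in seen:
            # The chain came back to a visited word: the loop is the
            # part of the path starting at its first occurrence.
            return path[seen[word]:]
        seen[word] = len(path)
        path.append(word)
        word = genus_of[word]
    return None  # the chain ended in a word with no entry


# Hypothetical toy data: the first two links mirror the definitions above;
# the others are invented for illustration.
genus_of = {
    "part": "piece",
    "piece": "portion",
    "portion": "part",  # assumed, so that the chain closes
    "hammer": "tool",   # assumed; "tool" has no entry, so no loop
}

print(find_loop("part", genus_of))
print(find_loop("hammer", genus_of))
```

In a real dictionary a definition may have several candidate genus terms and senses, so the mapping would not be a simple one-to-one table as assumed here.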
Chodorow et al. (1985) introduced the notion of “empty heads”. Words belonging
to this small class (e.g. one, any, kind, class, manner, family, race, group,
complex) may occur at the beginning of the definition, followed by the preposition
of, but do not represent the superordinate concept. Guthrie et al. (1990) explored
the class of “empty heads” to extract other semantic relations besides hyponymy.
For instance, the word member is associated with the member-set relation (Markowitz
et al., 1986) and the word part with the is-part relation (included by Amsler
(1981) in his tangled hierarchies). Concerning this problem, Nakamura and Nagao
(1988) provide a list of function nouns that appear at the beginning of dictionary
definitions, and the relations they are usually associated with:
• kind, type → is-a
• part, side, top → part-of
• set, member, group, class, family → membership
• act, way, action → action
• state, condition → state
• amount, sum, measure → amount
• degree, quality → degree
• form, shape → form
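The list above can be read as a lookup table from function nouns to relations. The following sketch shows how such a table might be applied to a definition; the table itself comes from the list, but the regular expression (an optional article, a function noun, "of", an optional article, then the related word) and the function name `relation_from_definition` are assumptions made for illustration, not the method of Nakamura and Nagao (1988).

```python
import re

# Function nouns and their usual relations, as listed above.
FUNCTION_NOUNS = {
    "kind": "is-a", "type": "is-a",
    "part": "part-of", "side": "part-of", "top": "part-of",
    "set": "membership", "member": "membership", "group": "membership",
    "class": "membership", "family": "membership",
    "act": "action", "way": "action", "action": "action",
    "state": "state", "condition": "state",
    "amount": "amount", "sum": "amount", "measure": "amount",
    "degree": "degree", "quality": "degree",
    "form": "form", "shape": "form",
}

# Assumed definition pattern: [article] NOUN of [article] WORD ...
PATTERN = re.compile(
    r"^(?:(?:a|an|the)\s+)?(\w+)\s+of\s+(?:(?:a|an|the)\s+)?(\w+)",
    re.IGNORECASE,
)


def relation_from_definition(definition):
    """Return (relation, related_word), or None if no function noun leads."""
    match = PATTERN.match(definition.strip())
    if match is None:
        return None
    noun, target = match.group(1).lower(), match.group(2).lower()
    relation = FUNCTION_NOUNS.get(noun)
    if relation is None:
        return None  # the head is not a listed function noun
    return relation, target


print(relation_from_definition("a kind of tree"))
print(relation_from_definition("the top of a mountain"))
print(relation_from_definition("a piece of something"))
```

The pattern takes the first word after "of" as the related word, so modifiers (as in "a kind of large tree") would be picked up instead of the noun; a real extractor would need at least part-of-speech information to avoid this.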
Another typical issue is the disambiguation of the genus, which consists
of matching words that appear in the definition with their correct sense in the
dictionary. In Amsler (1981) and Chodorow et al. (1985), this task requires human
intervention. Some years later, Bruce and Guthrie (1992) worked on an automatic
procedure for genus disambiguation. First, they identify the genus of the
definition. Then, they exploit category markups (e.g. plant, solid) and frequency
information to disambiguate the genus with 80% accuracy.
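To make the idea of combining category markups with frequency information concrete, a toy sketch is given below. It is not the procedure of Bruce and Guthrie (1992), whose details are not reproduced here: the `Sense` record, the rule of preferring senses whose category equals that of the headword, the frequency tie-break, and all data values are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sense:
    sense_id: str
    category: str   # category markup, e.g. "plant" or "solid"
    frequency: int  # hypothetical frequency count


def disambiguate_genus(headword_category, candidate_senses):
    """Pick a sense of the genus word for a headword of a given category.

    Assumed rule: prefer senses sharing the headword's category and
    break ties (or fall back, if none match) by highest frequency.
    """
    if not candidate_senses:
        return None
    matching = [s for s in candidate_senses
                if s.category == headword_category]
    pool = matching or candidate_senses
    return max(pool, key=lambda s: s.frequency)


# Hypothetical senses of the genus word "plant" (all values invented).
senses = [
    Sense("plant_1", "plant", 40),  # living organism
    Sense("plant_2", "solid", 55),  # factory
]

# A headword marked "plant" (e.g. fern) selects the organism sense,
# even though the factory sense is more frequent in this toy data.
print(disambiguate_genus("plant", senses).sense_id)
```

With a headword whose category matches no candidate sense, the sketch falls back to the most frequent sense, which is one simple way of using frequency when category information is not decisive.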
More recently, Navigli (2009a) presented an algorithm to disambiguate words in
dictionary definitions. The approach is based on the exploitation of circularity in
dictionaries.
Electronic dictionaries are certainly an important source of lexical-semantic
knowledge, but their organisation does not favour their direct use as NLP tools,
since they were made to be read by humans. Wilks et al. (1988) mention several