Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...
Chapter 2
Background Knowledge

The topic of Natural Language Processing (NLP), described extensively by Jurafsky
and Martin (2009), is commonly presented with the help of futuristic visions from
pop culture, in which robots are capable of holding a conversation with people in
human language. Those visions are typically embodied by film or television
characters, such as HAL 9000 in Stanley Kubrick's classic 2001: A Space Odyssey¹,
or Bender and other robots in Matt Groening's Futurama².

NLP is a field of artificial intelligence (AI; Russell and Norvig, 1995) whose
main purpose is to enable machines to understand the language of people and thus to
communicate with us, in our own language, as if the machines were people themselves.
Given that natural language, used by humans for communication, is probably the
most natural way of encoding, transmitting and reasoning about knowledge, most
knowledge repositories are in written form (Santos, 1992). Therefore, the emergence
of the NLP field from AI is not surprising.

One of the main problems with natural language is that it differs from formal
languages (e.g. programming languages): in the latter, each symbol has only one
meaning, while in the former a symbol may have different meanings, depending on
the context in which it is used. Ambiguity occurs when it is not possible to assign
a single meaning to a form of communication, because it can be interpreted in more
than one way.

In the work described in this thesis, several NLP techniques are applied in order
to obtain language resources, structured around words, which can later be used in
various NLP tasks. This chapter provides background knowledge on two topics that
are important for this thesis: lexical semantics, a subfield of NLP; and information
extraction, an NLP task. The representation and organisation of lexical-semantic
knowledge is discussed between these two topics. At the end, we add some remarks
that connect the described background knowledge with the work developed in the
scope of this thesis. We decided to keep this chapter more theoretical, while the
next chapter describes practical work, including existing lexical-semantic
resources as well as work on information extraction from text.

¹ See http://www.imdb.com/title/tt0062622/ (August 2012)
² See http://www.imdb.com/title/tt0149460/ (August 2012)