24.07.2013 Views

Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...



Chapter 2

Background Knowledge

The topic of Natural Language Processing (NLP), described extensively by Jurafsky and Martin (2009), is commonly introduced with the help of futuristic visions from popular culture, in which robots are capable of holding a conversation with people in human language. Those visions are typically embodied by film or television characters, such as HAL 9000 in Stanley Kubrick’s classic 2001: A Space Odyssey 1 , or Bender and other robots in Matt Groening’s Futurama 2 .

NLP is a field of artificial intelligence (AI; Russell and Norvig, 1995) whose main purpose is to enable machines to understand the language of people and thus to communicate with us, in our own language, as if the machines were people themselves. Given that natural language, used by humans for communication, is probably the most natural way of encoding, transmitting and reasoning about knowledge, most knowledge repositories are in written form (Santos, 1992). Therefore, the emergence of the NLP field from AI is not surprising.

One of the main problems concerning natural language is how it differs from formal languages (e.g. programming languages): in the latter, each symbol has only one meaning, while in the former a symbol may have different meanings, depending on the context in which it is used. Ambiguity occurs when it is not possible to assign a single meaning to a form of communication, because it can be interpreted in more than one way.

In the work described in this thesis, several NLP techniques are applied in order to obtain language resources, structured in words, which can later be used in various NLP tasks. This chapter provides background knowledge that introduces two important topics for this thesis: lexical semantics, a subfield of NLP; and information extraction, an NLP task. The representation and organisation of lexical-semantic knowledge is discussed between these two topics. At the end, we add some remarks that connect the described background knowledge with the work developed in the scope of this thesis. We decided to keep this chapter more theoretical, while the next chapter describes practical work, including existing lexical-semantic resources as well as work on information extraction from text.

1 See http://www.imdb.com/title/tt0062622/ (August 2012)

2 See http://www.imdb.com/title/tt0149460/ (August 2012)
