15.11.2013 Views

Análisis sintáctico conducido por un diccionario de patrones de ...


ABSTRACT

Syntactic analysis of Spanish has so far followed the same research path as
syntactic analysis of English. This work aims to obtain an adequate model for
syntactic structure acquisition and structure disambiguation for Spanish by
analyzing some of the language's distinctive characteristics.

In this thesis we review several formalisms chosen from the two main
perspectives developed for the syntactic analysis of natural languages:
constituent grammars and dependency grammars. We analyze how each describes
subcategorization and how that description relates to semantic roles, or
actants. We investigate the appropriate description of syntactic structures for
Spanish, a language with relaxed word order constraints, wide use of
prepositional phrases, direct objects differentiated by prepositional phrase
realization, and duplication of syntactic valences, among other characteristics.
We argue for a specific description of each predicative word (verbs, adjectives
and nouns), as defined in dependency grammars.

At present it is not possible to reproduce the way human beings disambiguate
word links in a phrase. We share the idea that human beings employ different
kinds of knowledge. Our syntactic structure acquisition model considers three
types of knowledge: lexical, semantic and phrase structure. For syntactic
ambiguity resolution we propose ranking the output of the syntactic structure
acquisition system, which is composed of a set of modules. Each module is built
on a different method that represents a specific kind of knowledge, and each
produces a set of weighted variants. The weights are based on the
characteristics satisfied under each method, so each module gives a quantitative
measure of the probability of each syntactic structure, expressed in a
dependency structure format. To disambiguate syntactic structures, a voting
module uses the weights assigned by each module and votes for the variants with
the maximum summed value. The result is a ranked list of the syntactic variants.
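
The voting step described above can be illustrated with a minimal Python sketch. It assumes that each module reports a weight per variant and that the vote is a plain sum of those weights; the abstract does not specify the combination rule beyond "maximum added value", and the module names, variant labels and weights below are hypothetical.

```python
from collections import defaultdict


def rank_variants(module_outputs):
    """Sum the weights each module assigns to each variant and
    return (variant, total) pairs sorted from highest to lowest total."""
    totals = defaultdict(float)
    for weights in module_outputs.values():
        for variant, weight in weights.items():
            totals[variant] += weight
    # Sort by descending total; break ties by variant name for determinism.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical weights for two competing dependency structures.
module_outputs = {
    "government_patterns": {"variant_a": 0.75, "variant_b": 0.25},
    "semantic_proximity": {"variant_a": 0.5, "variant_b": 0.5},
    "extended_cfg": {"variant_a": 0.5, "variant_b": 0.5},
}

ranking = rank_variants(module_outputs)
print(ranking)
```

A plain sum treats every module as equally reliable; a weighted sum per module would be an equally plausible reading of the description.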

The system includes a government patterns module, a semantic proximity module
and an extended CFG module. The three methods require the compilation of
dictionaries: the advanced government patterns dictionary, the semantic network
and the extended Context Free Grammar (CFG) rules. The advanced government
patterns represent lexical knowledge: the description of the arguments of
predicative words, similar to that in the government pattern dictionary of the
Meaning ⇔ Text Theory, associated with semantic valences. We propose an updated
description that is suitable for computational use and enriched with statistics
of syntactic realization, statistics of diverse realizations of the same
valence, and statistics of valence compatibility.
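
As a rough illustration of a dictionary entry enriched with realization statistics, the following Python sketch stores, for each valence of a predicative word, a count per syntactic realization and derives a relative frequency from it. The class names, field names and counts are hypothetical and do not reproduce the thesis's actual dictionary format; the verb "ver" and its "a + NP" object realization are used only as a familiar Spanish example.

```python
from dataclasses import dataclass, field


@dataclass
class Valence:
    """One valence of a predicative word with counts per realization."""
    name: str
    realization_counts: dict = field(default_factory=dict)

    def realization_frequency(self, realization):
        """Relative frequency of one realization among all observed ones."""
        total = sum(self.realization_counts.values())
        if total == 0:
            return 0.0
        return self.realization_counts.get(realization, 0) / total


@dataclass
class GovernmentPattern:
    """Hypothetical dictionary entry: a lemma and its valences."""
    lemma: str
    valences: list = field(default_factory=list)


# Made-up counts, for illustration only.
entry = GovernmentPattern(
    lemma="ver",
    valences=[Valence("object", {"NP": 3, "a + NP": 1})],
)

freq = entry.valences[0].realization_frequency("a + NP")
print(freq)
```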

The CFG rules represent phrase structure knowledge based on constituents. We
create an extended CFG for Spanish (with gender and number agreement) and
implement a chart parser. We assume equally weighted variants for the CFG
module. The semantic network represents semantic knowledge. When several
structures are plausible or the attachment of adjuncts is ambiguous, semantic
proximity, i.e., the concepts most closely related to the words in the possible
constituents, can help to disambiguate structure variants. The idea behind
semantic proximity is to find the shortest paths between constituents obtained
from the CFG module. It employs a
