22.08.2013

A generic framework for Arabic to English machine ... - Acsu Buffalo


high-quality translation technology that is adequate for text-to-text translation. In this research we build an interlingua architecture for MT that translates efficiently. We consider semantic analysis and other disambiguation issues specific to Arabic. This research also represents a starting point for the future implementation of a successful and complete Arabic MT engine. The hypothesis under investigation, and the main aim, is to present an interlingua architecture that not only succeeds in translating simplex Arabic sentences (intransitive, transitive, ditransitive and copula-like nominative) into corresponding English sentences, but also does so optimally.

This research is the first contribution (not just for Arabic) that uses the Role and Reference Grammar (RRG) model as a basis for machine translation. It shows how RRG can be used to deduce the logical structure of sentences and produce a lexical representation, which can then be used as the interlingua bridge. The lexicon in RRG takes the position that lexical entries for verbs should contain unique information only, with as much information as possible derived from general lexical rules. For this reason we created our own lexicon, since we need an RRG-based lexicon holding the unique information of verbs and their logical structure.
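The lexicon principle described above (store only what is unique to a verb, derive the rest by general rule) can be illustrated with a minimal sketch. This is not UniArab's actual lexicon schema: the `VerbEntry` fields, the two verb classes, the transliterated lemmas and the `logical_structure` helper are all hypothetical names chosen for illustration. It assumes only the standard RRG notation in which a state is written `pred'(x, y)` and an activity is written `do'(x, [pred'(x, y)])`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbEntry:
    """Hypothetical lexicon entry: only verb-specific information is stored."""
    lemma: str       # source-language verb (rough transliteration, illustrative)
    predicate: str   # language-neutral predicate name used in the interlingua
    verb_class: str  # simplified here to "state" or "activity"
    arity: int       # number of core arguments the verb takes


def logical_structure(entry: VerbEntry, args: list[str]) -> str:
    """Derive the logical structure from the entry by a general rule.

    The template depends only on the verb class, so it is not repeated
    in every lexical entry.
    """
    if len(args) != entry.arity:
        raise ValueError(f"{entry.lemma} expects {entry.arity} argument(s)")
    core = f"{entry.predicate}'({', '.join(args)})"
    if entry.verb_class == "state":
        return core
    if entry.verb_class == "activity":
        return f"do'({args[0]}, [{core}])"
    raise ValueError(f"unknown verb class: {entry.verb_class}")


# Illustrative entries only; not taken from the UniArab lexicon.
LEXICON = {
    "qara'a": VerbEntry("qara'a", "read", "activity", 2),
    "ra'a": VerbEntry("ra'a", "see", "state", 2),
}

if __name__ == "__main__":
    print(logical_structure(LEXICON["qara'a"], ["Khalid", "book"]))
    print(logical_structure(LEXICON["ra'a"], ["Khalid", "Omar"]))
```

In this sketch the derived string plays the role of the interlingua representation: a target-language generator would read the predicate and its arguments from it rather than from the Arabic surface form.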

UniArab stands for Universal Arabic machine translator system. The UniArab system is a natural language processing application based on Role and Reference Grammar for translating the Arabic language into any other language, using an RRG-based interlingua bridge. The UniArab system can identify the part of speech of a word, its agreement features, number, gender and word type. The syntactic parse unpacks the agreement features between elements of the Arabic sentence into a semantic representation (the logical structure) with the ‘state of affairs’ of the sentence. In the UniArab system we intend to have a strong analysis system that can unpack all information and its attributes. This

