22.08.2013 Views

A generic framework for Arabic to English machine ... - Acsu Buffalo



the generation of any target language text or rather more correctly: any target language included in the system from the outset or planned for the future. In effect, this high degree of language-independence and objectivity means that interlinguas must strive towards universality in lexicon and structure: one might almost say, towards representing the meaning of the text. Most interlingua-based systems use representations. The Chomskyan theory of deep structures was thought to be attractive, but it is now agreed they are not sufficiently abstract, being too oriented towards the surface features of individual languages. The implications of neutral structural representations can be illustrated by allowing for differences of word order between languages, and their significance. In English, word order is the primary means of distinguishing grammatical functions like subject and object. The Arabic language has a relatively free word order. The implication for an interlingua is that it is not enough to designate word order on its own: the interlingua must represent the significance in terms of grammatical function (syntactic relations), text function, determination, case role or whatever else the interpretation of the word-order dictates. Structural differences can be treated in transfer-based systems by structural transfer rules. But in interlingua-based systems the representation must be language-neutral.
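The point about word order can be made concrete with a small sketch. It assumes a toy representation in which each constituent carries a grammatical-function label and a language-neutral concept symbol; the class `Constituent`, the function `to_neutral`, the label names and the example sentence are illustrative assumptions and are not taken from the framework described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constituent:
    surface: str   # surface form as it appears in the sentence
    concept: str   # hypothetical language-neutral concept symbol
    function: str  # hypothetical label: "predicate", "subject" or "object"


def to_neutral(constituents):
    """Discard surface order and keep only function -> concept pairs."""
    return {c.function: c.concept for c in constituents}


# Illustrative Arabic clause in verb-subject-object order (transliterated).
arabic_vso = [
    Constituent("kataba", "WRITE", "predicate"),
    Constituent("al-waladu", "BOY", "subject"),
    Constituent("ar-risalata", "LETTER", "object"),
]

# The English equivalent in subject-verb-object order.
english_svo = [
    Constituent("the boy", "BOY", "subject"),
    Constituent("wrote", "WRITE", "predicate"),
    Constituent("the letter", "LETTER", "object"),
]

# Both surface orders map to the same order-free representation.
same = to_neutral(arabic_vso) == to_neutral(english_svo)
```

Because the representation is keyed by grammatical function rather than by position, the two sentences compare equal even though their constituents appear in different orders. A real interlingua would also need to record text function, determination and case role, which this sketch leaves out.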

5.2 Designing an XML lexicon architecture for Arabic MT based on RRG

The lexicon in RRG takes the position that lexical entries for verbs should contain unique information only, with as much information as possible derived from general lexical rules. The lexicon is designed to reflect the word categories in the Arabic language with as much information as possible derived from general lexical rules. The lexicon stores the Arabic words in categories; each category is stored in an XML format datasource
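As a rough illustration of keeping one word category per XML datasource, the sketch below parses a small verb lexicon with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`lexicon`, `category`, `entry`, `form`, `gloss`), the sample entries and the helper `load_category` are assumptions made for this example; the actual schema is not given in the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical datasource for a single category (verbs).
# The schema is an assumption made for illustration only.
VERB_LEXICON_XML = """
<lexicon category="verb">
  <entry id="v1">
    <form>kataba</form>
    <gloss>write</gloss>
  </entry>
  <entry id="v2">
    <form>qara'a</form>
    <gloss>read</gloss>
  </entry>
</lexicon>
"""


def load_category(xml_text):
    """Parse one category datasource into (category, {form: gloss})."""
    root = ET.fromstring(xml_text.strip())
    category = root.get("category")
    entries = {
        entry.findtext("form"): entry.findtext("gloss")
        for entry in root.iter("entry")
    }
    return category, entries


category, entries = load_category(VERB_LEXICON_XML)
```

Each entry here holds only word-specific information (a form and a gloss), in keeping with the idea that everything predictable should come from general lexical rules rather than being stored per entry.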
