22.08.2013 Views

A generic framework for Arabic to English machine ... - Acsu Buffalo



6.2 UniArab: Lexical representation in interlingua system based on RRG

Lexical frames represent the language-dependent lexicon. We use an XML data source to represent the UniArab lexicon. The lexicon creates pointers to corresponding conceptual frames or attributes of each word. These frames also have relations which link them to verb class frames, which are organized hierarchically according to the particular language.

Although we adhere to the Interlingua approach, we do not apply it to the translation of lexical items. In an ideal Interlingua system, lexical entries should be broken down into sets of semantic features. For example, the word “man” is broken down into +human +male +adult. While this works in theory, in practice we cannot find enough semantic features to describe every entity in the world. For example, “cow”, “computer” and “chair” cannot be described using these sets of semantic features unless we invent a unique semantic feature for every object, which is practically impossible.
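The limitation can be illustrated with a small sketch. It assumes a feature inventory containing only the three features named above, and it assumes, as a simplification, that “cow”, “computer” and “chair” carry none of them. The dictionary layout and the function name `indistinguishable` are hypothetical and are not part of UniArab.

```python
# Assumed feature inventory: only the three features named in the text.
FEATURES = {"human", "male", "adult"}

# Toy lexicon (an assumption for illustration): each word maps to the
# features it carries. The three non-human words are assumed to carry none.
LEXICON = {
    "man": {"human", "male", "adult"},
    "cow": set(),
    "computer": set(),
    "chair": set(),
}


def indistinguishable(lexicon):
    """Return groups of words whose feature sets are identical."""
    groups = {}
    for word, feats in lexicon.items():
        groups.setdefault(frozenset(feats), []).append(word)
    return [sorted(words) for words in groups.values() if len(words) > 1]


collisions = indistinguishable(LEXICON)
print(collisions)  # [['chair', 'computer', 'cow']]
```

With this inventory, “man” has a unique description, while the other three words collapse into one group. Telling them apart would require adding features until each object has its own, which is the problem described above.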

6.2.1 Verb

In the UniArab system, we capture the information shown in Figure 6.4 for each verb. The verb information captured consists of Arabic Verb, English Translation, Logical Structure, Tense, Gender, Person and Number. The Arabic Verb represents one of the Arabic verbs in a specific tense, for a specific gender, person and number. The English translation is the English equivalent of the Arabic verb. The Logical Structure attribute is the RRG equivalent logical structure or lexical entry representation for the Arabic Verb.
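As a rough sketch of how such a verb entry might be stored in an XML data source and read back, consider the following. The element names (`lexicon`, `verb`, `arabic`, `logicalStructure`, and so on), the sample entry and its values are assumptions made for illustration only. The actual UniArab schema is the one shown in Figure 6.4, which is not reproduced here.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical XML layout: one <verb> element per inflected Arabic verb form.
# The sample entry and its logical structure are illustrative assumptions.
SAMPLE_XML = """<lexicon>
  <verb>
    <arabic>قرأ</arabic>
    <english>read</english>
    <logicalStructure>do'(x, [read'(x, y)])</logicalStructure>
    <tense>past</tense>
    <gender>masculine</gender>
    <person>3</person>
    <number>singular</number>
  </verb>
</lexicon>"""


@dataclass(frozen=True)
class VerbEntry:
    """The seven attributes listed in the text for each verb."""
    arabic: str
    english: str
    logical_structure: str
    tense: str
    gender: str
    person: str
    number: str


def load_verbs(xml_text):
    """Parse every <verb> element in the lexicon into a VerbEntry."""
    root = ET.fromstring(xml_text)
    return [
        VerbEntry(
            arabic=v.findtext("arabic"),
            english=v.findtext("english"),
            logical_structure=v.findtext("logicalStructure"),
            tense=v.findtext("tense"),
            gender=v.findtext("gender"),
            person=v.findtext("person"),
            number=v.findtext("number"),
        )
        for v in root.findall("verb")
    ]


verbs = load_verbs(SAMPLE_XML)
print(verbs[0].english, verbs[0].tense)
```

Because each entry stands for one inflected form, a verb with several tense, gender, person and number combinations would appear as several `verb` elements under this layout.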

Arabic inflects verbs for tense, and verbs agree in person, number and gender with the subject. In RRG, Tense is a verbal operator in the layered structure of the clause providing
