22.08.2013 Views

A generic framework for Arabic to English machine ... - Acsu Buffalo



6.1. CONCEPTUAL STRUCTURE OF THE UNIARAB SYSTEM

has been implemented as a set of XML documents. The use of XML has the added advantage of portability: UniArab behaves the same regardless of the operating system. To understand the morphology of each word, we first tokenize each sentence and determine the word relationships. Phase 5 of the system holds all attributes specific to each word of the source sentence.
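The tokenize-and-look-up step described above can be sketched as follows. This is a minimal illustration only, not UniArab's implementation: the XML element name `entry`, the attribute names `form`, `gloss`, `pos`, and `gender`, and the helper functions are all assumptions made for this example, and whitespace splitting stands in for real Arabic tokenization.

```python
import xml.etree.ElementTree as ET

# Hypothetical lexicon fragment. The schema (element and attribute names)
# is an assumption for illustration, not the actual UniArab lexicon format.
LEXICON_XML = """<lexicon>
  <entry form="qrʾa" gloss="read" pos="V" gender="M"/>
  <entry form="ḫāld" gloss="Khalid" pos="N" gender="M"/>
  <entry form="ālktāb" gloss="book" pos="N" gender="M"/>
</lexicon>"""


def load_lexicon(xml_text):
    """Parse the XML lexicon into a dict keyed by transliterated word form."""
    root = ET.fromstring(xml_text)
    return {entry.get("form"): dict(entry.attrib) for entry in root.iter("entry")}


def tokenize(sentence):
    """Split on whitespace; a real tokenizer would also handle clitics."""
    return sentence.split()


def annotate(sentence, lexicon):
    """Attach the lexicon attributes to each token of the source sentence."""
    unknown = lambda tok: {"form": tok, "pos": "UNKNOWN"}
    return [lexicon.get(tok, unknown(tok)) for tok in tokenize(sentence)]


if __name__ == "__main__":
    lexicon = load_lexicon(LEXICON_XML)
    for word in annotate("qrʾa ḫāld ālktāb", lexicon):
        print(word)
```

Keeping the lexicon in XML and loading it into a plain dictionary keeps the lookup independent of the platform, which matches the portability argument made above.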

Phase (6) Syntactic Parser: Determines the precise phrasal structure and category of the Arabic sentence. At this point, the types and attributes of all words in the sentence are known.

Phase (7) Syntactic linking (RRG): To create a logical structure that will generate the target language, we must first develop the link from syntax to semantics out of the phrasal structure created in Phase 6; the same link also operates in the opposite direction, from semantics to syntax. In this phase the system should answer the main question: who does what to whom? We use the gender of the verb to determine the actor. When the subject and object have different genders, the gender of the verb must match that of the subject. If both nouns agree with the verb, then MSA dictates that the first noun is the subject. In the example sentence ‘Khalid read the book’, the actor is Khalid and the undergoer is the book.
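The gender-agreement rule in Phase 7 can be sketched as a small function. This is an illustrative reading of the rule as stated above, not the system's actual code: the `Word` class, its field names, and the function `link_actor_undergoer` are hypothetical, and the sketch only covers a verb followed by exactly two nouns.

```python
from dataclasses import dataclass


@dataclass
class Word:
    gloss: str   # English gloss, e.g. "Khalid"
    pos: str     # part of speech: "V" or "N"
    gender: str  # "M" or "F"


def link_actor_undergoer(verb, nouns):
    """Return (actor, undergoer) for a verb and its two nouns.

    `nouns` must be in sentence order. The noun whose gender matches the
    verb is the subject (actor); if both match, the first noun wins.
    """
    first, second = nouns
    first_agrees = first.gender == verb.gender
    second_agrees = second.gender == verb.gender

    if first_agrees:
        # Covers both "only the first agrees" and "both agree".
        return first, second
    if second_agrees:
        return second, first
    raise ValueError("neither noun agrees in gender with the verb")


if __name__ == "__main__":
    verb = Word("read", "V", "M")
    nouns = [Word("Khalid", "N", "M"), Word("book", "N", "M")]
    actor, undergoer = link_actor_undergoer(verb, nouns)
    print(actor.gloss, undergoer.gloss)
```

In the example sentence both nouns are masculine and agree with the verb, so the word-order fallback applies and the first noun is taken as the actor.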

Phase (8) Logical Structure: Creation of logical structure is the most crucial phase. An accurate representation of the logical structure of an Arabic sentence is the primary strength of UniArab. Below is a sample output from the UniArab system. The Arabic equivalent of the past tense sentence ‘Khalid read the book’, qrʾa ḫāld ālktāb, is input as the source.

ālktāb book:N    ḫāld Khalid:MsgN    qrʾa read:V

The results of the parse can be seen in the following logical structure:

Verb read
