22.08.2013 Views

A generic framework for Arabic to English machine ... - Acsu Buffalo


7.3. SYSTEM EVALUATION

search engine to extract these extra modifiers. Hence, we need only a single entry for the
basic noun, rather than an entry for each possible occurrence. This reduces the size of
the lexicon and hence speeds up the search routine.
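The single-entry idea can be illustrated with a minimal sketch. It assumes a transliterated toy lexicon and short, hypothetical affix lists; the names `LEXICON`, `PREFIXES`, `SUFFIXES` and `lookup` are illustrative only and are not taken from UniArab, whose actual modifier set and search routine are not shown in this section.

```python
# Hypothetical toy lexicon: one entry per basic noun (transliterated for readability).
LEXICON = {
    "kitab": {"pos": "noun", "gloss": "book"},
    "walad": {"pos": "noun", "gloss": "boy"},
}

# Hypothetical modifier affixes (assumption: not UniArab's real list).
PREFIXES = ("al",)        # e.g. the definite-article prefix
SUFFIXES = ("an", "un")   # e.g. indefinite case endings


def lookup(token):
    """Return (entry, prefix, suffix) for a token, or None if no base form is found."""
    # Try the bare token first, then each prefix/suffix combination.
    for prefix in ("",) + PREFIXES:
        if not token.startswith(prefix):
            continue
        rest = token[len(prefix):]
        for suffix in ("",) + SUFFIXES:
            if suffix and not rest.endswith(suffix):
                continue
            base = rest[:len(rest) - len(suffix)] if suffix else rest
            entry = LEXICON.get(base)
            if entry is not None:
                return entry, prefix, suffix
    return None  # unknown word: no base form in the lexicon


if __name__ == "__main__":
    print(lookup("alkitab"))
    print(lookup("kitaban"))
    print(lookup("qalam"))
```

With this layout, adding a noun means adding one entry for its basic form; the modified forms are recovered at search time rather than stored.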

At the moment, the lexicon is categorised into seven parts of speech. We have designed
the GUI so that when a word is added to the lexicon, only the options relevant to its
part of speech are presented to the user. This minimises errors when entering data.
As our research extends, we may need to modify the categorisation of the lexicon to allow
for more complicated word types.

So far in this research, UniArab does not process ambiguous words or complex sentences.
The research focussed first on discovering whether the logical structure of a sentence,
based on RRG, can be used for translation. Hence, we decided to limit the scope of the
project, since this is a new area that has not been investigated before. We fully
expect to expand the system to cope with ambiguity in the future. The system's
reliability depends on the data source, and the system fails to handle unknown words.
UniArab does not process single words, even if those words are in its lexicon, because
UniArab is built on the logical structure of verbs.

In our comparison with other translation systems we have used simplex sentences. While
UniArab is limited to simplex sentences and has limited coverage, we believe it is essential
to achieve high-quality translation of these sentences first, in order to be able to expand
to high-quality translation of more complex sentences. We can see that the existing tools
cannot even achieve reasonable translations of simplex sentences, so how can we expect
them to give high-quality translations of larger texts? We have found that small errors in
the initial analysis of a sentence can cause huge errors in the final translation, so
high-quality analysis is very important.
