A generic framework for Arabic to English machine ... - Acsu Buffalo
4.4 LINGUISTIC ASPECTS OF MT
4.4.1 Non-Roman alphabet scripts
Since computer technology developed mostly in English, other languages, particularly
those with non-Roman alphabets, have historically been treated as special cases that required
new code sets to define their character representations. Furthermore, not all languages with
alphabetic scripts are written left-to-right (e.g. Arabic and Hebrew), so any input/output
device that makes this assumption is useless for such languages. Before Unicode was
standardised, several different encoding systems were used to address this problem. Unicode
provides a unique code for every character, regardless of platform, program or
language. Appendix A provides the corresponding Unicode code point for each Arabic letter
and describes the letters with their corresponding written shapes.
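As a minimal sketch of the point about unique codes and writing direction, the following Python snippet uses the standard `unicodedata` module to look up the code point, the Unicode name and the bidirectional class of a character. The helper name `describe` and the three sample characters are illustrative assumptions, not part of the framework described here.

```python
import unicodedata


def describe(ch: str) -> tuple:
    """Return (code point, Unicode name, bidirectional class) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))


# Sample characters chosen for illustration: Latin A, Arabic beh, Hebrew alef.
for ch in ["A", "\u0628", "\u05D0"]:
    print(describe(ch))
# ('U+0041', 'LATIN CAPITAL LETTER A', 'L')
# ('U+0628', 'ARABIC LETTER BEH', 'AL')
# ('U+05D0', 'HEBREW LETTER ALEF', 'R')
```

The bidirectional classes `AL` (Arabic letter) and `R` (right-to-left) are what rendering software consults instead of assuming left-to-right output.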
4.4.2 Lexical ambiguity
Lexical ambiguities arise when a single word can potentially be analysed or interpreted
in two or more ways. They are of three basic types: category ambiguities, homographs and
transfer (or translational) ambiguities.
4.4.2.1 Category ambiguity
The simplest type of lexical ambiguity is category ambiguity: a given word can
be assigned to more than one grammatical or syntactic category (e.g. noun, verb or
adjective) according to the context. There are several examples of this in English: light
can be a noun, verb or adjective, and control can be a noun or a verb. In Arabic there
are also words that can belong to more than one category; for example, ʿlā can be a
preposition meaning “on” or a verb meaning “raise”.
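Category ambiguity can be pictured as a lexicon lookup that returns more than one category for a word. The sketch below is a toy illustration only: the `LEXICON` table, the function names and the unambiguous entry "alphabet" are hypothetical, and the two ambiguous entries are the English examples given above.

```python
# Toy lexicon: each word maps to the set of categories it can take.
# "alphabet" is a hypothetical unambiguous entry added for contrast.
LEXICON = {
    "light": {"noun", "verb", "adjective"},
    "control": {"noun", "verb"},
    "alphabet": {"noun"},
}


def categories(word: str) -> list:
    """Return the sorted list of categories recorded for a word (empty if unknown)."""
    return sorted(LEXICON.get(word.lower(), set()))


def is_category_ambiguous(word: str) -> bool:
    """A word is category-ambiguous if the lexicon lists more than one category."""
    return len(categories(word)) > 1


for w in ["light", "control", "alphabet"]:
    print(w, categories(w), is_category_ambiguous(w))
```

A real system would then use the surrounding context to choose among the candidate categories; that step is not shown here.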