22.08.2013 Views

A generic framework for Arabic to English machine ... - Acsu Buffalo


6.5 Technical challenges

Figure 6.14: UniArab’s lexicon interface


Arabic letters in the GUI. We cannot type Arabic letters directly in UniArab’s GUI, so we use Unicode to represent them. The Unicode Converter System allows us to enter Arabic text and click a button to get the equivalent Unicode of the text.
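
The conversion step described above can be sketched in a few lines of Java. This is only an illustration, not the Unicode Converter System itself: the class name `UnicodeEscaper` and the method `toEscapes` are hypothetical, and the output format (Java-style `\uXXXX` escapes) is an assumption about what "the equivalent Unicode of the text" looks like. The sample word is kataba ("he wrote"), written with escapes so the source file stays pure ASCII.

```java
class UnicodeEscaper {
    // Hypothetical helper: keeps printable ASCII as-is and rewrites every
    // other UTF-16 code unit as a backslash-u escape with four hex digits.
    static String toEscapes(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= 0x20 && c < 0x7F) {
                out.append(c);
            } else {
                // Cast to int: the %x conversion does not accept a char.
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The Arabic word kataba: kaf, ta, ba.
        String kataba = "\u0643\u062a\u0628";
        System.out.println(toEscapes(kataba));
    }
}
```

The escaped form can be pasted into Java source or a GUI field that accepts only ASCII, and the compiler turns it back into the original Arabic letters.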

Arabic letters in Eclipse IDE for Java. We used the Eclipse IDE for Java development. We could not write Arabic as a string in Eclipse. While Java does support Arabic, the problem lies in the operating system not supporting Arabic letter shapes in the IDE. We used Windows XP and Windows 2000, which both have the same problem. To fix this we changed to Ubuntu Linux. Under Linux we can write Arabic text as a string in the Eclipse IDE.

Arabic in data source. We chose to create our data source as XML, for optimum support of different platforms. It was also easier, as we could use Arabic letters rather than Unicode codes inside the data source. XML fully supports Arabic. We created our search engine using Java. We used a HashMap keyed on the Arabic word, so that a search inside the data source uses the Arabic keyword directly. We call verbMap.containsKey(word) to check for the presence of an Arabic word in the data source.
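
A minimal sketch of this lookup, under stated assumptions: the source does not show the lexicon schema, so the element names (`verbs`, `verb`, `arabic`, `english`), the class `VerbLexicon`, the method `load`, and the two sample entries are all hypothetical. Only `verbMap.containsKey(word)` comes from the text. The Arabic words are written as escapes inside the Java string to keep the source ASCII; the parsed XML contains the actual Arabic letters.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class VerbLexicon {
    // Hypothetical lexicon layout; the real UniArab schema is not shown
    // in the text. Entries: kataba (write) and qara'a (read).
    static final String SAMPLE_XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<verbs>"
        + "<verb><arabic>\u0643\u062a\u0628</arabic><english>write</english></verb>"
        + "<verb><arabic>\u0642\u0631\u0623</arabic><english>read</english></verb>"
        + "</verbs>";

    // Parses the XML and builds a map from Arabic verb to English gloss.
    static Map<String, String> load(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> verbMap = new HashMap<>();
            NodeList verbs = doc.getElementsByTagName("verb");
            for (int i = 0; i < verbs.getLength(); i++) {
                Element verb = (Element) verbs.item(i);
                String arabic = verb.getElementsByTagName("arabic").item(0).getTextContent();
                String english = verb.getElementsByTagName("english").item(0).getTextContent();
                verbMap.put(arabic, english);
            }
            return verbMap;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse lexicon XML", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> verbMap = load(SAMPLE_XML);
        String word = "\u0643\u062a\u0628"; // kataba
        if (verbMap.containsKey(word)) {
            System.out.println(verbMap.get(word));
        }
    }
}
```

Because Java strings are sequences of UTF-16 code units, an Arabic key hashes and compares like any other string, so no transliteration step is needed before the lookup.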
