PhD thesis - School of Informatics - University of Edinburgh
Chapter 6. Other Potential Applications
GERMAN: Mit diesem Tool können Sie parallel in sämtlichen News suchen.
ENGLISH(BABELFISH): With this Tool you can search parallel in all news.<br />
FRENCH(BABELFISH): Avec ce Tool, vous pouvez chercher parallèlement tous les News dans.
While it was established in Sections 2.1.3.2 and 4.2 that French contains a large
number of anglicisms, at least in the domain of IT, such anglicisms may not always
be the choice preferred by a human translator (HT), who may instead produce the
following French translation of the German sentence:
FRENCH(HT): Avec cet outil, vous pouvez chercher toutes les actualités en parallèle.
Mixed-lingual compounds or interlingual homographs are even more of a challenge
to MT systems. One very interesting example occurs in the following German query
(see footnote 10):
GERMAN: Nenne einen Grund für Selbstmord bei Teenagern.<br />
ENGLISH(BABELFISH): Call a reason for suicide with dte rodents.<br />
FRENCH(BABELFISH): cite une raison de suicide avec des Teenagern.<br />
The English inclusion Teenager appears in the dative plural and consequently
receives the German inflection -n. Instead of treating this noun as an English inclusion
when translating the sentence into English, Babelfish processes the token as the
German compound Tee+Nagern (tea + rodents) and translates its subparts into the token
dte (see footnote 11) and the noun rodents. When translating into French, the MT system
treats the English inclusion as unseen and inserts it directly into the translation
without further processing, such as inflection removal. Combined with a named entity
recogniser, the English inclusion classifier could signal to the MT engine which items
require either translating or transferring with respect to the target language.
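The contrast described above can be illustrated with a minimal Python sketch. It is not the classifier developed in this thesis, nor a model of how Babelfish works internally: the toy lexicons, the suffix list and the function names (naive_compound_split, strip_german_inflection, annotate) are all assumptions made for illustration. The first function shows how a purely lexicon-driven compound splitter arrives at Tee+Nagern; the other two show how a token could instead be flagged as an English inclusion and stripped of its German inflection before it reaches the MT engine.

```python
# Toy lexicons, invented for this example only.
GERMAN_LEXICON = {"tee", "nagern", "grund", "selbstmord"}
ENGLISH_INCLUSIONS = {"teenager", "tool", "news"}

# A few German inflectional endings (assumed, far from complete).
# The empty suffix is tried first so uninflected inclusions match as they are.
GERMAN_SUFFIXES = ("", "n", "s", "en", "es")


def naive_compound_split(token, lexicon=GERMAN_LEXICON):
    """Split a token into two parts if both parts are known German words."""
    lower = token.lower()
    for i in range(1, len(lower)):
        head, tail = lower[:i], lower[i:]
        if head in lexicon and tail in lexicon:
            return (head, tail)
    return None


def strip_german_inflection(token, inclusions=ENGLISH_INCLUSIONS):
    """Return the English base form if the token is an inflected inclusion."""
    lower = token.lower()
    for suffix in GERMAN_SUFFIXES:
        if suffix and not lower.endswith(suffix):
            continue
        stem = lower[: len(lower) - len(suffix)] if suffix else lower
        if stem in inclusions:
            return stem
    return None


def annotate(tokens):
    """Tag each token as an English inclusion or as German text.

    Each result is (surface form, tag, base form); an MT engine could use
    the tag to decide whether to translate or transfer the item.
    """
    annotated = []
    for token in tokens:
        base = strip_german_inflection(token)
        if base is not None:
            annotated.append((token, "english_inclusion", base))
        else:
            annotated.append((token, "german", token))
    return annotated


if __name__ == "__main__":
    print(naive_compound_split("Teenagern"))
    print(annotate(["bei", "Teenagern"]))
```

In this sketch the inclusion check runs before any compound analysis, so Teenagern is reduced to teenager and never reaches the splitter. A real system would need a trained classifier and full morphological analysis rather than fixed word lists.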
Multi-word English inclusions also pose difficulties for most MT systems. If the
system has not encountered a particular expression in its training data or its lexicon,
it is likely to treat the entire expression as unseen. However, MT systems are not
necessarily aware of the boundaries of such multi-word expressions, as illustrated in
10 This query appeared in CLEF 2004, where one of the tasks was to find answers to German
questions in English documents (Ahn et al., 2004).
11 Note that dte is not a typo but an error in the MT output.