PhD thesis - School of Informatics - University of Edinburgh
Chapter 6. Other Potential Applications<br />
clear strategy for making this decision. He argues that non-adapted anglicisms should<br />
be entered into general dictionaries or lexicons if they occur above a certain frequency<br />
in a large and balanced corpus. The English inclusion classifier would be a useful<br />
tool in this context. It could be constantly run over new documents, thereby allowing<br />
lexicographers to identify new loan words, possibly even trace them, and determine<br />
the frequency of a certain loan word over time. The English inclusion classifier can<br />
consequently make lexicographers aware of a language mixing phenomenon that they<br />
might otherwise miss during their corpus analysis. Equally, lexicographers could feed<br />
their knowledge back into the classifier as a way of improving its performance. In<br />
this way, the classifier would allow lexicographers to base their decisions to include<br />
a term in the dictionary on empirical facts, and, conversely, the lexicographers’<br />
knowledge could be exploited to increase the performance of the classifier.<br />
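The monitoring workflow described above (running a classifier over incoming documents, counting detected loan words per time period, and flagging those above a frequency threshold) can be sketched roughly as follows. This is only an illustration under stated assumptions: `detect_inclusions` is a hypothetical stand-in that uses a tiny hand-made word list rather than the actual English inclusion classifier, and the sentences, period labels and threshold are invented toy values.

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for the English inclusion classifier. A real system
# would call the classifier itself; this toy version only looks words up
# in a hand-made list (an assumption made for illustration).
TOY_ENGLISH_INCLUSIONS = {"download", "meeting", "software"}


def detect_inclusions(tokens):
    """Return the lower-cased tokens flagged as English inclusions."""
    return [t.lower() for t in tokens if t.lower() in TOY_ENGLISH_INCLUSIONS]


def frequency_per_period(documents):
    """Count inclusions and total tokens for each period label.

    `documents` is an iterable of (period, text) pairs.
    """
    counts = defaultdict(Counter)
    totals = Counter()
    for period, text in documents:
        tokens = text.split()  # naive whitespace tokenisation
        totals[period] += len(tokens)
        counts[period].update(detect_inclusions(tokens))
    return counts, totals


def candidates(counts, totals, period, min_per_million):
    """Inclusions whose relative frequency in `period` reaches the threshold."""
    total = totals[period]
    if total == 0:
        return []
    return sorted(
        word
        for word, n in counts[period].items()
        if n / total * 1_000_000 >= min_per_million
    )


# Invented toy corpus: (period label, text).
documents = [
    ("2005", "Das Meeting beginnt um zehn Uhr"),
    ("2006", "Der Download der Software dauert lange"),
    ("2006", "Nach dem Meeting startet der Download"),
]
counts, totals = frequency_per_period(documents)
flagged_2006 = candidates(counts, totals, "2006", min_per_million=100_000)
```

Normalising by the total token count per period keeps the figures comparable when the amount of text collected differs from one period to the next. The threshold itself would be a lexicographic decision and is not specified here.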
6.4 Chapter Summary<br />
This chapter described in detail the usefulness of English inclusion detection for various<br />
applications and fields, including TTS, MT, linguistics and lexicography. As<br />
with parsing, input to TTS and MT systems is generally assumed to be monolingual<br />
and so far there has been little focus on devising systems that are able to process<br />
mixed-lingual input sentences. In our increasingly globalised world where English is<br />
infiltrating many other languages, automatic natural language processing must be able to deal<br />
with such language mixing. The English inclusion classifier could be used in a<br />
pre-processing stage in order to signal where language changes occur. Further processing<br />
of English inclusions then depends on various synthesis or translation strategies for<br />
specific cases. This chapter reviewed previous work on deriving such strategies and<br />
presented some ideas for future work in terms of extrinsically evaluating the benefit of<br />
English inclusion detection for both applications.<br />
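As a rough illustration of the pre-processing idea summarised above, the sketch below groups per-token language labels into contiguous spans, so that a downstream TTS or MT component could see where the language changes. It is a minimal sketch under assumptions: `label_tokens` is a hypothetical placeholder based on a toy word list, not the classifier itself, and the language codes and example sentence are invented.

```python
from itertools import groupby

# Toy word list standing in for the classifier's decisions (an assumption).
TOY_ENGLISH = {"security", "update"}


def label_tokens(tokens, base_lang="de"):
    """Pair each token with a language label ("en" or the base language)."""
    return [(t, "en" if t.lower() in TOY_ENGLISH else base_lang) for t in tokens]


def language_spans(labelled):
    """Group consecutive tokens that share a label.

    Returns a list of (lang, start, end, tokens) tuples, where `start` is
    inclusive and `end` is exclusive, both as token indices.
    """
    spans = []
    index = 0
    for lang, group in groupby(labelled, key=lambda pair: pair[1]):
        words = [tok for tok, _ in group]
        spans.append((lang, index, index + len(words), words))
        index += len(words)
    return spans


# Invented example sentence.
sentence = "Das Security Update kommt morgen".split()
spans = language_spans(label_tokens(sentence))
```

Each span boundary marks a point at which a later component might switch strategy, for instance choosing a different pronunciation model or leaving a span untranslated. Which strategy applies is left open here.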
Regarding the fields of linguistics and lexicography, this chapter summarised the<br />
benefits of the English inclusion classifier as a tool for automating synchronic and<br />
diachronic language analysis. As such, the classifier could be beneficial to linguists who<br />
examine the frequency of certain expressions at a given point in time, or in different<br />
domains, and who track language changes over time. Moreover, it could be used to assist<br />
lexicographers in their decisions to include specific terms in lexicons or dictionaries.