PhD thesis - School of Informatics - University of Edinburgh
Chapter 1. Introduction
Chapter 3: Tracking English Inclusions in German describes an English inclusion classifier developed for mixed-lingual input text with German as the base language. It focuses initially on evaluation data preparation and annotation issues, subsequently providing a complete system description. The chapter also presents an evaluation of the English inclusion classifier and its components, as well as its performance on two unseen datasets. The results show that the classifier performs well on new data in different domains and compares well to another state-of-the-art mixed-lingual language identification approach. The penultimate section describes and discusses parameter tuning experiments conducted to determine the optimal settings for the classifier. Finally, the English inclusion classifier is compared to a supervised machine learner.
Chapter 4: System Extension to a New Language describes the adaptation of the classifier to process French text containing English inclusions. The aim of this chapter is to illustrate the ease with which the system can be adapted to deal with a new base language. The chapter first describes data preparation and then explains the work involved in extending various system modules. Finally, a detailed evaluation on unseen test data and a comparison of the classifier’s performance across languages are presented and discussed. The results show that the English inclusion classifier not only performs well on new data in different domains but also successfully fulfils its purpose in different language scenarios.
Chapter 5: Parsing English Inclusions concentrates on applying the techniques developed in the previous two chapters to a real-world task. This chapter presents a series of experiments on English inclusions and a set of random test suites using a treebank-induced and a hand-crafted rule-based German grammar parser. The aim here is to investigate the difficulty that state-of-the-art parsers have with sentences containing foreign inclusions, thereby determining the reasons for inaccuracy by means of error analysis and identifying appropriate ways of improving parsing performance. The ultimate goal of this chapter is to highlight the oft-forgotten issue of English inclusions to researchers in the parsing community and motivate them to identify ways of dealing with inclusions by demonstrating the potential gains in parsing quality.