05.03.2013 Views

PhD thesis - School of Informatics - University of Edinburgh



Chapter 1. Introduction

Chapter 3: Tracking English Inclusions in German describes an English inclusion classifier developed for mixed-lingual input text with German as the base language. It focuses initially on evaluation data preparation and annotation issues, subsequently providing a complete system description. The chapter also presents an evaluation of the English inclusion classifier and its components, as well as its performance on two unseen datasets. The results show that the classifier performs well on new data in different domains and compares well to another state-of-the-art mixed-lingual language identification approach. The penultimate section describes and discusses parameter tuning experiments conducted to determine the optimal settings for the classifier. Finally, the English inclusion classifier is compared to a supervised machine learner.

Chapter 4: System Extension to a New Language describes the adaptation of the classifier to process French text containing English inclusions. The aim of this chapter is to illustrate the ease with which the system can be adapted to deal with a new base language. The chapter first describes data preparation and then explains the work involved in extending various system modules. Finally, a detailed evaluation on unseen test data and a comparison of the classifier’s performance across languages are presented and discussed. The results show that the English inclusion classifier not only performs well on new data in different domains but also successfully fulfils its purpose in different language scenarios.

Chapter 5: Parsing English Inclusions concentrates on applying the techniques developed in the previous two chapters to a real-world task. This chapter presents a series of experiments on English inclusions and a set of random test suites using a treebank-induced and a hand-crafted rule-based German grammar parser. The aim here is to investigate the difficulty that state-of-the-art parsers have with sentences containing foreign inclusions, thereby determining the reasons for inaccuracy by means of error analysis and identifying appropriate ways of improving parsing performance. The ultimate goal of this chapter is to highlight the oft-forgotten issue of English inclusions to researchers in the parsing community and motivate them to identify ways of dealing with inclusions by demonstrating the potential gains in parsing quality.
