PhD thesis - School of Informatics - University of Edinburgh

Chapter 3. Tracking English Inclusions in German

Domain    Method       Accuracy  Precision  Recall   F-score
Internet  Baseline     93.95%    -          -        -
          Full system  98.25%    92.75%     77.37%   84.37
          TextCat      92.24%    33.57%     28.87%   31.04
Space     Baseline     96.99%    -          -        -
          Full system  99.45%    89.19%     93.61%   91.35
          TextCat      93.80%    20.73%     37.32%   26.66
EU        Baseline     99.69%    -          -        -
          Full system  99.78%    59.26%     76.19%   66.67
          TextCat      96.43%    2.54%      28.57%   4.66

Table 3.8: Performance of the English inclusion classifier compared to the baseline and the performance of TextCat.

In order to get an idea of how a conventional LID system performs on the task of recognising English inclusions embedded in German text, Table 3.8 also reports the performance of TextCat, an automatic LID tool based on the character n-gram frequency text categorisation algorithm proposed by Cavnar and Trenkle (1994) and reviewed in Section 2.2. While this LID tool requires no lexicons, its F-scores are low for the internet and space travel domains (31.04 and 26.66, respectively) and very poor for the EU data (4.66). This confirms that the identification of English inclusions is more difficult for the EU domain, which coincides with the result of the English inclusion classifier. The low scores also show that conventional n-gram-based language identification alone is unsuitable for token-based language classification, particularly in the case of closely related languages.

3.4.2 Evaluation of Individual System Modules

The full system described in Section 3.3 combines a lexicon lookup module, a search engine module and a post-processing module in order to classify English inclusions in German text. This section reports the performance of the individual system modules of the English inclusion classifier compared to that of the full system and to the baseline scores. It shows that the combination of the individual modules leads to a performance increase of the system on mixed-lingual data.
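As a quick check on the figures in Table 3.8, the F-scores can be recomputed from the reported precision and recall values. The sketch below assumes the balanced F-score (the harmonic mean of precision and recall); this assumption reproduces the reported values to within rounding, since the table lists precision and recall only to two decimal places. The function name `f_score` and the dictionary layout are illustrative choices, not part of the original system.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (harmonic mean) of precision and recall.

    Both inputs are percentages; the result is on the same 0-100 scale.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F-score) copied from Table 3.8.
table_3_8 = {
    ("Internet", "Full system"): (92.75, 77.37, 84.37),
    ("Internet", "TextCat"): (33.57, 28.87, 31.04),
    ("Space", "Full system"): (89.19, 93.61, 91.35),
    ("Space", "TextCat"): (20.73, 37.32, 26.66),
    ("EU", "Full system"): (59.26, 76.19, 66.67),
    ("EU", "TextCat"): (2.54, 28.57, 4.66),
}

for (domain, method), (p, r, reported) in table_3_8.items():
    recomputed = f_score(p, r)
    print(f"{domain:8s} {method:11s} recomputed={recomputed:.2f} reported={reported:.2f}")
```

Small differences in the last decimal place are expected, because the recomputation starts from already rounded precision and recall values.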
3.4.2.1 Evaluation of the Lexicon and Search Engine Modules

In the first experiment, the system is limited to the lexicon module described in detail in Section 3.3.3. Lexicon lookup is restricted to tokens with the POS tags NN, NE, FM, ADJA and ADJD. Post-processing and document consistency checking, as carried out in the full system and described in Sections 3.3.5 and 3.3.6, are not applied here. Therefore, ambiguous tokens found in neither or both databases are considered not to be of English origin by default. The assumption is that the lexicon module performs relatively well on known words contained in the lexicons but will disregard all tokens not found in the lexicons as potential English inclusions. Therefore, precision is expected to be higher than recall.

In the second experiment, the system is restricted to the search engine module only. Here, all tokens (with the POS tags NN, NE, FM, ADJA and ADJD) are classified by the search engine module based on the number of normalised hits returned for each language. Exact details on how this module functions are presented in Section 3.3.4. This experiment does not involve any post-processing either. As all queried tokens are treated as potential English inclusions, recall is expected to increase. Since some tokens are named entities, which are difficult to classify as being of a particular language origin, precision is likely to decrease.

As anticipated, recall scores are low for the lexicon-only evaluation across all domains (Internet: R=23.04%, Space: R=28.87%, EU: R=38.10%). These low scores are due to the considerable number of false negatives, i.e. English inclusions that do not occur in the lexicon (unknown words). Conversely, Table 3.9 shows higher precision values for the lexicon module across all three domains (Internet: P=90.57%, Space: P=77.78%, EU: P=47.06%).
In the search engine module evaluation, recall scores improve considerably, as expected (Internet: R=81.02%, Space: R=97.11%, EU: R=88.10%). On the other hand, this latter setup results in much lower precision scores (Internet: P=68.82%, Space: P=40.71%, EU: P=6.99%), which is partly due to the fact that Yahoo, like most search engines, is not sensitive to linguistic and orthographic information such as POS tags or case. For example, the German noun All (space) is classified as English because the search engine mistakes it for the English word “all”, which is much more commonly used on the internet than its German homograph. Interlingual homographs are therefore often wrongly classified as English when running the search engine module on its own.
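The contrast between the two single-module setups can be illustrated with a minimal sketch. It is not the implementation described in Sections 3.3.3 and 3.3.4: the function names, the toy lexicons, the hit counts and the corpus sizes are all invented for illustration, and dividing raw hits by an estimated per-language corpus size is only an assumed stand-in for the normalisation used in the thesis. Only the POS tag restriction, the "neither or both lexicons means not English" default and the case-insensitivity of the search engine are taken from the text.

```python
# POS tags for which tokens are classified at all (taken from the text).
CANDIDATE_TAGS = {"NN", "NE", "FM", "ADJA", "ADJD"}


def classify_lexicon_only(token, pos, german_words, english_words):
    """Lexicon-only setup: True if the token is classified as English."""
    if pos not in CANDIDATE_TAGS:
        return False
    in_german = token in german_words
    in_english = token in english_words
    # Tokens found in neither or in both lexicons are ambiguous and
    # default to "not English" because no post-processing is applied.
    return in_english and not in_german


def classify_search_only(token, pos, hit_counts, corpus_sizes):
    """Search-engine-only setup: True if the token is classified as English."""
    if pos not in CANDIDATE_TAGS:
        return False
    # Lower-casing mimics a search engine that ignores case.
    query = token.lower()
    # ASSUMPTION: hits are normalised by an estimated corpus size per language.
    normalised = {
        lang: hit_counts[lang].get(query, 0) / corpus_sizes[lang]
        for lang in ("de", "en")
    }
    return normalised["en"] > normalised["de"]


# Hypothetical toy data, invented for this example only.
german_words = {"All", "Rakete"}
english_words = {"all", "Shuttle"}
hit_counts = {
    "de": {"all": 2_000, "shuttle": 300, "crew": 150, "rakete": 900},
    "en": {"all": 900_000, "shuttle": 40_000, "crew": 60_000, "rakete": 20},
}
corpus_sizes = {"de": 1_000_000, "en": 10_000_000}

for token in ("All", "Shuttle", "Crew", "Rakete"):
    by_lexicon = classify_lexicon_only(token, "NN", german_words, english_words)
    by_search = classify_search_only(token, "NN", hit_counts, corpus_sizes)
    print(f"{token:8s} lexicon-only={by_lexicon!s:5s} search-only={by_search}")
```

With this toy data, the unknown word Crew is missed by the lexicon-only setup (a false negative, which lowers recall) but found by the search-engine-only setup, whereas the German noun All is wrongly labelled English by the search-engine-only setup (a false positive, which lowers precision).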