PhD thesis - School of Informatics - University of Edinburgh

More documents

Recommendations

Info

Chapter 4. System Extension to a New Language 111 POS tagger yields the best results for a set of German data sets in different domains (Section 3.5.1). TnT assigns STTS tags to German text (Schiller et al., 1995). The English inclusion classifier is set up to process any token in the German data with the tag: NN (common noun), NE (proper noun), ADJA or ADJD (attributive and adverbial or predicatively used adjectives) as well as FM (foreign material). The French data, on the other hand, is tagged with the TreeTagger whose POS tag set differs to STTS. Although it also differentiates between common nouns (NOM) and proper names (NAM), it only has one tag for adjectives (ADJ). Moreover, the French tag set contains an additional abbreviation tag (ABR). It does not, however, contain a separate tag for foreign material. Despite the fact that TnT is not very accurate in identifying foreign material in the German data, this additional information is likely to have a positive effect on the overall performance of the German system. The full system scores show that post-processing and document consistency checking improve the overall system performance considerably for both languages. For French, the individual post-processing rules are evaluated in more detail in the next section. Overall, Table 4.3 shows that the improvements are relatively similar on both the development set and the test set for each language. The full French system performs only 1.85 points lower in F-score on the test set (86.28 points) compared to the development set (88.13 points). The full system scores for German are also very similar. Within each language, the classifier therefore produces consistent results. Overall, the full French system performs slightly better than the German one. Ta- ble 4.3 illustrates that this difference is mainly due to the larger gap between recall and precision for the full German system. Even though the full German system performs better in precision than the French one, its recall is much lower, causing the overall F-score to drop. This discrepancy is due to language-specific post-processing differences as post-processing rules are generated on the basis of error analysis on the development data. However, comparing the results of the two systems is not entirely straightforward because they are not completely identical in parts of their components and the data sets are inherently not identical. Despite these differences, the fact that both systems yield high F-scores demonstrates that the underlying concept of identifying English inclusions in text can be applied to different language scenarios, at least those with Latin alphabets.
Chapter 4. System Extension to a New Language 112 4.4.2 Evaluation of the Post-Processing Module Table 4.4 presents lesion studies showing the individual contribution of different types of post-processing rules to the overall performance of the full French system on the development data. In this case, the full system does not include document consistency checking. A detailed description of the post-processing module design is given in Section 4.3.4. The results show that the biggest improvement is due to the post-processing of single-character tokens which are not classified by the core system. Switching off this type of post-processing leads the full system to perform 7.16 points lower in F-score. The second largest improvement in F-score is achieved by the post-processing rules dealing with ambiguous words, i.e. those that are classified as either French or English by the core system. Identifying the language of such tokens based on the language of their surrounding context helps to improve the overall performance by 5.73 points in F-score. Comparing the results of each of the lesion study experiments to the results of the full system, where all post-processing rules are applied, also shows that most post-processing rules are designed to improve recall. The only post-processing rule implemented to improve precision without deterio- rating recall is that for person names. It results in a smaller but nevertheless statistically significant increase (χ 2 : d f = 1, p ≤ 0.001) of 2.39 points in F-score. Although all remaining post-processing rules do not yield statistically significant performance improvements, none of the post-processing rules lead to a decrease in F-score as is illus- trated in the last column. In the final run of the full French system on the test data, the post-processing module results in a large performance increase of 11.13 points in F- score. Therefore, it can be concluded that the post-processing is designed well enough to apply to new data in the same domain and language. 4.4.3 Consistency Checking In order to guarantee consistent language classification within each document, additional consistency checking was implemented in the French system. This second classification run functions the same way as that implemented in the German system, de- scribed in Section 3.3.6. Tokens that are identified as English by the full system after
Page 1 and 2:
Automatic Detection of English Incl
Page 3 and 4:
these parsers with the annotation-f
Page 5 and 6:
Declaration I declare that this the
Page 7 and 8:
3.3.5 Post-processing Module . . .
Page 9 and 10:
A.2.2 Kappa Coefficient . . . . . .
Page 11 and 12:
5.6 Average relative token frequenc
Page 13 and 14:
3.16 Most frequent English inclusio
Page 15 and 16:
Chapter 1. Introduction 2 siderable
Page 17 and 18:
Chapter 1. Introduction 4 Chapter 3
Page 19 and 20:
Chapter 1. Introduction 6 1.1 Relat
Page 21 and 22:
Chapter 2. Background and Theory 8
Page 23 and 24:
Page 25 and 26:
Page 27 and 28:
Page 29 and 30:
Page 31 and 32:
Page 33 and 34:
Page 35 and 36:
Page 37 and 38:
Page 39 and 40:
Page 41 and 42:
Page 43 and 44:
Page 45 and 46:
Page 47 and 48:
Page 49 and 50:
Page 51 and 52:
Page 53 and 54:
Page 55 and 56:
Page 57 and 58:
Page 59 and 60:
Chapter 3 Tracking English Inclusio
Page 61 and 62:
Chapter 3. Tracking English Inclusi
Page 63 and 64:
Page 65 and 66:
Page 67 and 68:
Page 69 and 70:
Page 71 and 72:
Page 73 and 74: Chapter 3. Tracking English Inclusi
Page 113 and 114: Chapter 4 System Extension to a New
Page 115 and 116: Chapter 4. System Extension to a Ne
Page 123: Chapter 4. System Extension to a Ne
Page 129 and 130: Chapter 5 Parsing English Inclusion
Page 131 and 132: Chapter 5. Parsing English Inclusio
Page 159 and 160: Chapter 6 Other Potential Applicati
Page 161 and 162: Chapter 6. Other Potential Applicat
Page 175 and 176:
Chapter 6. Other Potential Applicat
Page 177 and 178:
Page 179 and 180:
Page 181 and 182:
Page 183 and 184:
Page 185 and 186:
Page 187 and 188:
Chapter 7 Conclusions and Future Wo
Page 189 and 190:
Chapter 7. Conclusions and Future W
Page 191 and 192:
Appendix A. Evaluation Metrics and
Page 193 and 194:
Page 195 and 196:
Page 197 and 198:
Page 199 and 200:
Appendix B. Guidelines for Annotati
Page 201 and 202:
Page 203 and 204:
Page 205 and 206:
Appendix C TIGER Tags and Labels C.
Page 207 and 208:
Appendix C. TIGER Tags and Labels 1
Page 209 and 210:
Appendix C. TIGER Tags and Labels 1
Page 211 and 212:
Bibliography 198 Andersen, G. (2005
Page 213 and 214:
Bibliography 200 Bresnan, J. (2001)
Page 215 and 216:
Bibliography 202 Damashek, M. (1995
Page 217 and 218:
Bibliography 204 Finkel, J., Dingar
Page 219 and 220:
Bibliography 206 Hachey, B., Alex,
Page 221 and 222:
Bibliography 208 Kirkness, A. (1984
Page 223 and 224:
Bibliography 210 and Technology (In
Page 225 and 226:
Bibliography 212 Poplack, S. (1988)
Page 227 and 228:
Bibliography 214 Sokol, D. K. (2000
Page 229:
Bibliography 216 Yang, W. (1990). A
show all

PhD thesis - School of Informatics - University of Edinburgh

Create successful ePaper yourself

Delete template?

Save as template?