extended - EEMCS EPrints Service - Universiteit Twente

Mannes Poel, Egwin Boschman and Rieks op den Akker

2.1.1 Coding of unknown words

One problem with relative frequencies (the prior distribution of tags) estimated from the training set is the occurrence of so-called unknown words: words that appear in the test set but not in the training set. One option to overcome this problem is to use equal priors, meaning that each component of the feature vector gets the value 1/72, since there are 72 possible tags. We used 10-fold cross validation on the training set to compute the relative frequencies and estimate prior probabilities of unknown words. The results can be found in Table 3. Given that 63 tags have a relative frequency of less than 2%, we decided not to take these tags into account. Hence every unknown word encountered during testing is coded by the normalized values determined by Table 3. It should be remarked that the tags N3 and N5 do not occur very often in the whole corpus (less than 2%), but they are important tags for unknown words.

Tag nr.   Tag    Rel. frequency      Tag nr.   Tag    Rel. frequency
1         N1     33.0%               15        WW7     2.5%
3         N3     14.0%               54        ADJ1    4.0%
5         N5     10.0%               62        ADJ9    2.4%
9         WW1     2.2%               72        SPEC   19.0%
12        WW4     2.3%

Table 3: The relative frequency (prior probability) of unknown words, based on a 10-fold cross validation. Only the relative frequencies with value at least 2% are listed.
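The coding step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the tag numbers and frequencies are taken from Table 3, and the function name `code_unknown_word` is a hypothetical helper.

```python
# Relative frequencies (priors) of the most likely tags for unknown
# words, taken from Table 3 (tag number -> relative frequency).
UNKNOWN_PRIORS = {
    1: 0.330,   # N1
    3: 0.140,   # N3
    5: 0.100,   # N5
    9: 0.022,   # WW1
    12: 0.023,  # WW4
    15: 0.025,  # WW7
    54: 0.040,  # ADJ1
    62: 0.024,  # ADJ9
    72: 0.190,  # SPEC
}

NUM_TAGS = 72  # size of the tagset, hence of the feature vector


def code_unknown_word():
    """Return the 72-component feature vector for an unknown word:
    the Table 3 priors renormalized so the components sum to 1,
    with every unlisted tag set to 0."""
    total = sum(UNKNOWN_PRIORS.values())
    vector = [0.0] * NUM_TAGS
    for tag_nr, freq in UNKNOWN_PRIORS.items():
        vector[tag_nr - 1] = freq / total  # tag numbers are 1-based
    return vector
```

Renormalizing is needed because the nine listed frequencies do not sum to 1 once the 63 rare tags are dropped.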

2.2 Determining the window size

From the research of Marques and Lopes [7] and Schmid [10] one can conclude that a window size of 3 (tags of) words back and 2 words ahead could be reasonable for the input. We constructed and validated 7 neural networks with different window sizes, all with a hidden layer of 50 neurons. The validation results can be found in Table 4. The average performances (over 3 runs) all vary around 96% on the validation set (set10). This set consists of 912,660 words (samples), which leads to an estimated 95% confidence interval of ±0.022%. Hence we can conclude that the 2 × 3, 3 × 2 and 4 × 3 windows (b × a means b words back and a words ahead) perform best.

Window size   2×2      2×3      3×2      3×3      3×4      4×3      4×4
run1         96.52    96.66    96.53    96.64    96.59    96.62    96.52
run2         96.54    96.51    96.64    96.57    96.53    96.56    96.55
run3         96.45    96.61    96.66    96.55    96.56    96.65    96.63
average      96.503   96.593   96.610   96.587   96.560   96.610   96.567

Table 4: The performance in % for different window sizes; b × a means a window size of b tags back and a tags ahead. Each neural network has a hidden layer of 50 neurons.
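The b × a windowing can be sketched as follows. This is an assumption-laden illustration: the paper does not spell out the exact input layout, so the sketch assumes each position contributes one tag-distribution vector, that the current word's own vector is included alongside the b back and a ahead positions, and that out-of-sentence positions are zero-padded; `make_windows` is a hypothetical helper.

```python
def make_windows(vectors, back=3, ahead=2):
    """Build one network input per word: the concatenation of the
    tag vectors of `back` words before it, the word itself, and
    `ahead` words after it. Positions outside the sentence are
    zero-padded (an assumption, not stated in the paper)."""
    dim = len(vectors[0])
    pad = [0.0] * dim
    padded = [pad] * back + list(vectors) + [pad] * ahead
    windows = []
    for i in range(len(vectors)):
        # word i sits at position i + back in `padded`;
        # its window covers positions i .. i + back + ahead
        window = padded[i : i + back + ahead + 1]
        windows.append([x for vec in window for x in vec])
    return windows
```

With 72-dimensional tag vectors, a 3 × 2 window thus yields (3 + 1 + 2) × 72 input components per word under these assumptions.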

2.3 Determining the size of the hidden layer

In order to determine the optimal number of hidden neurons we used the smallest window size, 3 × 2, from the previous subsection. Once again the different networks were trained 3 times on the union of set1, set2, set3 and set4 and validated on set10. The average results can be found in Table 5. From Table 5 we can conclude that the neural networks with 250 and 370 hidden neurons are the two best performing ones.
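The selection protocol used in this and the previous subsection (train each candidate setting 3 times, average the validation score on set10) can be sketched as follows; `train_network` and `evaluate` are hypothetical stand-ins for the authors' actual training and validation routines.

```python
def select_hidden_size(candidate_sizes, train_network, evaluate, runs=3):
    """For each candidate hidden-layer size, train `runs` networks
    and rank the sizes by their average validation performance
    (best first)."""
    averages = {}
    for size in candidate_sizes:
        scores = [evaluate(train_network(size)) for _ in range(runs)]
        averages[size] = sum(scores) / len(scores)
    return sorted(averages, key=averages.get, reverse=True)
```

Averaging over several runs matters here because, as the ±0.022% confidence interval in Section 2.2 shows, single-run differences between settings are of the same order as the run-to-run noise.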

2.4 Determining the optimal configuration

In order to determine the best configuration we combine the results of the previous two subsections: window size 2 × 3, 3 × 2 or 4 × 3, and 250 or 370 hidden neurons. This results in 6 different configurations,
