extended - EEMCS EPrints Service - Universiteit Twente

More documents

Recommendations

Info

Imitation and Mirror Neurons: An Evolutionary Robotics Model 253 3.1.3 Task and environment The task the adaptive agents have to master during their lifetime is the execution of correct actions for given states of the world they inhabit. The correct action for each state is not initially known to the adaptive agents; they can only infer it by observing the action that is being executed by another, non-evolving, agent and learning to link this action to the state the world is currently in. More specifically, the world the adaptive agents inhabit can be in two different possible world states, world state 0 and world state 1. Each world state has a different action associated with it, but which action is associated with which world state is not fixed. The two actions are walking (1) in north-east direction or (2) in south-west direction. The correct action for the current world state is always being executed by a demonstrator creature, co-inhabiting the world with the adaptive agent. The adaptive agent is equipped with two world state sensors with which it can sense the current state of the world. One sensor is responsive to world state 0, whereas the other responds to world state 1. When the actual world state is equal to the state the neuron is sensing for, the sensor’s output will be 1.0. When this is not the case, the sensor’s output will be 0.0. In addition to the world state sensors, the adaptive agents are equipped with two joint rotation sensors, sensing the rotation of certain joints in the demonstrator creature. More specifically, the rotation in the X-dimension is sensed for joints 1 and 9, respectively. These specifics were chosen after analysis of the demonstrator’s locomotion revealed that these variables were the only ones showing variation with time; in other words, all the other joints were not controlled by muscles and were completely stiff. X-rotation was found to always lie within the interval [− π 3π 2 , 2 ], and to make this compatible with the neural network, requiring activations in the range [0, 1], this value is scaled. The scaled output of the joint rotation sensors is given by: output = rotation + 2π 1 (9) 4 It should be noted that, through their sensors, the adaptive agents have no privileged access to any of the internal workings of the demonstrator; only to its external behavior. 3.2 The evolutionary process A population consisting of 20 adaptive agents, for which the properties of the adaptive network’s synapses were randomly initialized, was subjected to evolutionary optimization. To prevent a correct world-state-toaction mapping from being genetically ‘learned’, each genotype was evaluated during two lifetimes, one for each possible mapping. At the beginning of each lifetime, the weights were initialized according to the initial weight values encoded in the genotype, so no properties acquired during the first lifetime could be expressed in the second. Each lifetime took 60 000 time steps. The world state was changed every 2 000 time steps. The world state was visible to the agent in 70% of its lifetime and invisible in 30% of its lifetime. The demonstrator executing the proper action was visible to the agent in 67% of its lifetime and invisible in 33% of its lifetime. Furthermore, the world state and the demonstrator’s joint rotation were always visible in the first 12 000 time steps of the agent’s life, simulating an infancy phase. These values ensure a good mix of different conditions is experienced during each lifetime. During the evolutionary simulation of a single generation, each genotype is assigned an error score, rather than a fitness value. Fitness values are, for technical reasons, linearly computed from these error scores only at the end of each generation. Except during infancy, an agent’s error score is updated just before each world state change. The angle (in the range [0, 2π)) of the path traveled by the agent is computed, as well as the angle of the path traveled by the demonstrator creature. The squared difference between these values is added to the error score. When the world state is invisible, the error score is updated by instead adding half of the absolute distance traveled by the agent to it. Summarizing, the fitness function rewards agents that perform the correct action for a visible world state and do nothing at all when the world state is invisible. At the end of a generation, the next generation is created by making two exact replicas of the best genotype and selecting eighteen parent genotypes for reproduction according to a roulette wheel selection mechanism. Mutation occurs with a 60% chance; a mutated genotype will have a single property value changed to a random value within the valid range for that property type. The relatively high mutation chance of 60% ensures that enough genetic variation is introduced into the population, although offspring genotypes will still resemble their parent genotypes to a large degree, because each genotype has a large number of properties that can be mutated and, at most, only one of them is mutated during replication.
254 Eelke Spaak and Pim F.G. Haselager 3.3 Results 3.3.1 Fitness Figure 2: Fitness development over time. The evolutionary simulation was run until 256 generations had been evaluated. The development of the population’s average and maximum fitness value is shown in figure 2. A linear regression analysis was performed to estimate the effect of generation on average fitness. This effect was found to be significant (F (1, 254) = 165.201, p < .001) and strong (r 2 = .394). Average fitness increases with generation (standardized β = .628). Also, a linear regression analysis was performed to estimate the effect of generation on maximum fitness. This effect was found to be significant (F (1, 254) = 152.563, p < .001) and strong (r 2 = .375) as well. Maximum fitness increases with generation (standardized β = .613). These results indicate that the evolutionary simulation was successful; agents capable of imitative behavior have evolved. 3.3.2 Neurodynamics In order to determine whether or not neurons with mirror-like properties have evolved, the dynamics of the adaptive neural controller of the best individual of the last generation were analyzed. Interpreted in terms of a neural network working with activation values rather than action potentials, a ‘mirror neuron’ should be defined as a neuron showing the same activation pattern during execution of a particular action as during observation of that action. During an agent’s lifetime, it has to execute without observation when the world state is visible, but the demonstrator is invisible. Observation without execution occurs when the world state is invisible and the demonstrator is visible. Activation patterns of the seven neurons in the adaptive network, in both these conditions, for both world states, are shown in figure 3. To quantify the mirror neuron definition in terms of statistical analyses of activation patterns, a mirror neuron is defined as a neuron showing a very strong correlation (r > 0.85) between its activation pattern in the ‘observation, no execution’ condition and its activation pattern in the ‘execution, no observation’ condition. Furthermore, the regression line relating the two activation patterns is required to have an intercept close to zero (Bconstant < 0.05). Calculating these measures for all the neurons in both world states revealed that neuron 6 satisfies the mirror neuron criteria in world state 0 (r = 0.910, Bconstant = 0.033) and that neuron 1 satisfies the mirror neuron criteria in world state 1 (r = 0.885, Bconstant = 0.030). No other neurons were found to satisfy the criteria. It should be noted that neurons with a fixed activation pattern, i.e., unresponsive to input variation due to very low incoming connection weights, could evolve in an evolutionary simulation. These types of neurons would also satisfy the criteria specified above, but the correlation between the activation patterns for different world states would also exceed the threshold. The neurons we found that satisfied the criteria did not show an above-threshold correlation between activation patterns for different world states, so we believe the conclusion that the findings are not due to chance is justified. A more formal error rate analysis of the procedure described above would be welcome, but is beyond the scope of this paper. 4 Conclusion and discussion In the present model, artificial neurons emerged that have mirror-like properties. This shows that a selective pressure for imitative behavior can be sufficient for the evolution of mirror neurons. The primary function
Page 2 and 3:
BNAIC 2008 Belgian-Dutch Conference
Page 4 and 5:
Preface This book contains the proc
Page 6 and 7:
1 http://www.ctit.utwente.nl 2 http
Page 8 and 9:
Contents Full Papers Actor-Agent Ba
Page 10 and 11:
Extended Abstracts An Architecture
Page 12 and 13:
Single-Player Monte-Carlo Tree Sear
Page 14:
Full Papers BNAIC 2008
Page 17 and 18:
2 Erwin J.W. Abbink et al. been org
Page 19 and 20:
4 Erwin J.W. Abbink et al. the disr
Page 21 and 22:
6 Erwin J.W. Abbink et al. Phase 4:
Page 23 and 24:
8 Erwin J.W. Abbink et al. emergenc
Page 25 and 26:
10 Sander Bakkes, Pieter Spronck an
Page 27 and 28:
Page 29 and 30:
Page 31 and 32:
Page 33 and 34:
18 Maurice Bergsma and Pieter Spron
Page 35 and 36:
Page 37 and 38:
Page 39 and 40:
Page 41 and 42:
26 Guido Boella, Leendert van der T
Page 43 and 44:
Page 45 and 46:
Page 47 and 48:
Page 49 and 50:
34 Janneke H. Bolt and Linda C. van
Page 51 and 52:
Page 53 and 54:
Page 55 and 56:
Page 57 and 58:
42 Rogier Brussee and Christian War
Page 59 and 60:
Page 61 and 62:
Page 63 and 64:
Page 65 and 66:
50 Adam Cornelissen and Franc Groot
Page 67 and 68:
Page 69 and 70:
Page 72 and 73:
Hierarchical Planning and Learning
Page 74 and 75:
Page 76 and 77:
Page 78 and 79:
Page 80 and 81:
Mixed-Integer Bayesian Optimization
Page 82 and 83:
Page 84 and 85:
Page 86 and 87:
Page 88 and 89:
From Probabilistic Horn Logic to Ch
Page 90 and 91:
Page 92 and 93:
Page 94 and 95:
Page 96 and 97:
Visualizing Co-occurrence of Self-O
Page 98 and 99:
Page 100 and 101:
Page 102 and 103:
Page 104 and 105:
Linguistic Relevance in Modal Logic
Page 106 and 107:
Page 108 and 109:
Page 110 and 111:
Page 112 and 113:
Beating Cheating: Dealing with Coll
Page 114 and 115:
Page 116 and 117:
Page 118 and 119:
Page 120 and 121:
The Influence of Physical Appearanc
Page 122 and 123:
Page 124 and 125:
Page 126 and 127:
Page 128 and 129:
Discovering the Game in Auctions Mi
Page 130 and 131:
Discovering the Game in Auctions 11
Page 132 and 133:
Page 134 and 135:
Page 136 and 137:
Maximizing Classifier Yield for a g
Page 138 and 139:
Maximizing Classifier Utility for a
Page 140 and 141:
Page 142 and 143:
Page 144 and 145:
Stigmergic Landmarks Lead The Way N
Page 146 and 147:
Stigmergic Landmarks Lead the Way 1
Page 148 and 149:
Page 150 and 151:
Page 152 and 153:
Distribute the Selfish Ambitions Xi
Page 154 and 155:
Distribute the Selfish Ambitions 13
Page 156 and 157:
Page 158 and 159:
Page 160 and 161:
Closing the Information Loop Jan Wi
Page 162 and 163:
Closing the Information Loop. 147 L
Page 164 and 165:
Closing the Information Loop. 149 S
Page 166:
Closing the Information Loop. 151 F
Page 169 and 170:
154 Stefan A. van der Meer, Iris va
Page 171 and 172:
Page 173 and 174:
Page 175 and 176:
Page 177 and 178:
162 Matthijs Melissen Intuitively w
Page 179 and 180:
164 Matthijs Melissen Residuation r
Page 181 and 182:
166 Matthijs Melissen ai → bi f
Page 183 and 184:
168 Matthijs Melissen and is theref
Page 185 and 186:
170 Mihail Mihaylov, Ann Nowé and
Page 187 and 188:
Page 189 and 190:
Page 191 and 192:
Page 193 and 194:
178 James Mostert and Vera Hollink
Page 195 and 196:
Page 197 and 198:
Page 199 and 200:
Page 201 and 202:
186 Athanasios K. Noulas and Ben J.
Page 203 and 204:
Page 205 and 206:
Page 208 and 209:
Human Gesture Recognition using Spa
Page 210 and 211:
Page 212 and 213:
Page 214 and 215:
Page 216 and 217:
Determining Resource Needs of Auton
Page 218 and 219: Determining Resource Needs of Auton
Page 224 and 225: Categorizing Children Automated Tex
Page 226 and 227: Categorizing Children: Automated Te
Page 228 and 229: Categorizing Children: Automated Te
Page 230: Categorizing Children: Automated Te
Page 233 and 234: 218 Mannes Poel, Egwin Boschman and
Page 241 and 242: 226 Marc Ponsen et al. Therefore, t
Page 243 and 244: 228 Marc Ponsen et al. is playing t
Page 245 and 246: 230 Marc Ponsen et al. (a) (b) (c)
Page 247 and 248: 232 Marc Ponsen et al. on the switc
Page 249 and 250: 234 Steven Roebert, Tijn Schmits an
Page 257 and 258: 242 Maarten van Someren, Martin Poo
Page 265 and 266: 250 Eelke Spaak and Pim F.G. Hasela
Page 267: 252 Eelke Spaak and Pim F.G. Hasela
Page 271 and 272: 256 Eelke Spaak and Pim F.G. Hasela
Page 273 and 274: 258 Ivo Swartjes and Mariët Theune
Page 281 and 282: 266 Gerben K.D. de Vries, Véroniqu
Page 289 and 290: 274 Arlette van Wissen, Jurriaan va
Page 298 and 299: An Architecture for Peer-to-peer Re
Page 300 and 301: Enhancing the Performance of Maximu
Page 302 and 303: Modeling the Dynamics of Mood and D
Page 304 and 305: A Tractable Hybrid DDN-POMDP approa
Page 306 and 307: An Algorithm for Semi-Stable Semant
Page 308 and 309: Towards an Argument Game for Stable
Page 310 and 311: Temporal Extrapolation within a Sta
Page 312 and 313: Approximating Pareto Fronts by Maxi
Page 314 and 315: A Probabilistic Model for Generatin
Page 316 and 317: Self-organizing Mobile Surveillance
Page 318 and 319:
Engineering Large-scale Distributed
Page 320 and 321:
A Cognitive Model for the Generatio
Page 322 and 323:
Opponent Modelling in Automated Mul
Page 324 and 325:
Exploring Heuristic Action Selectio
Page 326 and 327:
Individualism and Collectivism in T
Page 328 and 329:
Agents Preferences in Decentralized
Page 330 and 331:
Agent-based Patient Admission Sched
Page 332 and 333:
An Empirical Study of Instance-base
Page 334 and 335:
The Importance of Link Evidence in
Page 336 and 337:
Evolutionary Dynamics for Designing
Page 338 and 339:
Combining Expert Advice Efficiently
Page 340 and 341:
Paying Attention to Symmetry Gert K
Page 342 and 343:
Of Mechanism Design and Multiagent
Page 344 and 345:
Metrics for Mining Multisets 1 Walt
Page 346 and 347:
A Hybrid Approach to Sign Language
Page 348 and 349:
Improved Situation Awareness for Pu
Page 350 and 351:
Authorship Attribution and Verifica
Page 352 and 353:
Agent Performance in Vehicle Routin
Page 354 and 355:
Design and Validation of HABTA: Hum
Page 356 and 357:
Improving People Search Using Query
Page 358 and 359:
The tOWL Temporal Web Ontology Lang
Page 360 and 361:
A Priced Options Mechanism to Solve
Page 362 and 363:
Autonomous Scheduling with Unbounde
Page 364 and 365:
1 Introduction Don’t Give Yoursel
Page 366 and 367:
Audiovisual Laughter Detection Base
Page 368 and 369:
P 3 C: A New Algorithm for the Simp
Page 370 and 371:
OperA and Brahms: a symphony? 1 Int
Page 372 and 373:
Monitoring and Reputation Mechanism
Page 374 and 375:
Subjective Machine Classifiers Denn
Page 376 and 377:
Single-Player Monte-Carlo Tree Sear
Page 378 and 379:
Mental State Abduction of BDI-Based
Page 380 and 381:
Decentralized Performance-aware Rec
Page 382 and 383:
Combined Support Vector Machines an
Page 384 and 385:
Reconfiguration Management of Crisi
Page 386 and 387:
Decentralized Online Scheduling of
Page 388 and 389:
Polynomial Distinguishability of Ti
Page 390 and 391:
Decentralized Learning in Markov Ga
Page 392 and 393:
Organized Anonymous Agents ∗ Mart
Page 394 and 395:
Topic Detection by Clustering Keywo
Page 396 and 397:
Modeling Agent Adaptation in Games
Page 398 and 399:
Monte-Carlo Tree Search Solver 1 Ma
Page 400:
Demonstrations BNAIC 2008
Page 403 and 404:
388 Joost Batenburg and Walter Kost
Page 405 and 406:
390 Guillaume Chaslot, Sander Bakke
Page 407 and 408:
392 Dennis Hofs, Mariët Theune and
Page 409 and 410:
394 François L.A. Knoppel, Almer S
Page 411 and 412:
396 Dirkjan Krijnders and Tjeerd An
Page 413 and 414:
398 Joris Maervoet, Patrick De Caus
Page 415 and 416:
400 Thomas Mensink and Jakob Verbee
Page 417 and 418:
402 Daniel Okouya and Virginia Dign
Page 419 and 420:
404 Roeland Ordelman et al. After t
Page 421 and 422:
406 Dennis Reidsma and Anton Nijhol
Page 423 and 424:
408 Michel F. Valstar, Simon Colton
Page 425 and 426:
410 Tim Verwaart and John Wolters 2
Page 427 and 428:
K Kaisers, Michael . . . . . . . .
show all

extended - EEMCS EPrints Service - Universiteit Twente

Create successful ePaper yourself

Delete template?

Save as template?