TESIS DOCTORAL - Robotics Lab - Universidad Carlos III de Madrid


Chapter 9. Experimental Results

great improvement in the average values. Nevertheless, the impact on the number of times the robot is damaged is outstanding.

Consequently, the most relevant result of using fear is related to the damage caused by Alvaro to the robot when it "lives" according to the learned policy of behavior. When fear is not implemented, the robot tries to interact with both users in order to satisfy its social need. As a result, Maggie is sometimes harmed by Alvaro because it has not learned to identify that being next to Alvaro is dangerous and, consequently, it has not learned an avoidance behavior. As depicted in Table 9.4, this happens six times out of twenty-three interactions between Maggie and Alvaro. Since damage heavily affects the social drive, it greatly affects the wellbeing results. For this reason, although the average wellbeing is better without fear, the rest of the performance indicators are disturbed when Maggie is damaged and, as a result, their values are worse.

Now, considering fear as a motivation in the system, once the presence of Alvaro is identified as dangerous, the robot does not interact with Alvaro at all, so he cannot hurt it. This is because, as shown in previous sections, the robot learned to avoid the interaction with Alvaro. Focusing again on Table 9.4, by means of fear, the dangerous situations are totally averted. In fact, the robot was never damaged when fear was implemented. Therefore, fear improves the performance of the robot since it provides a safety mechanism to avoid situations where the robot can be damaged.

Table 9.4: Harm/interactions with Alvaro during the exploiting sessions

without fear    with fear
6/23            0/0

In conclusion, although the average wellbeing is slightly worse, fear provides significant benefits.
In particular, harm is totally avoided.

9.3 Learning behaviors

As presented in Section 4.4.1, happiness and sadness are artificial emotions arising from the variation of the robot's wellbeing. They are used as the reward function during the learning of the policy of behavior. Therefore, the robot's behavior in all circumstances is oriented towards increasing its wellbeing.

The robot Maggie has been learning in sessions which last more than seven hours in the laboratory. In this section, the learned behaviors are analyzed. During the learning, the robot has learned how to act according to its state (internal and external). As explained in Section 6.2.1, the internal state corresponds to the dominant motivation, and the external state is related to the different objects. Through learning, stable chains of actions have been formed and they can be considered as patterns of behavior corresponding to the motivations. In this section, the learned behaviors are presented independently, motivation by motivation.

The behaviors exhibited when fear is the dominant motivation have already been shown in Sections 9.2.2 and 9.2.3. Therefore, they are not included again in this section. Moreover, the reaction of the robot when there is no dominant motivation is also analyzed in the last part.

9.3.1 The survival motivation. How do I get my batteries recharged?

Figure 9.4 displays the Q values related to all the objects in the robot's world when survival is the dominant motivation. This means that the need for energy is high. The best action, that is, the action with the highest Q value, is charge, which is responsible for fully recharging the batteries. Consequently, the required energy is obtained and, after this action has finished, the energy drive is satiated. Hence, this action is the most likely to be executed. It is the consummatory action for the survival motivation.

The value of the go to player action is very high too because the next best action is the charge action. This action is executed when the robot is unplugged and far from the docking station. This situation results after the execution of the go to player action.

It is worth mentioning why remaining plugged is not a good strategy in this situation, although it may seem a contradiction. Since the remain action can only be executed when the robot is plugged, and this happens after the charge action, the robot's battery is likely full. Consequently, remain does not contribute anything because survival will not be the dominant motivation in that situation, so the robot's wellbeing does not increase. Moreover, the amount of time this action lasts is not enough for a significant contribution to the level of energy.
Concurrently, other drives increase a bit and therefore the variation of the robot's wellbeing is negative. Hence, the value of this action is not good. In fact, remain has been executed when survival is the dominant motivation only when, due to the Well-balanced Exploration mechanism (Section 6.3.1), energy has been artificially saturated and Maggie was plugged.

The rest of the actions are slightly positive because they provide small benefits to drives other than the energy drive, which is the one related to the survival motivation. The actions that reduce the energy drive have the highest values.

9.3.2 The fun motivation. Let's enjoy!

In this case, the dominant motivation is fun. Hence, the robot needs to satisfy the need for entertainment through the dance action (the consummatory action), which is the best action (Figure 9.5). For dancing, music must be on, so play music is the second-best action due to the collateral effects of this action. Moreover, the idle action when the music is off and the robot is close to the cd player is good too because the next best action with the cd player is to
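
As a rough illustration of the reasoning in Section 9.3.1, the following Python sketch uses the variation of wellbeing as the reward and applies a standard tabular Q-learning update. It is not the Object Q-Learning algorithm of the thesis. The wellbeing formula, the drive values, the state names, the learning rate and the discount factor are all assumptions chosen only for illustration.

```python
from collections import defaultdict

ALPHA, GAMMA = 0.3, 0.8   # learning rate and discount factor (assumed values)
IDEAL_WELLBEING = 100.0   # assumed constant, not taken from the thesis


def wellbeing(drives):
    # Assumed form: wellbeing falls as the drives (needs) grow.
    return IDEAL_WELLBEING - sum(drives.values())


def reward(before, after):
    # Happiness/sadness used as reward: the variation of wellbeing.
    return wellbeing(after) - wellbeing(before)


def q_update(q, state, action, r, next_state, next_actions):
    # Standard tabular Q-learning step (a simplification of the thesis's method).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    q[(state, action)] += ALPHA * (r + GAMMA * best_next - q[(state, action)])


def learn_survival(episodes=200):
    q = defaultdict(float)
    # Hypothetical drive levels while survival is the dominant motivation.
    low_battery = {"energy": 10.0, "social": 2.0, "fun": 1.0}
    full = {"energy": 0.0, "social": 2.0, "fun": 1.0}
    drifted = {"energy": 0.0, "social": 2.25, "fun": 1.25}
    for _ in range(episodes):
        # go to player: wellbeing is unchanged, but charge is available next.
        q_update(q, "start", "go_to_player",
                 reward(low_battery, low_battery), "unplugged_far", ["charge"])
        # charge: the energy drive is satiated, so wellbeing rises.
        q_update(q, "unplugged_far", "charge",
                 reward(low_battery, full), "plugged", [])
        # remain: the battery is already full while other drives grow a bit.
        q_update(q, "plugged", "remain",
                 reward(full, drifted), "plugged", [])
    return q


q = learn_survival()
```

Under these assumed numbers, charge ends with the highest value, go_to_player inherits a discounted share of it because charge is the best action afterwards, and remain ends slightly negative because the variation of wellbeing is negative.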
