Learning to Improve Agent Behaviours in GOAL

Dhirendra Singh and Koen V. Hindriks
Interactive Intelligence Group, Delft University of Technology, The Netherlands

Abstract. This paper investigates the issue of adaptability of behaviour in the context of agent-oriented programming. We focus on improving action selection in rule-based agent programming languages using a reinforcement learning mechanism under the hood. The novelty is that learning utilises the existing mental state representation of the agent, which means that (i) the programming model is unchanged and using learning within the program becomes straightforward, and (ii) adaptive behaviours can be combined with regular behaviours in a modular way. Overall, the key to effective programming in this setting is to balance between constraining behaviour using operational knowledge and leaving flexibility to allow for ongoing adaptation. We illustrate this using different types of programs for solving the Blocks World problem.

Keywords: Agent programming, rule selection, reinforcement learning

1 Introduction

Belief-Desire-Intention (BDI) [1] is a practical and popular cognitive framework for implementing practical reasoning in computer programs that has inspired many agent programming languages such as AgentSpeak(L) [2], JACK [3], Jason [4], Jadex [5], CANPLAN [6], 3APL [7], 2APL [8], and Goal [9], to name a few. Despite its success, an important drawback of the BDI model is the lack of a learning ability: once deployed, BDI agents have no capacity to adapt and improve their behaviour over time. In this paper, we address this issue in the context of BDI-like rule-based agent programming languages. In particular, we extend the Goal agent programming language [9] for practical systems [10, 11] with a new language primitive that supports adaptive modules, i.e., modules within which action choices resulting from programmed rules are learnt over time. While we have chosen Goal for this study, our approach applies generally to other rule-based programming languages. We use an off-the-shelf reinforcement learning [12] mechanism under the hood to implement this functionality.

Our aim is to allow agent developers to easily program adaptive behaviours using a programming model that they are already familiar with, and without having to explicitly delve into machine learning technologies. Our idea is to leverage the domain knowledge encoded into the agent program, by directly using the mental state representation of the agent for learning purposes. This has the key benefits that (i) the programmer need not worry about knowledge representation for learning as a separate issue from programming; (ii) the programming model
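To make the idea of an adaptive module concrete, the sketch below shows what a Blocks World module of this kind might look like in Goal. It is illustrative only: the order=adaptive module option and the two rules are assumptions made for exposition here, not necessarily the paper's definitive notation. The intended reading is that both rules may be applicable in the same mental state, and rather than always firing a fixed choice, the module learns over time which rule instantiation tends to lead to better outcomes.

    module stackBlocks [order=adaptive] {
      program {
        % Constructive move: put X on Y when the tower below Y is already in place,
        % so that X itself ends up in its goal position.
        if a-goal(tower([X,Y|T])), bel(tower([Y|T])) then move(X,Y).
        % Fallback: move a misplaced block X to the table to clear it.
        if a-goal(tower([X|T])) then move(X,table).
      }
    }

Under the hood, a reinforcement learner can treat the agent's current mental state as the learning state and the applicable rule instantiations as the available actions, updating the value of each state-rule pair with a standard off-the-shelf method such as Q-learning [12]; no knowledge representation separate from the agent program is needed.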
