2.1 Agent Programming Languages

Agent programming languages (APLs) based on the BDI paradigm are rule-based languages [15, 16]. Rules may serve various purposes but are used, among others, to select the actions or plans that an agent will perform. An agent program may perform built-in actions that are provided as programming constructs that are part of the language itself, or it may perform actions that are available in an environment that the agent is connected to. Environment actions give the agent some control over the changes that occur in that environment. The types of rules that are used in APLs vary. Generally speaking, however, rules have a condition that is evaluated on the agent's mental state (rule head) and a corresponding action or plan that is instantiated if the rule fires (rule body).

Rules are evaluated and applied in a reasoning cycle that is part of the agent program's interpreter. Agent interpreters for APLs implement a sense-plan-act cycle or a variant thereof. In a typical interpreter, for example, the percepts received from an environment are processed by some predefined mechanism. Most often this is an automatic mechanism (that may be customisable, as in Jason [4]), but not necessarily; in Goal, for example, so-called percept rules are available for processing incoming percepts, and similar rules are available for processing messages received from other agents (2APL has similar rules [8]). During this stage, either before or after processing the percepts, the messages received from other agents are typically processed. These steps are usually performed first to ensure the mental state of the agent is up to date. Thereafter, the interpreter evaluates rules against the updated mental state and selects the applicable ones. After determining which rules are applicable, one or more of these rules is fired, resulting in one or more options to perform an action or to add a plan to a plan base. Some selection mechanism (which again may be customised, e.g. in Jason) is then used to arbitrate between these multiple options, or, as is the case in for example Goal, a choice is made randomly. Finally, an action is executed, either internally or by sending it to the environment for execution.

One aspect of these interpreters is that they may generate multiple applicable rules and options for performing actions. If multiple options are available, then the agent program is underspecified in the sense that it does not determine a unique choice of action. It is this feature of agent architectures that we will exploit: it can be used by a learning mechanism to optimise the agent's choice of action [17, 18].

2.2 Reinforcement Learning

Reinforcement learning [12] is a formal framework for optimally solving multistage decision problems in environments where outcomes are only partly attributable to decision making by the agent and are partly stochastic. The general idea is to describe the value of a decision problem at a given time step in terms of the payoffs received from choices made so far and the value of the remaining problem that results from those initial choices. Formally, at each time step t in a multistep problem, the agent perceives the state s ∈ S of the environment and selects an action a ∈ A to perform.
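To make the interaction between rule selection and learning concrete, the sketch below shows a toy reasoning cycle in which several rules are applicable to the same mental state, and a one-step Q-learning update over (state, rule) pairs is used to arbitrate among them. This is a minimal illustration only, under invented assumptions: the names Rule, ToyEnv, LearningAgent, execute, etc. do not correspond to the actual constructs of Goal, Jason, or 2APL, nor to the specific learning mechanism of [17, 18].

```python
import random
from collections import defaultdict

# Minimal sketch only: a toy rule-based reasoning cycle in which several rules
# can be applicable to the same mental state (the program is underspecified),
# with a one-step Q-learning update used to arbitrate among the applicable
# rules. All names here are invented for illustration and do not reflect the
# APIs of Goal, Jason, or 2APL.

class Rule:
    def __init__(self, name, condition, action):
        self.name = name            # label used to index Q-values
        self.condition = condition  # rule head: predicate over the mental state
        self.action = action        # rule body: action sent to the environment

class ToyEnv:
    """A two-state environment: acting 'well' in state 'A' pays off."""
    def execute(self, action, state):
        if state == "A" and action == "good":
            return "B", 1.0   # next state, reward
        return "A", 0.0

class LearningAgent:
    def __init__(self, rules, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.rules = rules
        self.q = defaultdict(float)  # Q-values indexed by (state, rule name)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def applicable(self, state):
        # Evaluate each rule head against the current (here: symbolic) state.
        return [r for r in self.rules if r.condition(state)]

    def select(self, state, options):
        # Epsilon-greedy arbitration replaces a purely random choice
        # among the applicable rules.
        if random.random() < self.epsilon:
            return random.choice(options)
        return max(options, key=lambda r: self.q[(state, r.name)])

    def step(self, env, state):
        options = self.applicable(state)
        if not options:
            return state
        rule = self.select(state, options)
        next_state, reward = env.execute(rule.action, state)
        # One-step Q-learning update over (state, rule) pairs.
        best_next = max((self.q[(next_state, r.name)]
                         for r in self.applicable(next_state)), default=0.0)
        key = (state, rule.name)
        self.q[key] += self.alpha * (reward + self.gamma * best_next - self.q[key])
        return next_state

if __name__ == "__main__":
    rules = [
        Rule("r1", lambda s: s == "A", "good"),  # r1 and r2 are both applicable
        Rule("r2", lambda s: s == "A", "bad"),   # in state "A": underspecified
        Rule("r3", lambda s: s == "B", "reset"),
    ]
    agent, env, state = LearningAgent(rules), ToyEnv(), "A"
    for _ in range(1000):
        state = agent.step(env, state)
    print(agent.q)  # r1 should accumulate a higher value than r2 in state "A"
```

The only point of the sketch is that, wherever the interpreter would otherwise choose arbitrarily among applicable rules, a value estimate learned from rewards can inform that choice without altering the rest of the reasoning cycle.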
