2.1 Agent Programming Languages

Agent programming languages (APLs) based on the BDI paradigm are rule-based languages [15, 16]. Rules may serve various purposes but are used, among others, to select the actions or plans that an agent will perform. An agent program may perform built-in actions that are provided as programming constructs that are part of the language itself, or it may perform actions that are available in an environment that the agent is connected to. Environment actions give the agent some control over the changes that occur in that environment. The types of rules that are used in APLs vary. Generally speaking, however, rules have a condition that is evaluated on the agent's mental state (rule head) and a corresponding action or plan that is instantiated if the rule fires (rule body).

Rules are evaluated and applied in a reasoning cycle that is part of the agent program's interpreter. Agent interpreters for APLs implement a sense-plan-act cycle or a variant thereof. In a typical interpreter, for example, the percepts received from an environment are processed by some predefined mechanism. Most often this is an automatic mechanism (that may be customisable, as in Jason [4]), but not necessarily; in Goal, for example, so-called percept rules are available for processing incoming percepts, and similar rules are available for processing messages received from other agents (2APL has similar rules [8]). During this stage, either before or after processing the percepts, the messages received from other agents are typically processed. These steps are usually performed first to ensure the mental state of the agent is up to date. Thereafter, the interpreter evaluates rules against the updated mental state and selects the applicable ones. After determining which rules are applicable, one or more of these rules is fired, resulting in one or more options to perform an action or to add a plan to a plan base. Some selection mechanism (which again may be customised, e.g. in Jason) is then used to arbitrate between these multiple options, or, as is the case in for example Goal, a choice is made randomly. Finally, an action is executed, either internally or by sending it to the environment for execution.

One aspect of these interpreters is that they may generate multiple applicable rules and options for performing actions. If multiple options are available, then the agent program is underspecified in the sense that it does not determine a unique choice of action. It is this feature of agent architectures that we will exploit: it can be used by a learning mechanism to optimise the agent's choice of action [17, 18].

2.2 Reinforcement Learning

Reinforcement learning [12] is a formal framework for optimally solving multistage decision problems in environments where outcomes are only partly attributable to decision making by the agent and are partly stochastic. The general idea is to describe the value of a decision problem at a given time step in terms of the payoffs received from choices made so far and the value of the remaining problem that results from those initial choices. Formally, at each time step t in a multistep problem, the agent perceives the state s ∈ S of the environment and selects an action a ∈ A to perform.
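To make the interaction between rule selection and learning concrete, the sketch below shows a toy reasoning cycle in which several rules are applicable to the same mental state, and a one-step Q-learning update over (state, rule) pairs is used to arbitrate among them. This is a minimal illustration only, under invented assumptions: the names Rule, ToyEnv, LearningAgent, execute, etc. do not correspond to the actual constructs of Goal, Jason, or 2APL, nor to the specific learning mechanism of [17, 18].

```python
import random
from collections import defaultdict

# Minimal sketch only: a toy rule-based reasoning cycle in which several rules
# can be applicable to the same mental state (the program is underspecified),
# with a one-step Q-learning update used to arbitrate among the applicable
# rules. All names here are invented for illustration and do not reflect the
# APIs of Goal, Jason, or 2APL.

class Rule:
    def __init__(self, name, condition, action):
        self.name = name            # label used to index Q-values
        self.condition = condition  # rule head: predicate over the mental state
        self.action = action        # rule body: action sent to the environment

class ToyEnv:
    """A two-state environment: acting 'well' in state 'A' pays off."""
    def execute(self, action, state):
        if state == "A" and action == "good":
            return "B", 1.0   # next state, reward
        return "A", 0.0

class LearningAgent:
    def __init__(self, rules, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.rules = rules
        self.q = defaultdict(float)  # Q-values indexed by (state, rule name)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def applicable(self, state):
        # Evaluate each rule head against the current (here: symbolic) state.
        return [r for r in self.rules if r.condition(state)]

    def select(self, state, options):
        # Epsilon-greedy arbitration replaces a purely random choice
        # among the applicable rules.
        if random.random() < self.epsilon:
            return random.choice(options)
        return max(options, key=lambda r: self.q[(state, r.name)])

    def step(self, env, state):
        options = self.applicable(state)
        if not options:
            return state
        rule = self.select(state, options)
        next_state, reward = env.execute(rule.action, state)
        # One-step Q-learning update over (state, rule) pairs.
        best_next = max((self.q[(next_state, r.name)]
                         for r in self.applicable(next_state)), default=0.0)
        key = (state, rule.name)
        self.q[key] += self.alpha * (reward + self.gamma * best_next - self.q[key])
        return next_state

if __name__ == "__main__":
    rules = [
        Rule("r1", lambda s: s == "A", "good"),  # r1 and r2 are both applicable
        Rule("r2", lambda s: s == "A", "bad"),   # in state "A": underspecified
        Rule("r3", lambda s: s == "B", "reset"),
    ]
    agent, env, state = LearningAgent(rules), ToyEnv(), "A"
    for _ in range(1000):
        state = agent.step(env, state)
    print(agent.q)  # r1 should accumulate a higher value than r2 in state "A"
```

The only point of the sketch is that, wherever the interpreter would otherwise choose arbitrarily among applicable rules, a value estimate learned from rewards can inform that choice without altering the rest of the reasoning cycle.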
