mance. Figure 2d shows the same for A on problems with six blocks using adaptive ordering. We have not included results for the original program since it takes over 30000 moves on average per problem. For adaptive mode, this number improves to about 60 moves by the 100th episode, and progressively to around 12 moves by 4000 episodes. This gets us close to the baseline of 2(n − √n) = 7.1, but not quite there. It would be possible to improve further if the program were allowed to run for more episodes, but the improvement would occur very slowly. We also did not run this program on problems with more than six blocks, as solving larger problems becomes impractical with this strategy.

Program B: Figure 2b shows the performance of the original B for problems with four blocks, at around 11 moves. The performance is already reasonable to start with, as B is a more informed programmed strategy than A. With adaptive ordering, the performance improves to around five moves per problem by 100 episodes; this is on par with the performance of A at 2000 episodes. At the end of the experiment, the program performs slightly above 4.5 moves and is close to optimal. For six blocks, the original program averages around 28 moves per problem, as shown in Figure 2e. In adaptive mode, the move count improves to around 58 moves by 100 episodes, and to around 10 moves by the end of the experiment. This is higher than the baseline of 7.1 moves, but slightly better than the adaptive performance of A, which averages around 12 moves in the same timeframe. Overall, B performs far better than A due to its informed strategy, and this also translates into faster and better learning.

Program C: In contrast to the other programs, C is already known to perform close to optimal, and achieves around 4.5 moves on average per problem with four blocks, as shown in Figure 2c. With adaptive ordering, this does not seem to improve over the 2000 episodes for which we ran the experiment. This is expected, since the program already performs close to optimal. Interestingly, however, we know from previous studies that C does not perform optimally in certain “deadlock” cases. We had hoped to overcome this through learning, but this is not evident from the averaged results, as there is no significant difference in performance with and without learning. Importantly, for six blocks we see, for the first time in these experiments, that adaptive ordering actually performs worse than the original program (Figure 2f), albeit by only 0.25 moves per problem on average at its worst. On closer analysis this seems to be because we simply have not run the experiment long enough; the difference between the two modes of execution diminishes as the experiment progresses, as is evident in Figure 2f. We should note that, regardless, the performance of C with or without learning is significantly better than that of the other programs, at around 8 moves, and only slightly higher than the baseline case of 2(n − √n) = 7.1. Overall, we can conclude that C is already very informed about the domain, so learning is not very useful in this case.
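For concreteness, the 2(n − √n) baseline used above is easy to evaluate for the two problem sizes in these experiments (this is simply the arithmetic behind the reference values already quoted, with n the number of blocks):

For n = 4: 2(n − √n) = 2(4 − √4) = 4 moves.
For n = 6: 2(n − √n) = 2(6 − √6) ≈ 7.1 moves.

This is why roughly 4.5 moves per problem on the four-block problems is described as close to optimal, and why 7.1 moves is the reference point for the six-block problems.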
Interestingly, in all experiments, adaptive mode does not do any worse than the default behaviour. This is a useful insight for agent programmers who may otherwise feel reluctant to try a “black box” technology that directly impacts the performance of the agent but that they do not really understand. Another important point is that the performance improvement with adaptive mode is
