23.08.2015 Views

Here - Agents Lab - University of Nottingham


5 Experiments

Here we describe the Blocks World domain that we used as a testbed for our experiments, and then the three different programs to solve it. We analyse the results quantitatively in terms of the average number of steps taken by the agent to achieve its goal, as well as qualitatively in terms of how the design of the program impacts learning performance.

We have chosen the Blocks World domain for our experiments for several reasons. First, the domain is simple to understand and programming strategies are easy to describe and compare at a conceptual level. Second, despite its simplicity, finding optimal solutions in this domain is known to be an NP-hard problem [34]. Finally, decisions in this domain often involve choosing between several options that could potentially be optimised using learning.

There are various ways of programming a strategy for solving the Blocks World. For example, one way would be to dismantle all blocks onto the table one by one, and then stack them into the desired configuration from there. This is in fact a reasonable "baseline" strategy because it is easy to see that the upper bound for the number of steps needed to solve a problem with n blocks is 2(n − 1), which is the case when one must dismantle a single tower (which takes n − 1 moves for a tower of height n) to construct a different single tower (that takes another n − 1 moves). The average number of steps for this algorithm is less intuitive but has been shown to be 2(n − √n) [33]. For this work, we will compare three other solutions to the problem, and see how they compare amongst themselves and against this baseline strategy.

Program A. A very simple strategy for solving the Blocks World is to randomly select some block that is clear and move it to some randomly chosen place on top of another object.
Effectively, this strategy tries to achieve the final configuration by randomly changing the current configuration for as long as needed until it eventually stumbles upon the solution. This strategy is given by the program listing in Figure 1, and is contained in the following code segment:

    main module {
      program[order=random] {
        if bel(true) then move(X,Y).
      }
    }

This is certainly not the most effective way to solve the problem, and while it works reasonably well for small problems of two to four blocks, it quickly becomes unusable beyond six blocks. Nevertheless it is useful for this study since we are interested in improving action selection using learning, and one would imagine there is a lot of room for improvement in this strategy.

Program B. An improvement on the random strategy is this actual Blocks World program written in Goal by an agent programmer:

    main module {
      program[order=random] {
        if bel(on(X,Y), clear(X), clear(Z)), a-goal(on(X,Z)) then move(X,Z).
        if bel(on(X,Y), not(clear(X))), a-goal(on(X,Z)) then adopt(clear(X)).
        if a-goal(on(X,Z)), bel(on(X,Y), not(clear(Z))) then adopt(clear(Z)).
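
For comparison with the programs above, the dismantle-then-rebuild baseline described earlier in this section can be sketched in a few lines of Python. This is an illustrative sketch only, not code used in the experiments: the representation of a configuration as a list of towers (each a list of block names, bottom first) and the function names baseline_moves and apply_moves are assumptions made for the example.

```python
def baseline_moves(start, goal):
    """Return the moves made by the dismantle-then-rebuild baseline.

    start, goal: lists of towers; each tower lists block names, bottom first.
    A move is a pair (block, destination), where destination is either
    the string "table" or the name of the block to stack onto.
    """
    moves = []
    # Phase 1: unstack every tower from the top down. The bottom block of
    # each tower already sits on the table, so a tower of height h costs h - 1.
    for tower in start:
        for block in reversed(tower[1:]):
            moves.append((block, "table"))
    # Phase 2: build each goal tower from the bottom up. All blocks are
    # clear at this point, so a goal tower of height h costs h - 1.
    for tower in goal:
        for below, block in zip(tower, tower[1:]):
            moves.append((block, below))
    return moves


def apply_moves(start, moves):
    """Replay moves on a copy of start and return the sorted resulting towers.

    Raises StopIteration if a move is illegal (block or destination not clear).
    """
    towers = [list(t) for t in start]
    for block, dest in moves:
        src = next(t for t in towers if t[-1] == block)  # block must be on top
        src.pop()
        if dest == "table":
            towers.append([block])
        else:
            dst = next(t for t in towers if t and t[-1] == dest)  # dest must be clear
            dst.append(block)
        towers = [t for t in towers if t]  # drop towers that became empty
    return sorted(towers)


if __name__ == "__main__":
    # Worst case from the text: one tower of n = 4 blocks rebuilt as a
    # different single tower, which takes 2(n - 1) moves.
    start = [["a", "b", "c", "d"]]
    goal = [["d", "c", "b", "a"]]
    plan = baseline_moves(start, goal)
    print(len(plan), plan)
```

Note that this sketch always dismantles everything, even blocks that are already in their goal position, which is what makes the 2(n − 1) bound easy to read off from the two loops.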
