Planning under Uncertainty in Dynamic Domains - Carnegie Mellon ...

Chapter 2. Related work

In addition to these assumptions, classical planners return a plan in the form of a sequence of actions, either totally or partially ordered.

This is a much studied family in AI planning because the assumptions are powerful and seem reasonable. An interesting range of behaviours can be captured with STRIPS-style operators, and they allow a planner to perform backward chaining since, given a goal, one can subgoal on the preconditions required for an action or sequence of actions to achieve the goal. In addition, the logical representation allows a plan to be proved correct. Prodigy, the planning system extended in this thesis, is a classical planner according to this definition [Veloso et al. 1995]. A more detailed description of Prodigy's action representation, which is an example of a STRIPS-like representation, can be found in Section 3.1.

Work in classical planning has typically focussed on improving the efficiency with which a plan is created. For example, partial-order planners, introduced in the late eighties [Chapman 1987; McAllester & Rosenblitt 1991; Penberthy & Weld 1992], attempt to reduce the search space using a "least commitment" principle where steps in a plan are only ordered with respect to one another as required to prove that the plan is successful. Planners can be made faster by solving a series of abstractions of the planning problem, each more detailed until the problem is solved in full complexity, at each stage using the solution from the previous stage [Sacerdoti 1974; Yang & Tenenberg 1990; Knoblock 1991]. Work has also been done on controlling search with various heuristics [Blythe & Veloso 1992; Gerevini & Schubert 1996; Pollack, Joslin, & Paolucci 1997] and with explicit control rules [Minton 1988; Minton et al. 1989].

The STRIPS action representation does not support non-deterministic action outcomes, and classical planners do not have a means to represent sources of change in the domain other than the actions taken by the performance agent.
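To make the STRIPS-style representation and the backward-chaining step concrete, the following is a minimal sketch and not Prodigy's actual representation. The `Operator` class, the `apply_op` and `regress` helpers and the toy `load-package` operator are all hypothetical names introduced here for illustration. It assumes states and goals are plain sets of ground literals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A STRIPS-style operator: preconditions plus one fixed add/delete list."""
    name: str
    preconds: frozenset
    add: frozenset
    delete: frozenset


def apply_op(state, op):
    """Forward application: return the single successor state, or None
    if the operator's preconditions do not hold in `state`."""
    if not op.preconds <= state:
        return None
    return (state - op.delete) | op.add


def regress(goal, op):
    """One backward-chaining step: the subgoals that must hold before `op`
    so that `goal` holds afterwards, or None if `op` is of no use."""
    if not (op.add & goal) or (op.delete & goal):
        return None  # op achieves nothing in the goal, or destroys part of it
    return (goal - op.add) | op.preconds


# Hypothetical toy operator, not taken from the thesis.
load = Operator(
    name="load-package",
    preconds=frozenset({"truck-at-depot", "package-at-depot"}),
    add=frozenset({"package-in-truck"}),
    delete=frozenset({"package-at-depot"}),
)

state = frozenset({"truck-at-depot", "package-at-depot"})
goal = frozenset({"package-in-truck"})

subgoal = regress(goal, load)      # subgoal on the operator's preconditions
successor = apply_op(state, load)  # deterministic outcome
```

In this sketch `apply_op` returns exactly one successor state for an applicable operator, so the representation has no way to express alternative outcomes of the same action.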
These systems therefore cannot be used for planning problems involving uncertainty. However, the ideas and algorithms developed under the assumptions of classical planning form the basis of many approaches to planning under uncertainty, including the ones described in this thesis.

2.2 Reactive planning and approaches that mix planning with execution

When a plan is executed in a stochastic domain in which initial conditions and action outcomes are uncertain, many different scenarios can take place as a result of the execution. An open-loop plan, one that contains no sensing and always executes the same set of actions, will usually be too brittle to achieve high reliability in such a domain. Instead, planning systems must create branching plans that take into account the intermediate states reached while the plan is being executed and take different actions accordingly.

Some approaches to this problem, motivated in addition by real-time constraints, aim to create strategies for behaviour in which a system repeatedly senses its environment and chooses an action based on this information, similar to a policy on a Markov decision process described in Section 2.4. These systems have no explicit representation of a global plan and some, such as Brooks' subsumption architecture [Brooks 1986] and Agre and Chapman's Pengi [Agre & Chapman 1987], have little or no internal state. The subsumption architecture arranges reactive subsystems into hierarchies, so that higher-level behaviour is achieved by one system over-riding, or subsuming, a more basic system. Neither of these systems, however, shows how a reactive plan could be built from a declarative description of the environment. Schoppers [Schoppers 1989a] develops a similar scheme in his "universal plans" and describes how they can be created automatically. Other reactive planning systems such as PRS [Georgeff & Lansky 1987], RAP [Firby 1987; 1989] and HAP [Loyall & Bates 1991] do include internal state (PRS represents the agent's beliefs, desires and intentions explicitly) and also control structures such as iteration.

These systems can produce appropriate behaviour under different execution conditions, but for each contingency that arises the appropriate response must be programmed by a human in advance. In many domains, however, it is not feasible to specify the correct action in each possible situation in advance. Instead we would like to combine the speed of a reactive planner in familiar situations with the flexibility of a classical planner in situations not previously anticipated.
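The sense-then-act style of the stateless reactive systems described in this section can be illustrated with a small sketch. It is only loosely in the spirit of a layered, over-riding architecture and is not Brooks' or Agre and Chapman's implementation. The layer functions, percept keys and action names are hypothetical, and the percept is assumed to be a plain dictionary of boolean readings.

```python
# Each layer maps the current percept to an action, or to None when it has
# nothing to propose. No layer keeps any internal state between calls.

def wander(percept):
    """Most basic behaviour: always proposes something."""
    return "random-walk"


def seek_goal(percept):
    """Higher-level behaviour: head for the goal when it is visible."""
    return "move-forward" if percept.get("goal-visible") else None


def avoid_obstacle(percept):
    """Highest layer in this toy ordering: react to an obstacle ahead."""
    return "turn-left" if percept.get("obstacle-ahead") else None


# Hypothetical ordering, most basic first; later layers over-ride earlier ones.
LAYERS = [wander, seek_goal, avoid_obstacle]


def choose_action(percept):
    """Return the proposal of the highest layer that has one."""
    for layer in reversed(LAYERS):
        action = layer(percept)
        if action is not None:
            return action
    return None


# One sense-act cycle per percept; there is no global plan anywhere.
actions = [
    choose_action({"obstacle-ahead": True, "goal-visible": True}),
    choose_action({"goal-visible": True}),
    choose_action({}),
]
```

Every response in this sketch is written by hand in one of the layer functions, which is the limitation noted above: a contingency that no layer anticipates gets only the most basic behaviour.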
Systems have been developed for this purpose that combine reactive execution with classical planning, using the latter when no pre-programmed response is available for some contingency. Both the Theo-agent [Blythe & Mitchell 1989] and the "anytime synthetic projection" technique of Drummond and Bresina [Drummond & Bresina 1990] follow reactive rules if they are present, and otherwise fall back on planning and then compile the resulting plan into new reactive rules to be used in future episodes. The two systems differ on the action and goal representation and on the planning technique. The Theo-agent uses a STRIPS-like action representation and backward chaining. In anytime synthetic projection, forward chaining is used with a richer action and goal language that can specify probabilistic outcomes, exogenous events and goals to maintain predicates over time intervals.

A mixed planning and execution strategy can provide further advantages if it is under explicit control. Some goals can be planned for in advance, while others can be deferred until part way through the execution, when there may be more information about the best course of action. By not always giving priority to reactive rules, the mixed-strategy system can avoid pitfalls in some cases. On the other hand, delaying planning for these goals can drastically reduce the number of alternatives to consider. Gervasio shows how to build "completable plans" which are designed to be incomplete, and amenable to being further elaborated during execution [Gervasio & DeJong 1994; Gervasio 1996]. Goodwin [Goodwin 1994] considers the question of when to switch from planning to executing in a time-dependent planning problem. Onder and Pollack [Onder & Pollack 1997] describe a probabilistic planner that reasons about which contingencies to plan for before execution and which to defer. Washington
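The idea earlier in this section of following reactive rules when they exist, and otherwise planning and compiling the result into new rules, can be sketched as follows. This is an illustration under simplifying assumptions, not the Theo-agent or anytime synthetic projection: the domain is a hypothetical deterministic state graph, the planner is a plain breadth-first forward search, and `TRANSITIONS`, `plan` and `MixedAgent` are names invented for the example.

```python
from collections import deque

# Hypothetical deterministic toy domain: state -> {action: next_state}.
TRANSITIONS = {
    "at-home": {"drive": "at-airport"},
    "at-airport": {"fly": "at-destination", "drive": "at-home"},
    "at-destination": {},
}


def plan(start, goal):
    """Breadth-first forward search. Returns a list of (state, action)
    pairs leading from `start` to `goal`, or None if no plan exists."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, steps = frontier.popleft()
        if state == goal:
            return steps
        for action, nxt in TRANSITIONS[state].items():
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, steps + [(state, action)]))
    return None


class MixedAgent:
    """Follows a compiled rule if one exists, otherwise plans and caches."""

    def __init__(self, goal):
        self.goal = goal
        self.rules = {}         # compiled reactive rules: state -> action
        self.planner_calls = 0  # counts how often we fell back on planning

    def act(self, state):
        if state not in self.rules:
            self.planner_calls += 1
            steps = plan(state, self.goal)
            if not steps:       # already at the goal, or no plan found
                return None
            # Compile every step of the plan into a reactive rule.
            self.rules.update(dict(steps))
        return self.rules[state]


agent = MixedAgent(goal="at-destination")
first = agent.act("at-home")      # no rule yet: plans, then compiles rules
second = agent.act("at-airport")  # answered from the compiled rules
calls_after_two_steps = agent.planner_calls
```

After the first call the planner is not consulted again for any state on the compiled plan, which is the speed benefit in familiar situations that the mixed approach aims for.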