Planning under Uncertainty in Dynamic Domains - Carnegie Mellon ...
2.2. Reactive planning and approaches that mix planning with execution

... environment and chooses an action based on this information, similar to a policy on a Markov decision process described in Section 2.4. These systems have no explicit representation of a global plan and some, such as Brooks's subsumption architecture [Brooks 1986] and Agre and Chapman's Pengi [Agre & Chapman 1987], have little or no internal state. The subsumption architecture arranges reactive subsystems into hierarchies, so that higher-level behaviour is achieved by one system over-riding, or subsuming, a more basic system. Neither of these systems, however, shows how a reactive plan could be built from a declarative description of the environment. Schoppers [Schoppers 1989a] develops a similar scheme in his "universal plans" and describes how they can be created automatically. Other reactive planning systems such as PRS [Georgeff & Lansky 1987], RAP [Firby 1987; 1989] and HAP [Loyall & Bates 1991] do include internal state (PRS represents the agent's beliefs, desires and intentions explicitly) and also control structures such as iteration.

These systems can produce appropriate behaviour under different execution conditions, but for each contingency that arises the appropriate response must be programmed by a human in advance. In many domains, however, it is not feasible to specify the correct action in each possible situation in advance. Instead we would like to combine the speed of a reactive planner in familiar situations with the flexibility of a classical planner in situations not previously anticipated.
Systems have been developed for this purpose that combine reactive execution with classical planning, using the latter when no pre-programmed response is available for some contingency. Both the Theo-agent [Blythe & Mitchell 1989] and the "anytime synthetic projection" technique of Drummond and Bresina [Drummond & Bresina 1990] follow reactive rules if they are present, and otherwise fall back on planning and then compile the resulting plan into new reactive rules to be used in future episodes. The two systems differ on the action and goal representation and on the planning technique. The Theo-agent uses a STRIPS-like action representation and backward chaining. In anytime synthetic projection, forward chaining is used with a richer action and goal language that can specify probabilistic outcomes, exogenous events and goals to maintain predicates over time intervals.

A mixed planning and execution strategy can provide further advantages if it is under explicit control. Some goals can be planned for in advance, while others can be deferred until part way through the execution, when there may be more information about the best course of action. By not always giving priority to reactive rules, the mixed-strategy system can avoid pitfalls in some cases.
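
The rule-then-plan loop attributed above to the Theo-agent and to anytime synthetic projection (follow a reactive rule if one exists, otherwise plan and compile the plan into new rules) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the one-dimensional corridor domain and the names `successors`, `plan`, `HybridAgent` and `run_episode` are hypothetical, and a plain breadth-first forward search stands in for the planner. It is not the Theo-agent's backward chaining, nor the procedure of either cited system.

```python
from collections import deque

# Hypothetical toy domain: states are cells 0..5 on a line.
ACTIONS = {"left": -1, "right": +1}


def successors(state, lo=0, hi=5):
    """Yield (action, next_state) pairs that stay inside [lo, hi]."""
    for name, delta in ACTIONS.items():
        nxt = state + delta
        if lo <= nxt <= hi:
            yield name, nxt


def plan(start, goal):
    """Breadth-first forward search (a stand-in planner).

    Returns a list of (state, action) steps leading from start to goal,
    an empty list if start is already the goal, or None if unreachable.
    """
    frontier = deque([start])
    parent = {start: None}  # state -> (previous_state, action)
    while frontier:
        state = frontier.popleft()
        if state == goal:
            steps = []
            while parent[state] is not None:
                prev, action = parent[state]
                steps.append((prev, action))
                state = prev
            return list(reversed(steps))
        for action, nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = (state, action)
                frontier.append(nxt)
    return None


class HybridAgent:
    """Follows compiled reactive rules; plans only when no rule applies."""

    def __init__(self, goal):
        self.goal = goal
        self.rules = {}         # compiled reactive rules: state -> action
        self.planner_calls = 0  # how often the agent had to fall back on planning

    def act(self, state):
        if state not in self.rules:
            # No pre-compiled response for this situation: fall back on planning.
            self.planner_calls += 1
            steps = plan(state, self.goal)
            if not steps:
                return None
            # Compile every step of the plan into a reactive rule for later episodes.
            for s, a in steps:
                self.rules.setdefault(s, a)
        return self.rules[state]


def run_episode(agent, start):
    """Execute the agent from start until it reaches its goal; return the actions taken."""
    state, trace = start, []
    while state != agent.goal:
        action = agent.act(state)
        if action is None:
            break
        trace.append(action)
        state += ACTIONS[action]
    return trace
```

With `agent = HybridAgent(goal=4)`, a first call to `run_episode(agent, 1)` invokes the planner once and stores a rule for each state on the path. A later episode starting at state 2 runs entirely from the stored rules, so `agent.planner_calls` does not increase. The planner is called again only for a start state, such as 5, that no earlier plan passed through.
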
On the other hand, delaying planning for these goals can drastically reduce the number of alternatives to consider. Gervasio shows how to build "completable plans" which are designed to be incomplete, and amenable to being further elaborated during execution [Gervasio & DeJong 1994; Gervasio 1996]. Goodwin [Goodwin 1994] considers the question of when to switch from planning to executing in a time-dependent planning problem. Onder and Pollack [Onder & Pollack 1997] describe a probabilistic planner that reasons about which contingencies to plan for before execution and which to defer. Washington