Planning under Uncertainty in Dynamic Domains - Carnegie Mellon ...
3.2. A representation for planning under uncertainty

cannot both be true at the same time. However, since any state that is reachable by a sequence of actions in the domain from a valid initial state will also be valid, these invalid states are not an issue in practice. This state property of validity could be partially derived by considering the states reachable by applying actions to any physically possible initial state, or it could be enforced by adding domain axioms such as those used in [Knoblock 1991]. In what follows I will ignore this distinction.

In Weaver the planning domain is generalised as follows.

1. The operators in the domain include a duration, which is an integer-valued function of the bindings of the operator (and therefore an integer for an instantiated action or step). This integer may represent any time unit, for example seconds or hours, although the unit must be the same for different operators in the same domain.
2. Operators may specify a discrete, conditional probability distribution of possible outcomes rather than the single possible outcome used in Prodigy 4.0. An example of this will be described in more detail below.
3. A planning domain includes a set of exogenous events E as well as the set of operators O.
These are syntactically very similar to operators but are used to specify the way that the world can change independently of the actions taken in a plan, as I describe below. For example, they can be used to model the actions of other agents or natural processes.
4. A total precedence order < is given over the actions and events. This is used to resolve conflicts between their effects if more than one action or event produces changes to a state. An example is given below.

Weaver generalises Prodigy 4.0's definition of a planning problem by specifying a probability distribution of possible initial states rather than a single initial state. The problem also includes a threshold probability, a minimum probability of success that a plan must equal or exceed to be considered a solution. The objects and goal statement in the planning problem are unchanged.

In the rest of this section I make the semantics of planning domains and problems in Weaver precise in terms of an underlying Markov decision process M defined by a planning problem. While this definition is needed to prove that Weaver correctly computes probabilities for plans and to discuss its coverage, on a casual reading of the thesis it can be skipped and replaced with the following summary: at each time step, several events may take place simultaneously with one action as a plan is executed. When more than one event or action completes in one time step, their results are applied to the state in parallel.
If more than one possible value is specified for some ground literal in the state, the value nominated by the event or action that is highest in the pre-specified precedence order is used. Actions are usually higher than events in the precedence order.
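
The precedence rule in the summary above can be illustrated with a minimal Python sketch. This is not Weaver's implementation: the function `apply_parallel`, the representation of a state as a dictionary from ground literals to truth values, and the example action and event names are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

State = Dict[str, bool]    # ground literal -> truth value (assumed representation)
Effects = Dict[str, bool]  # values nominated by one completing action or event


def apply_parallel(state: State,
                   completed: List[Tuple[str, Effects]],
                   precedence: List[str]) -> State:
    """Apply the effects of everything that completes in one time step.

    `precedence` lists action and event names from highest to lowest.
    When two sources nominate a value for the same literal, the value
    from the source that is higher in the precedence order is kept.
    """
    rank = {name: i for i, name in enumerate(precedence)}
    new_state = dict(state)
    # Apply the lowest-precedence source first, so that any value set by
    # a higher-precedence source overwrites it afterwards.
    for _name, effects in sorted(completed,
                                 key=lambda c: rank[c[0]],
                                 reverse=True):
        new_state.update(effects)
    return new_state


# Hypothetical example: an action and an exogenous event complete in the
# same time step and disagree about the literal "door-open".
state = {"door-open": False, "papers-scattered": False}
precedence = ["open-door", "wind-gust"]  # the action outranks the event
completed = [
    ("wind-gust", {"door-open": False, "papers-scattered": True}),
    ("open-door", {"door-open": True}),
]
result = apply_parallel(state, completed, precedence)
print(result)
```

Here the action's value for `door-open` wins because the action is higher in the precedence order, while the event's effect on `papers-scattered` is still applied because nothing else nominates a value for that literal.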