Planning under Uncertainty in Dynamic Domains - Carnegie Mellon ...

Chapter 3. Planning under Uncertainty

... be made to the state when the action or event will take place. There is no distinction in the history list between actions and events, since once they are added to the state there is no need for a distinction. These pairs are called pending effects.

Figure 3.8 shows the sequence of states in M that is traversed if a deterministic version of Move-Barge is executed in the beginning state. In each state, literals from the state space of the planning problem are shown above the dotted line, while below the line is shown a list of pending effects. The initial state and the two final states in the diagram have no pending effects, while each intermediate state has exactly one pending effect. When the action is performed in the left-most state, it gives rise to two possible successor states, with different pending effects corresponding to the different possible outcomes of the action. Since the length of the action is the same in each outcome, the count-down on the pending effects in each state is the same, 3. Whenever the pending effect has a count-down higher than 1, the successor state simply decrements the count-down. When the count-down reaches 1, the pending effect is not present in the successor state, but the effects themselves are applied to the portion of the MDP state corresponding to the literals in the planning problem. Thus in the final states, (at barge1 west-coast) is true and (at barge1 Richmond) is false.
In the lower of the two final states, (operational barge1) has become false, while it is still true in the upper final state.

[Figure 3.8 diagram. The initial state contains (at barge1 Richmond), (operational barge1), (speed barge1 1), (ready barge1) and (distance Richmond west-coast 4). The upper branch, labelled 2/3, carries the pending effect σ = (del (at barge1 Richmond)) (add (at barge1 west-coast)) through the count-downs 3, 2, 1. The lower branch, labelled 1/3, carries υ = (del (at barge1 Richmond)) (add (at barge1 west-coast)) (del (operational barge1)) through the same count-downs.]

Figure 3.8: The two possible sequences of states in the underlying Markov decision process arising from executing the Move-Barge operator in the state shown on the left in the absence of exogenous events.

The pending effects in Figure 3.8 were added by choosing the action Move-Barge in the initial state. Exogenous events are also modelled by adding pending effects to the state in the MDP. For example, Figure 3.9 shows the states arising from taking the same action in an initial state that also has no pending effects but includes the literal (poor-weather) and in a domain that also includes the exogenous event Weather-Brightens shown in Figure 3.6, but with a duration of 2 time units.
Since the presence of the exogenous event leads to considerably more states, I only show the domain-level state features that differ from those of the parent state. The probability that each state is reached is shown above the state. The probabilities of the state transitions are calculated from the event and action outcome probabilities, which are conditionally independent given the states in which the corresponding pending effects began.

[Figure 3.9 diagram. The initial state contains (at barge1 Richmond), (operational barge1), (poor-weather), (speed barge1 1), (ready barge1) and (distance Richmond west-coast 4). The pending effects are σ and υ as in Figure 3.8, together with ε = (del (poor-weather)) (add (fair-weather)). Each state is annotated with the probability of reaching it.]

Figure 3.9: The possible sequences of states in the underlying Markov decision process arising from executing the Move-Barge operator in the state shown on the left, when the exogenous event Weather-Brightens can also take place. Only literals that are changed from a parent state are shown.

Formal description of the underlying MDP

Each state s of the underlying MDP M consists of two parts:

1. a truth assignment to the ground literals L, corresponding to a state in the planning domain and referred to as the literal state.
2. a set of pending effects.

Each element of the pending effects is a triple consisting of t, an integer denoting a time interval; a, an operator or exogenous event; and a set of "effects", that is, a set of add and delete statements over ground literals in L. The pending effects represent the events and actions that are currently taking place in the given state of M.
Intuitively, t denotes the time left before the effect "completes" or is applied to change the state.
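
The two-part state and the count-down rule described for Figure 3.8 can be illustrated with a minimal Python sketch. It is not the thesis implementation. The names `Pending`, `MDPState` and `advance` are hypothetical, the truth assignment is assumed to be stored as the set of true literals, and applying all deletes before all adds when several pending effects complete in the same step is an assumption made only to keep the sketch deterministic.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Literal = str                  # a ground literal, e.g. "(at barge1 Richmond)"
Effect = Tuple[str, Literal]   # ("add", literal) or ("del", literal)


@dataclass(frozen=True)
class Pending:
    """One pending effect: count-down t, originating action or event, effects."""
    countdown: int
    source: str
    effects: FrozenSet[Effect]


@dataclass(frozen=True)
class MDPState:
    """Literal state (assumed: the set of true literals) plus pending effects."""
    literals: FrozenSet[Literal]
    pending: FrozenSet[Pending]


def advance(state: MDPState) -> MDPState:
    """Advance one time step without starting any new action or event."""
    remaining = set()
    adds, deletes = set(), set()
    for p in state.pending:
        if p.countdown > 1:
            # Count-down higher than 1: the successor only decrements it.
            remaining.add(Pending(p.countdown - 1, p.source, p.effects))
        else:
            # Count-down of 1: the pending effect disappears and its
            # add and delete statements are applied to the literal state.
            for kind, literal in p.effects:
                (adds if kind == "add" else deletes).add(literal)
    # Assumption: deletes are applied before adds.
    literals = (set(state.literals) - deletes) | adds
    return MDPState(frozenset(literals), frozenset(remaining))


# The upper branch of Figure 3.8: Move-Barge with pending effect sigma.
sigma = frozenset({("del", "(at barge1 Richmond)"),
                   ("add", "(at barge1 west-coast)")})
state = MDPState(
    literals=frozenset({"(at barge1 Richmond)", "(operational barge1)"}),
    pending=frozenset({Pending(3, "Move-Barge", sigma)}),
)
for _ in range(3):
    state = advance(state)
print(sorted(state.literals), len(state.pending))
```

After three steps the pending effect is gone, (at barge1 west-coast) holds and (at barge1 Richmond) does not, which matches the upper final state of Figure 3.8. The sketch leaves out the choice among action outcomes and the arrival of exogenous events, both of which would add new pending effects with their own probabilities.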