
Numerical Methods for Optimal Control Problems.
Part I: Hamilton-Jacobi-Bellman Equations and Pontryagin Minimum Principle

Ph.D. course in OPTIMAL CONTROL

Emiliano Cristiani (IAC–CNR)
e.cristiani@iac.cnr.it

March 2013

E. Cristiani (IAC–CNR), Numerical Methods for Optimal Control Pbs, March 2013 (18 slides)


Outline

1 References
2 Optimal control problems
3 Finite horizon problem
    HJB approach
    Find optimal trajectories from HJB
    PMP approach
    HJB↔PMP connection
    Numerical approximation
        Direct methods
        HJB
        PMP
        Exploiting HJB↔PMP connection
4 Minimum time problem with target
    HJB approach
    PMP approach
    Numerics for HJB
    Numerics for PMP


References

Main references

1 M. Bardi, I. Capuzzo Dolcetta, Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations, Birkhäuser, Boston, 1997.
2 J. T. Betts, Practical methods for optimal control and estimation using nonlinear programming, SIAM, 2010.
3 E. Cristiani, P. Martinon, Initialization of the shooting method via the Hamilton-Jacobi-Bellman approach, J. Optim. Theory Appl., 146 (2010), 321–346.
4 L. C. Evans, An introduction to mathematical optimal control theory, http://math.berkeley.edu/~evans/
5 E. Trélat, Contrôle optimal: théorie et applications, http://www.ljll.math.upmc.fr/~trelat/


Optimal control problems
Introduction

Controlled nonlinear dynamical system

    ẏ(s) = f(y(s), α(s)),   s > t
    y(t) = x ∈ R^n

Solution: y_{x,α}(s)

Admissible controls: α ∈ A := {α : [t, +∞) → A},   A ⊂ R^m

Regularity assumptions: are they meaningful from the numerical point of view? Discussion.


Optimal control problems
Payoff

    max_{α∈A} J_{x,t}[α]

Infinite horizon problem

    J_{x,t}[α] = ∫_t^∞ r(y_{x,α}(s), α(s)) e^{−µs} ds,   µ > 0

Finite horizon problem

    J_{x,t}[α] = ∫_t^T r(y_{x,α}(s), α(s)) ds + g(y_{x,α}(T))

Target problem

    J_{x,t}[α] = ∫_t^τ r(y_{x,α}(s), α(s)) ds,   τ := min{s : y_{x,α}(s) ∈ T}
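For a fixed control, the finite horizon payoff above can be approximated by discretizing the dynamics and the integral. A minimal sketch (the helper and the toy problem data below are illustrative assumptions, not from the slides), using forward Euler for the ODE and a left rectangle rule for the integral:

```python
def finite_horizon_payoff(f, r, g, alpha, x, t, T, n_steps=1000):
    """Approximate J_{x,t}[alpha] = int_t^T r(y(s), alpha(s)) ds + g(y(T))."""
    ds = (T - t) / n_steps
    y, J = x, 0.0
    for n in range(n_steps):
        s = t + n * ds
        a = alpha(s)
        J += r(y, a) * ds        # accumulate the running reward
        y = y + ds * f(y, a)     # Euler step of ydot = f(y, alpha)
    return J + g(y)              # add the terminal reward

# Toy data: ydot = a, r(y,a) = -a^2, g(y) = y, constant control a = 1/2.
J = finite_horizon_payoff(f=lambda y, a: a,
                          r=lambda y, a: -a**2,
                          g=lambda y: y,
                          alpha=lambda s: 0.5,
                          x=0.0, t=0.0, T=1.0)
```

For this toy problem the payoff of the constant control a = 1/2 is 1/4, which the discretization reproduces up to rounding.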


Finite horizon problem
HJB approach

HJB equation

Value function

    v(x,t) := max_{α∈A} J_{x,t}[α],   x ∈ R^n, t ∈ [0,T]

Theorem (HJB equation)
Assume that v ∈ C^1. Then v solves

    v_t(x,t) + max_{a∈A} { f(x,a) · ∇_x v(x,t) + r(x,a) } = 0,   x ∈ R^n, t ∈ [0,T)

with the terminal condition

    v(x,T) = g(x),   x ∈ R^n

What about cost functionals to be minimized? Use sup{J} = −inf{−J}.


Finite horizon problem
Find optimal trajectories from HJB

Find α* by means of v

Given v(x,t) for any x ∈ R^n and t ∈ [0,T], we define

    α*_feedback(x,t) := arg max_{a∈A} { f(x,a) · ∇_x v(x,t) + r(x,a) }

or, coming back to the original variables,

    α*_feedback(y,s) := arg max_{a∈A} { f(y,a) · ∇_x v(y,s) + r(y,a) }.

Then the optimal control is

    α*(s) = α*_feedback(y*(s), s)

where y*(s) is the solution of

    ẏ*(s) = f(y*(s), α*_feedback(y*(s), s)),   s > t
    y*(t) = x
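This feedback synthesis can be sketched numerically. A toy example (all problem data here are illustrative assumptions, not from the slides): dynamics ẏ = a with a ∈ {−1, 0, 1}, running reward r = 0, terminal reward g(x) = −x², horizon T = 1, for which the value function v(x,t) = −max(|x| − (T − t), 0)² is known in closed form; the gradient is replaced by a central difference and the closed loop is integrated by forward Euler:

```python
T = 1.0

def v(x, t):
    # closed-form value function of the toy problem
    return -max(abs(x) - (T - t), 0.0) ** 2

def feedback(x, t, h=1e-4):
    """alpha*_feedback(x,t) = arg max_a { f(x,a)·v_x(x,t) + r(x,a) },
    with v_x replaced by a central difference; ties broken arbitrarily."""
    vx = (v(x + h, t) - v(x - h, t)) / (2.0 * h)
    return max((-1.0, 0.0, 1.0), key=lambda a: a * vx)  # f(x,a) = a, r = 0

def optimal_trajectory(x, t=0.0, n_steps=200):
    """Forward Euler for ydot = f(y, alpha*_feedback(y,s)), y(t) = x."""
    ds = (T - t) / n_steps
    y = x
    for n in range(n_steps):
        y += ds * feedback(y, t + n * ds)
    return y

y_T = optimal_trajectory(0.7)
```

For this toy problem the feedback drives the state close to 0 (the maximizer of g) by the final time; the chattering near the switching surface is typical of bang-bang feedback laws evaluated on a discrete time grid.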


Finite horizon problem
PMP approach

PMP

Theorem (Pontryagin Minimum Principle)
Assume α* is optimal and y* is the corresponding trajectory. Then there exists a function p* : [t,T] → R^n (costate) such that

    ẏ*(s) = f(y*(s), α*(s))
    ṗ*(s) = −∇_x f(y*(s), α*(s)) · p*(s) − ∇_x r(y*(s), α*(s))
    α*(s) = arg max_{a∈A} { f(y*(s), a) · p*(s) + r(y*(s), a) }

with initial condition y*(t) = x and terminal condition p*(T) = ∇g(y*(T)).

PMP can fail!
Along the optimal trajectory the Hamiltonian H(y*, p*, a*) = f* · p* + r* may not be an explicit function of the control inputs.


Finite horizon problem
HJB↔PMP connection

HJB↔PMP connection

Theorem
If v ∈ C^2, then

    p*(s) = ∇_x v(y*(s), s),   s ∈ [t,T].

The gradient of the value function gives the optimal value of the costate all along the optimal trajectory, in particular for s = t!


Finite horizon problem
Numerical approximation

Direct methods

The control problem is entirely discretized and written in the form

Discrete problem
Find ξ* such that J(ξ*) = max_{ξ∈R^n} J(ξ), with J : R^n → R.

Fix a grid (s_1, …, s_n, …, s_N) in [t,T] with s_n − s_{n−1} = ∆s. A discrete control function α is characterized by the vector (α_1, …, α_N) with α_n = α(s_n).

Given (α_1, …, α_N), the ODE is discretized, for example, by

    y_{n+1} = y_n + ∆s f(y_n, α_n),

and so is the payoff, for example

    J(α) = Σ_n r(y_n, α_n) ∆s + g(y_N) + penalization for control and state constraints.

Then a gradient method is used to maximize J.
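The direct method above can be sketched end to end on a toy problem (the problem data are illustrative assumptions, not from the slides): dynamics ẏ = a, running reward r = −a², terminal reward g(y) = y, horizon [0,1]. The unknowns are the N control values, and J is maximized by gradient ascent with a finite-difference gradient:

```python
import numpy as np

N = 50
ds = 1.0 / N

def J(alpha, x=0.0):
    """Discretized payoff: Euler for the ODE, rectangle rule for the integral."""
    y, total = x, 0.0
    for a in alpha:
        total += (-a**2) * ds    # running reward r(y,a) = -a^2
        y += ds * a              # Euler step of ydot = a
    return total + y             # terminal reward g(y) = y

def gradient_ascent(alpha, lr=0.5, iters=500, h=1e-6):
    basis = np.eye(N)
    for _ in range(iters):
        grad = np.array([(J(alpha + h * e) - J(alpha - h * e)) / (2 * h)
                         for e in basis])  # finite-difference gradient of J
        alpha = alpha + lr * grad
    return alpha

alpha = gradient_ascent(np.zeros(N))
# For this problem the continuous optimum is alpha*(s) = 1/2, payoff 1/4.
```

In practice one would use an off-the-shelf NLP solver with an analytic or adjoint gradient rather than O(N) finite differences per iteration; the sketch only illustrates the structure "discretize everything, then optimize".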


Finite horizon problem
Numerical approximation

Semi-Lagrangian discretization of HJB

    v_t(x,t) + max_{a∈A} { f(x,a) · ∇_x v(x,t) + r(x,a) } = 0,   x ∈ R^n, t ∈ [0,T)

Fix a grid in Ω × [0,T], with Ω ⊂ R^n bounded. Steps: ∆x, ∆t. Nodes: {x_1, …, x_M}, {t_1, …, t_N}. Discrete solution: w_i^n ≈ v(x_i, t_n).

    (w_i^n − w_i^{n−1})/∆t + max_{a∈A} { [w^n(x_i + ∆t f(x_i,a)) − w_i^n]/∆t + r(x_i,a) } = 0

    w_i^{n−1} = max_{a∈A} { w^n(x_i + ∆t f(x_i,a)) + ∆t r(x_i,a) }

where w^n(x_i + ∆t f(x_i,a)) is to be interpolated, since the foot point x_i + ∆t f(x_i,a) is in general not a grid node.

CFL condition (not needed but useful)

    ∆t max_{x,a} |f(x,a)| ≤ ∆x
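The scheme above can be sketched in 1D with linear interpolation at the foot points (the toy data are illustrative assumptions, not from the slides): f(x,a) = a with a ∈ {−1, 0, 1}, r = 0, terminal reward g(x) = −|x|, for which the exact value is v(x,t) = −max(|x| − (T − t), 0):

```python
import numpy as np

X, T = 2.0, 1.0
dx, dt = 0.02, 0.01                  # CFL: dt * max|f| <= dx
x = np.arange(-X, X + dx / 2, dx)
w = -np.abs(x)                       # w^N = g on the grid
controls = (-1.0, 0.0, 1.0)

for n in range(int(round(T / dt))):  # march backward in time
    # w^{n-1}_i = max_a { w^n(x_i + dt f(x_i,a)) + dt r(x_i,a) };
    # the foot point is off the grid, so w^n there is linearly interpolated.
    w = np.max([np.interp(x + dt * a, x, w) for a in controls], axis=0)

exact = -np.maximum(np.abs(x) - T, 0.0)   # v(x, 0)
err = np.max(np.abs(w - exact))
```

`np.interp` clamps feet that leave the domain to the boundary values, which is a crude but harmless treatment here since the dynamics point inward wherever it matters in this example.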


Finite horizon problem
Numerical approximation

Shooting method for PMP

Find the solution of S(p_0) = 0, where

    S(p_0) := p(T) − ∇g(y(T))

and p(T) and y(T) are computed solving the ODEs

    α_n = arg max_{a∈A} { f(y_n, a) · p_n + r(y_n, a) }
    y_{n+1} = y_n + ∆s f(y_n, α_n)
    p_{n+1} = p_n + ∆s (−∇_x f(y_n, α_n) · p_n − ∇_x r(y_n, α_n))

with only initial conditions

    y_1 = x and p_1 = p_0.

The solution of S(p_0) = 0 can be found by means of an iterative method like bisection, Newton, etc.
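The shooting loop above can be sketched on a toy problem (the problem data are illustrative assumptions, not from the slides): f(y,a) = a, r(y,a) = −a², g(y) = −(y − 1)², x = 0, horizon [0,1]. Here arg max_a {a·p − a²} = p/2 is available in closed form, and the scalar root of S is found by the secant method:

```python
N, ds, x = 100, 1.0 / 100, 0.0

def shoot(p0):
    """Integrate state and costate forward from (x, p0) and return
    S(p0) = p(T) - grad g(y(T))."""
    y, p = x, p0
    for _ in range(N):
        a = p / 2.0                    # alpha_n = arg max_a { f·p + r }
        y += ds * a                    # y_{n+1} = y_n + ds f(y_n, a_n)
        # p_{n+1} = p_n + ds(-grad_x f · p - grad_x r) = p_n for this problem
    return p - (-2.0 * (y - 1.0))      # terminal condition mismatch

# Solve S(p0) = 0 with the secant method (bisection or Newton also work).
p0, p1 = 0.0, 2.0
for _ in range(50):
    s0, s1 = shoot(p0), shoot(p1)
    if abs(s1 - s0) < 1e-14:
        break
    p0, p1 = p1, p1 - s1 * (p1 - p0) / (s1 - s0)
# p1 now holds the optimal initial costate; here p0* = 1, giving a = 1/2.
```

For this quadratic example S is linear in p_0, so the secant method lands on the root in one step; in general the conditioning of S is the main practical difficulty, which motivates the good initial guess of the next slide.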


Finite horizon problem
Numerical approximation

HJB↔PMP for numerics

Idea [CM10]
1 Solve HJB on a coarse grid.
2 Compute ∇_x v(x, 0) = p_0.
3 Use it as initial guess for the shooting method.

Advantages
1 Fast in dimension ≤ 4, feasible in dimension 5-6.
2 Highly accurate.
3 Reasonable guarantee to converge to the global maximum of J.


Minimum time problem with target
HJB approach

HJB equation

Value function

    v(x) := min_{α∈A} J_x[α],   x ∈ R^n   (t = 0)

with cost functional

    J_x[α] = ∫_0^τ r(y_{x,α}(s), α(s)) ds + g(y_{x,α}(τ)),   τ := min{s : y_{x,α}(s) ∈ T}

Theorem (HJB equation)
Assume that v ∈ C^1. Then v solves the stationary equation

    max_{a∈A} { −f(x,a) · ∇_x v(x) − r(x,a) } = 0,   x ∈ R^n \ T

with boundary conditions

    v(x) = g(x),   x ∈ ∂T


Minimum time problem with target
HJB approach

Eikonal equation

If f(x,a) = c(x)a, A = B(0,1), r ≡ 1, and g ≡ 0, we get

    max_{a∈B(0,1)} { c(x) a · ∇v(x) } = 1,   x ∈ R^n \ T

or, equivalently,

    c(x)|∇v(x)| = 1,   x ∈ R^n \ T

with boundary conditions

    v(x) = 0,   x ∈ ∂T

Optimal trajectories ≡ characteristic lines ≡ gradient lines


Minimum time problem with target
PMP approach

PMP

Theorem (Pontryagin Minimum Principle)
Assume α* is optimal and y* is the corresponding trajectory. Then there exists a function p* : [0,τ] → R^n (costate) such that

    ẏ*(s) = f(y*(s), α*(s))
    ṗ*(s) = −∇_x f(y*(s), α*(s)) · p*(s) − ∇_x r(y*(s), α*(s))
    α*(s) = arg max_{a∈A} { f(y*(s), a) · p*(s) + r(y*(s), a) }

and

    f(y*(s), α*(s)) · p*(s) + r(y*(s), α*(s)) = 0,   s ∈ [0,τ]   (HJB)

with initial condition y*(0) = x.


Minimum time problem with target
Numerics for HJB

Semi-Lagrangian discretization of HJB

    w_i = min_{a∈A} { w(x_i + ∆t f(x_i,a)) + ∆t r(x_i,a) }

Iterative solution
The fixed-point problem can be solved iterating the scheme until convergence, starting from any initial guess:

    w_i^{(k+1)} = min_{a∈A} { w^{(k)}(x_i + ∆t f(x_i,a)) + ∆t r(x_i,a) }

    w_i^{(0)} = +∞ if x_i ∈ R^n \ T,   w_i^{(0)} = g(x_i) if x_i ∈ ∂T

CFL condition (not needed but useful)

    ∆t max_{x,a} |f(x,a)| ≤ ∆x
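The value iteration above can be sketched in 1D (the toy data are illustrative assumptions, not from the slides): f(x,a) = a with a ∈ {−1, 1}, r = 1, g = 0, target T = {0}, for which the exact minimum time is v(x) = |x|. A large finite constant stands in for +∞ so that the interpolation stays well defined:

```python
import numpy as np

X, dx, dt = 1.0, 0.01, 0.01            # CFL: dt * max|f| <= dx
x = np.arange(-X, X + dx / 2, dx)
target = np.isclose(x, 0.0)            # grid nodes lying on the target

BIG = 1e6                              # finite stand-in for "+infinity"
w = np.where(target, 0.0, BIG)         # w^(0): g = 0 on T, +inf elsewhere

for k in range(1000):                  # iterate the scheme to a fixed point
    # w^(k+1)_i = min_a { w^(k)(x_i + dt f(x_i,a)) + dt r(x_i,a) }
    feet = [np.interp(x + dt * a, x, w) for a in (-1.0, 1.0)]
    w_new = np.minimum.reduce(feet) + dt
    w_new[target] = 0.0                # enforce the boundary condition on T
    if np.max(np.abs(w_new - w)) < 1e-12:
        break
    w = w_new

err = np.max(np.abs(w - np.abs(x)))    # compare with the exact value |x|
```

The iteration propagates finite values outward from the target, one layer of nodes per sweep, which is why fast marching and fast sweeping orderings accelerate this kind of scheme so effectively.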


Minimum time problem with target
Numerics for PMP

Shooting method for PMP

Find the solution of S(p_0, τ) = 0, where

    S(p_0, τ) := ( y(τ) − T ,  f(y(τ), α(τ)) · p(τ) + r(y(τ), α(τ)) )

and y(τ), p(τ) and α(τ) are computed solving the ODEs

    α_n = arg max_{a∈A} { f(y_n, a) · p_n + r(y_n, a) }
    y_{n+1} = y_n + ∆t f(y_n, α_n)
    p_{n+1} = p_n + ∆t (−∇_x f(y_n, α_n) · p_n − ∇_x r(y_n, α_n))

with only initial conditions

    y_1 = x and p_1 = p_0.
