
Numerical analysis of time discretization of optimal control problems

J. Frédéric Bonnans

INRIA Saclay and CMAP, Ecole Polytechnique

Applied and Numerical Optimal Control

ITN-SADCO, 23-27 April 2012, Paris


I: ORIENTATION

We consider the problem of minimizing the cost function

  ∫_0^T ℓ(u_t, y_t) dt + φ(y_0, y_T)   subject to:  ẏ_t = f(u_t, y_t), t ∈ (0,T),

as well as

  Control constraints: c(u_t) ≤ 0, t ∈ (0,T)
  State constraints: g(y_t) ≤ 0, t ∈ (0,T)
  Mixed state and control constraints: c(u_t, y_t) ≤ 0, t ∈ (0,T)
  Initial-final equality and inequality constraints:
    Φ_i(y_0, y_T) = 0, i = 1,...,r_1,
    Φ_i(y_0, y_T) ≤ 0, i = r_1+1,...,r.


This class of problems is quite large:

• It includes the case of design parameters (= states with zero derivative).
  Special case: variable horizon, ẏ_t = T f(u_t, y_t), t ∈ (0,1)
• If the data depend smoothly on time, we may set time as a state with derivative equal to 1
• Multiphase systems (separation of stage rockets) enter this framework by setting the phases "in parallel"
• We may skip the integral cost by adding ẏ_{n+1} = ℓ(u_t, y_t); the new cost is y_{n+1,T} + φ(y_0, y_T)
• It includes some delay systems (H. Maurer)


Function spaces

• Control and state spaces:

  U := L^∞(0,T; R^m);  Y := W^{1,∞}(0,T; R^n).

• Their extension to Hilbert spaces:

  U_2 := L^2(0,T; R^m);  Y_2 := H^1(0,T; R^n).


The Euler discretization

• N: number of time steps; h_k > 0: duration of the kth time step
• Steps begin/end at times t_0 = 0 and, for k = 1 to N, t_k = Σ_{j=0}^{k−1} h_j
• State equation: y_{k+1} = y_k + h_k f(u_k, y_k), k = 0,...,N−1
• Cost function: φ(y_N)

Running constraints:

  c(u_k) ≤ 0;  g(y_k) ≤ 0;  c(u_k, y_k) ≤ 0,  k = 1,...,N−1

Final equality and inequality constraints:

  Φ_i(y_0, y_N) = 0, i = 1,...,r_1,
  Φ_i(y_0, y_N) ≤ 0, i = r_1+1,...,r.
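
To make the scheme concrete, here is a minimal direct-transcription sketch in Python. The data are illustrative assumptions, not from the lecture: dynamics f(u,y) = −y + u, cost φ(y_N) = (y_N − 1)^2, and the control constraint c(u) ≤ 0 taken as the box u ∈ [−1,1], handled through bounds.

```python
import numpy as np
from scipy.optimize import minimize

# Assumed toy data: f(u, y) = -y + u, phi(y_N) = (y_N - 1)^2, u in [-1, 1].
T, N = 2.0, 50
h = (T / N) * np.ones(N)            # step sizes h_k (uniform here)
y0 = 0.0

def rollout(u):
    """Euler scheme: y_{k+1} = y_k + h_k f(u_k, y_k)."""
    y = np.empty(N + 1)
    y[0] = y0
    for k in range(N):
        y[k + 1] = y[k] + h[k] * (-y[k] + u[k])
    return y

def cost(u):
    return (rollout(u)[-1] - 1.0) ** 2

res = minimize(cost, np.zeros(N), bounds=[(-1.0, 1.0)] * N)
print("phi(y_N) =", res.fun)
```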


Basic questions on the numerical analysis

Given a nominal local solution (ū, ȳ) of the original problem:

• Does the discretized problem have a solution (u^h, y^h) near (ū, ȳ)?
• Error order: ‖u^h − ū‖ + ‖y^h − ȳ‖ = O(h̄), where h̄ := max_k h_k?
• Design of higher-order schemes?
• The assumption of (piecewise) smooth solutions: is it true?
• How do we solve the discretized problem?


The simplest optimal time problem I

Reach the zero state: dynamics ẍ_t = u_t ∈ [−1,1].

[Figure 1: Control synthesis: state space]


The simplest optimal time problem II

• Solution: bang-bang optimal control, with at most one switching time
• The discretized solution is of the same nature (the costate is an affine function of time)
• Exact integration for a control that is constant over a time step: mid-point rule
• In that case, the error is only due to the time step containing the switching time
• Expected error: at most O(h̃), with h̃ = the time step at the switching time.

Ref. for LQ bang-bang problems: Alt, Baier, Gerdts, Lempio, Error bounds for Euler approximation of linear-quadratic control problems with bang-bang solutions. Preprint, 2010.


Fuller's problem I (work with J. Laurent-Varin)

Same dynamics: ẍ_t = u_t ∈ [−1,1]; integral cost ∫_0^T x_t^2 dt.

[Figure 2: Fuller problem: optimal control, logarithmic penalty]


Fuller's problem II

• Known true solution (Fuller 1963)
• Sequence of bang-bang arcs whose lengths converge geometrically to 0
• Followed by a singular arc: u = 0 and x = 0
• Again we can use "exact" integrators for a control constant over the time steps
• Averaging effect: the loss of optimality may be smaller than it seems.


Robbins state constrained problem I

• Dynamics: x_t^{(3)} = u_t ∈ [−1,1];
• Cost ∫_0^T (x_t + u_t^2) dt, constraint x ≥ 0.
• Optimal state: infinitely many isolated touch points converging to the entry point of a boundary arc x = 0.
• Optimal control: infinitely many damped oscillations followed by an arc with zero values of the control.


Robbins state constrained problem II

[Figure 3: Robbins problem: exact solution; plot from A. Hermant's PhD thesis]


Robbins state constrained problem III

[Figure 4: Robbins problem: Bocop output of the control]


Robbins state constrained problem IV

[Figure 5: Robbins problem: refinement of the discretization grid. Mesh refinement on a given interval: N = 200 vs. N = 200 plus 1000 extra points on [4,6].]


Beam problem: a second-order state constraint

[Figure 6: Beam problem]

Dynamics ẍ_t = u_t ∈ [−1,1]. Cost ∫_0^1 u_t^2 dt; constraint x ≤ x_MAX.
The drawing shows the optimal displacement as a function of x_MAX.


II: UNCONSTRAINED PROBLEMS: minimize the cost function

  ∫_0^T ℓ(u_t, y_t) dt + φ(y_0, y_T)

subject to

  ẏ_t = f(u_t, y_t), t ∈ (0,T);  y_0 = y^0.

Optimality conditions: costate equation along the trajectory (ū, ȳ):

  −ṗ̄_t = p̄_t f_y(ū_t, ȳ_t), t ∈ (0,T);  p̄_T = φ′(ȳ_T).

Pontryagin's principle (PMP): with H[p](u,y) := p f(u,y),

  H[p̄_t](ū_t, ȳ_t) ≤ H[p̄_t](u, ȳ_t), for all u ∈ R^m, for a.a. t,

and in particular

  H_u[p̄_t](ū_t, ȳ_t) = 0, t ∈ (0,T).


Discretization by Euler's method

  Min φ(y_N);
  y_{k+1} = y_k + h_k f(u_k, y_k), k = 0,...,N−1,        (1)
  y_0 = y^0.

h_k > 0: kth step size; the discretized times:

  t_0 = 0,  t_k := Σ_{i=0}^{k−1} h_i,  k = 1,...,N.      (2)

Lagrangian:

  φ(y_N) + p_0 (y^0 − y_0) + Σ_{k=0}^{N−1} p_{k+1} ( y_k + h_k f(u_k, y_k) − y_{k+1} ).


Optimality systems (original and discretized problem)

For the original problem:

  ẏ_t = f(u_t, y_t),
  −ṗ_t = p_t f_y(u_t, y_t), t ∈ (0,T);
  0 = p_t f_u(u_t, y_t), t ∈ (0,T);
  y_0 = y^0;  p_T = φ′(y_T),

and for the discretized problem:

  (y_{k+1} − y_k)/h_k = f(u_k, y_k),
  (p_k − p_{k+1})/h_k = p_{k+1} f_y(u_k, y_k), k = 0,...,N−1,
  0 = p_{k+1} f_u(u_k, y_k), k = 0,...,N−1,
  y_0 = y^0;  p_N = φ′(y_N).
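
The discrete costate recursion above is exactly the adjoint (reverse) sweep for the derivative of the discrete cost. A minimal sketch, with assumed toy data f(u,y) = u − sin(y) and φ(y) = y²/2, checks that the recursion reproduces the gradient of φ(y_N) with respect to each u_k, as computed by finite differences:

```python
import numpy as np

# Assumed toy data: f(u, y) = u - sin(y), phi(y) = y^2 / 2.
f   = lambda u, y: u - np.sin(y)
f_y = lambda u, y: -np.cos(y)
f_u = lambda u, y: 1.0
phi, dphi = lambda y: 0.5 * y**2, lambda y: y

N, h, y0 = 40, 1.0 / 40, 0.3
u = np.random.default_rng(0).normal(size=N)

y = np.empty(N + 1); y[0] = y0              # forward Euler sweep
for k in range(N):
    y[k + 1] = y[k] + h * f(u[k], y[k])

# Backward sweep: p_N = phi'(y_N); (p_k - p_{k+1})/h = p_{k+1} f_y(u_k, y_k)
p = np.empty(N + 1); p[N] = dphi(y[N])
for k in range(N - 1, -1, -1):
    p[k] = p[k + 1] + h * p[k + 1] * f_y(u[k], y[k])

grad = np.array([h * p[k + 1] * f_u(u[k], y[k]) for k in range(N)])

def J(u):                                   # discrete cost phi(y_N[u])
    z = y0
    for k in range(N):
        z = z + h * f(u[k], z)
    return phi(z)

eps = 1e-6                                  # central finite differences
for k in (0, N // 2, N - 1):
    du = np.zeros(N); du[k] = eps
    print(k, grad[k], (J(u + du) - J(u - du)) / (2 * eps))
```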


Reduction by elimination of the control: continuous problem

Reduction hypothesis: ū continuous, and

  H_uu[p̄_t](ū_t, ȳ_t) = p̄_t f_uu(ū_t, ȳ_t) uniformly invertible.

Then by the IFT (implicit function theorem), "locally in time":

  H_u[p](u, y) = 0  iff  u = Υ(y, p).

Reduced optimality system:

  ẏ_t = f(Υ(y_t, p_t), y_t),
  −ṗ_t = p_t f_y(Υ(y_t, p_t), y_t), t ∈ (0,T);
  y_0 = y^0;  p_T = φ′(y_T).


Shooting problem: continuous problem

• Let p[p_0], y[p_0] denote the solution of the previous ODE with initial condition (y^0, p_0).
• Shooting function: S(p_0) = p_T[p_0] − φ′(y_T[p_0]).
• Optimality system ⇔ shooting equation: S(p_0) = 0.
• Assume p̄_0 is a well-posed solution: S(p̄_0) = 0 and S′(p̄_0) invertible.
• Then, locally: p̄_0 is the unique solution, Newton's method converges, and sensitivity analysis follows from the IFT.


Shooting problem: discretized problem

• Reduced formulation by elimination of the control:

  (y_{k+1} − y_k)/h_k = f(Υ(y_k, p_k), y_k),
  (p_k − p_{k+1})/h_k = p_{k+1} f_y(Υ(y_k, p_k), y_k), k = 0,...,N−1,
  y_0 = y^0;  p_N = φ′(y_N).

• Let p^h[p_0], y^h[p_0] denote the solution of the previous scheme with initial condition (y^0, p_0) (well-defined if p_0 is close to p̄_0 and h̄ is small).
• Optimality system ⇔ S^h(p_0) := p^h_T[p_0] − φ′(y^h_T[p_0]) = 0.


Partitioned Euler integrators

• Consider the partitioned ODE

  ẏ = F(y, p);  ṗ = G(y, p).

• Associated partitioned Euler scheme:

  (y_{k+1} − y_k)/h_k = F(y_k, p_{k+1});  (p_{k+1} − p_k)/h_k = G(y_k, p_{k+1}).

• Here: F(y, p) = f(Υ(y, p), y);  G(y, p) = −p f_y(Υ(y, p), y).
• It can easily be checked that S^h → S locally uniformly, as well as its derivatives.
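
A minimal numerical sketch of the two previous slides, under assumed toy data: the problem Min ∫_0^1 ½u² dt + ½(y(1)−1)² with ẏ = −y + u is put in Mayer form by an extra state, so that Υ is explicit (u = −p_1/p_2). S^h is built from the partitioned Euler scheme and S^h(p_0) = 0 is solved by Newton with a finite-difference Jacobian; all tolerances and iteration counts are illustrative choices.

```python
import numpy as np

# Assumed toy data: ydot1 = -y1 + u, ydot2 = u^2/2 (absorbed integral cost),
# phi(y) = y2 + (y1 - 1)^2 / 2.  H = p1(-y1 + u) + p2 u^2/2, so H_u = 0
# gives Upsilon(y, p): u = -p1/p2.
T, N = 1.0, 200
h = T / N
y_init = np.array([0.0, 0.0])

def F(y, p):                          # state dynamics f(Upsilon(y,p), y)
    u = -p[0] / p[1]
    return np.array([-y[0] + u, 0.5 * u**2])

def G(y, p):                          # costate dynamics: here pdot1 = p1
    return np.array([p[0], 0.0])

def dphi(y):
    return np.array([y[0] - 1.0, 1.0])

def shoot(p0):
    """S^h(p0) = p_N - phi'(y_N) for the partitioned Euler scheme."""
    y, p = y_init.copy(), p0.copy()
    for _ in range(N):
        q = p.copy()
        for _ in range(20):           # fixed point for the implicit p-step
            q = p + h * G(y, q)
        y = y + h * F(y, q)
        p = q
    return p - dphi(y)

p0 = np.array([-0.5, 1.0])            # initial guess
for _ in range(10):                   # Newton with FD Jacobian
    S0 = shoot(p0)
    Jac = np.empty((2, 2))
    for j in range(2):
        dp = np.zeros(2); dp[j] = 1e-6
        Jac[:, j] = (shoot(p0 + dp) - S0) / 1e-6
    p0 = p0 - np.linalg.solve(Jac, S0)
print("p0 =", p0, " residual =", np.linalg.norm(shoot(p0)))
```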


Error analysis

• F: set of C^1 mappings R^n → R^n on B̄(p̄_0, 1).
• Define Ξ : R^n × F → R^n, (p_0, F) ↦ Ξ(p_0, F) := F(p_0).
• Clearly Ξ is C^1 and ∂Ξ(p_0, F)/∂p_0 = F′(p_0).
• If the latter is invertible, we can apply the IFT (Banach space setting) to Ξ at (p̄_0, S).
• Conclusion: there exists a locally unique solution p̄^h_0 of S^h(p̄^h_0) = 0.
• Error estimate: |p̄^h_0 − p̄_0| = O(‖S^h − S‖) = O(h̄).
  More precisely: |p̄^h_0 − p̄_0| = O(|S^h(p̄_0)|).


What did we get?

• We deduce a uniform error estimate:

  |ū_{t_k} − u^h_k| + |ȳ_{t_k} − y^h_k| + |p̄_{t_k} − p^h_k| = O(h̄).

• Thanks to the shooting approach, the analysis becomes trivial.
• Link to second-order conditions?
• Higher-order methods?


Second-order optimality conditions

• Cost as a function of the control: J(u) := φ(y_T[u]); of class C^∞ : U → R.
• Second-order necessary optimality condition: J″(ū) ⪰ 0.
• Continuous extension of the quadratic form J″(ū) to U_2.
• Second-order sufficient optimality condition (SOSC): for some α > 0,

  J″(ū)(v, v) ≥ α ‖v‖_2^2, for all v ∈ U_2.

• It characterizes quadratic growth: for some ε > 0 and any α′ < α,

  J(ū + v) ≥ J(ū) + ½ α′ ‖v‖_2^2, if ‖v‖_∞ ≤ ε.


Computation of J″(ū)

• Linearized state equation:

  ż_t = Df(ū_t, ȳ_t)(v_t, z_t), t ∈ (0,T), z_0 = 0.

• Hessian of the Lagrangian, a quadratic form over U, where z = z[v]:

  Ω(v) := ½ ∫_0^T H″[p̄_t](ū_t, ȳ_t)(v_t, z_t)^2 dt + ½ φ″(ȳ_T)(z_T, z_T).

• It coincides with the Hessian of the reduced cost: J″(ū)(v, v) = Ω(v).


Link with the shooting formulation

• Let ū be a weak minimum (local minimum in U).
• Then S′(p̄_0) is invertible iff the SOSC holds.
• Final result: in that case we have the estimate on |p̄^h_0 − p̄_0|, and the latter implies:

  max_k ( |u^h_k − ū_{t_k}| + |y^h_k − ȳ_{t_k}| + |p^h_k − p̄_{t_k}| ) = O(h̄).


Extension to general initial-final conditions

• Equality initial-final constraints Φ(y_0, y_T) = 0, cost φ(y_0, y_T).
• Reduction of inequalities to equalities in case of strict complementarity with a unique multiplier.
• Associated multiplier Ψ ∈ R^{n_Φ}.
• Costate equation:

  −ṗ̄_t = p̄_t f_y(ū_t, ȳ_t), t ∈ (0,T);
  (−p̄_0, p̄_T) = φ′(ȳ_0, ȳ_T) + Ψ Φ′(ȳ_0, ȳ_T).

• Shooting variables: (y_0, p_0, Ψ). Similar analysis:
  qualification + SOSC implies an O(h̄) error estimate.


A priori parameterized control

• We often have an a priori parameterized control (technological constraints).
  E.g., piecewise polynomial control, with given switching times τ_i, i ∈ I.
• We add "explicitly" a vector π of optimization parameters.
• We add time as an additional state.
• We map each interval (τ_i, τ_{i+1}) to (0,1).
• Junction condition: continuity of the state.
• By doing so we reduce to the standard framework:
  qualification + SOSC implies an O(h̄) error estimate.


High-order methods

• High-order methods for non-parameterized control: use of high-order one-step methods such as Runge-Kutta (RK) schemes.
• Inner states of RK schemes: what to do with the control?


Example: mid-point rule

• If u_k is constant over the time step, a second-order scheme is

  (y_{k+1} − y_k)/h_k = f(u_k, ½(y_k + y_{k+1}));       (MPR)

• Equivalent formulation in the Runge-Kutta style:

  y_{k1} = y_k + ½ h_k f(u_k, y_{k1});
  y_{k+1} = y_k + h_k f(u_k, y_{k1}).

• Same computational effort for the formulation (MPR) as for the Euler scheme.


General RK solvers for ẏ_t = f(y_t)

• The s inner states y_{ki}, i = 1 to s, satisfy

  y_{k+1} = y_k + h_k Σ_{i=1}^s b_i f(y_{ki}),
  y_{ki} = y_k + h_k Σ_{j=1}^s a_ij f(y_{kj}),

  with a an s×s matrix and b ∈ R^s. Set c_i := Σ_j a_ij.

• Butcher array (c is optional):

  c | a
  --+---
    | b

• Explicit Euler, implicit Euler and mid-point rule:

  0 | 0        1 | 1        1/2 | 1/2
  --+---       --+---       ----+----
    | 1          | 1            | 1
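
A small generic sketch: one RK step driven by an arbitrary Butcher array (a, b), with the (possibly implicit) stage equations solved by fixed-point iteration — fine for this nonstiff test, an assumption; a stiff problem would need a Newton solve. Running the three arrays of the slide on ẏ = −y exhibits the expected global orders (1, 1 and 2):

```python
import numpy as np

def rk_step(f, y, h, a, b, iters=50):
    """One RK step: K_i = f(y_ki) with y_ki = y + h * sum_j a_ij K_j."""
    s = len(b)
    K = np.zeros((s, np.size(y)))
    for _ in range(iters):                 # fixed point on the stage slopes
        Y = y + h * (a @ K)                # inner states y_ki
        K = np.array([f(Yi) for Yi in Y])
    return y + h * (b @ K)

schemes = {                                # Butcher arrays from the slide
    "explicit Euler": (np.array([[0.0]]), np.array([1.0])),
    "implicit Euler": (np.array([[1.0]]), np.array([1.0])),
    "mid-point":      (np.array([[0.5]]), np.array([1.0])),
}

f = lambda y: -y                           # test ODE, exact solution e^{-t}
for name, (a, b) in schemes.items():
    for N in (20, 40):                     # halving h should divide the
        y, h = np.array([1.0]), 1.0 / N    # error by 2^order
        for _ in range(N):
            y = rk_step(f, y, h, a, b)
        print(f"{name:15s} N={N:3d} error={abs(y[0] - np.exp(-1.0)):.2e}")
```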


Order computation "by hand"

Denote f′, f″, etc., the derivatives of f, and write e.g. f′f for f′(y_t) f(y_t):

  ẏ_t = f;  ÿ_t = f′f,
  y_t^{(3)} = f″(f, f) + f′f′f,                                        (3)
  y_t^{(4)} = f‴(f, f, f) + 3 f″(f′f, f) + f′f″(f, f) + f′f′f′f.

By induction: for any integer k, the expression of y_t^{(k)} is a linear combination with positive weights of elementary differentials of size k, which are compositions of the f^{(i)}, 0 ≤ i ≤ k. The symbol f appears k times, and f^{(i)} has i arguments.

Each elementary differential can be identified with a rooted tree with k nodes, and a general (inductive) expression of the coefficients is known (Butcher, Hairer, Wanner).


Order of a one-step method

• General one-step method: y_{k+1} = y_k + h_k Φ(y_k, h_k)
• Consistency condition: Φ(y, 0) = f(y)
• Taylor expansion: y_{k+1}(h) = y_k + h f(y_k) + ½ h^2 (···)
• Global error order: the maximal power up to which the expansion of the scheme coincides with that of the ODE
• Euler method: Φ(y, h) = f(y).
  Error order 1, principal error term ½ h^2 f′f.
• General case: analysis based on the theory of rooted trees.


Order of a RK scheme I

• We formally expand w.r.t. h = h_k the quantities y_{ki}(h) = Σ_{q≥0} (h^q/q!) y_{kiq}:

  y_{k+1}(h) = y_k + h Σ_{i=1}^s b_i f(y_{ki}(h)),
  y_{ki}(h) = y_k + h Σ_{j=1}^s a_ij f(y_{kj}(h)).

For q = 0: y_{ki0} = y_k = y_{k+1,0}, and for q = 1:

  y_k + h y_{k+1,1} = y_k + h Σ_{i=1}^s b_i f(y_k) + O(h^2),
  y_k + h y_{ki1} = y_k + h Σ_{j=1}^s a_ij f(y_k) + O(h^2).

After simplification:

  y_{k+1,1} = Σ_{i=1}^s b_i f(y_k) + O(h^2),
  y_{ki1} = Σ_{j=1}^s a_ij f(y_k) + O(h^2).


Order of a RK scheme II

• By induction: explicit expansion, using

  Σ_{ℓ=0}^{q} (h^ℓ/ℓ!) y_{kiℓ} = y_k + h Σ_{j=1}^s a_ij [ f(y_{kj}(h)) ]_{q−1} + O(h^{q+1}),

  where [·]_{q−1} denotes the Taylor expansion in h truncated at order q−1.

• Expansion: a linear combination of the elementary differentials (as for the solution of the ODE).
• Global error of order p iff, in the expansion of y_{k+1}, these coefficients coincide up to order p.


Exercise: order 2

  y_k + h y_{k+1,1} + ½ h^2 y_{k+1,2} = y_k + h Σ_{i=1}^s b_i f(y_k + h y_{ki1}) + O(h^3),
  y_k + h y_{ki1} + ½ h^2 y_{ki2} = y_k + h Σ_{j=1}^s a_ij f(y_k + h y_{kj1}) + O(h^3).

Using y_{ki1} = Σ_{j=1}^s a_ij f(y_k) = c_i f(y_k), we deduce that

  ½ y_{ki2} = Σ_{j=1}^s a_ij c_j f′(y_k) f(y_k),

and so

  ½ y_{k+1,2} = Σ_{i=1}^s b_i c_i f′(y_k) f(y_k).

Since y_t = y_0 + t f + ½ t^2 f′f + O(t^3), the scheme is (at least) of second order iff

  Σ_{i=1}^s b_i c_i = ½.


High-order RK schemes for optimal control: the Hager (2000) approach

An independent control is associated with each inner state:

  Min φ(y_N);
  y_{k+1} = y_k + h_k Σ_{i=1}^s b_i f(u_{ki}, y_{ki}),
  y_{ki} = y_k + h_k Σ_{j=1}^s a_ij f(u_{kj}, y_{kj}), k = 0,...,N−1,
  y_0 = y^0.

This will be justified by the analysis of the optimality system!

Equivalent form:

  Min φ(y_N);
  0 = h_k Σ_{i=1}^s b_i K_{ki} + y_k − y_{k+1},
  0 = f(u_{ki}, y_k + h_k Σ_{j=1}^s a_ij K_{kj}) − K_{ki},
  0 = y^0 − y_0.


Lagrangian (contracting y_k + h_k Σ_{j=1}^s a_ij K_{kj} into y_{ki}):

  φ(y_N) + p_0 (y^0 − y_0)
  + Σ_{k=0}^{N−1} [ p_{k+1} ( h_k Σ_{i=1}^s b_i K_{ki} + y_k − y_{k+1} ) + h_k Σ_{i=1}^s ξ_{ki} ( f(u_{ki}, y_{ki}) − K_{ki} ) ].

Assuming that b_i ≠ 0 for all i, set p_{ki} := ξ_{ki}/(h_k b_i), b̂_i := b_i, â_ij := b_j − (b_j/b_i) a_ji, i,j = 1,...,s. The optimality system is

  y_{k+1} = y_k + h_k Σ_{i=1}^s b_i f(u_{ki}, y_{ki}),
  y_{ki} = y_k + h_k Σ_{j=1}^s a_ij f(u_{kj}, y_{kj}),
  p_{k+1} = p_k − h_k Σ_{i=1}^s b̂_i H_y[p_{ki}](u_{ki}, y_{ki}),
  p_{ki} = p_k − h_k Σ_{j=1}^s â_ij H_y[p_{kj}](u_{kj}, y_{kj}),
  0 = H_u[p_{ki}](u_{ki}, y_{ki}),
  y_0 = y^0,  p_N = φ′(y_N).


Elimination of the control: (PRK) scheme

Eliminating u_{ki} = Υ(y_{ki}, p_{ki}), we get a discretization of the state-costate dynamics:

  y_{k+1} = y_k + h_k Σ_{i=1}^s b_i f(Υ(y_{ki}, p_{ki}), y_{ki}),
  y_{ki} = y_k + h_k Σ_{j=1}^s a_ij f(Υ(y_{kj}, p_{kj}), y_{kj}),
  p_{k+1} = p_k − h_k Σ_{i=1}^s b̂_i H_y[p_{ki}](Υ(y_{ki}, p_{ki}), y_{ki}),
  p_{ki} = p_k − h_k Σ_{j=1}^s â_ij H_y[p_{kj}](Υ(y_{kj}, p_{kj}), y_{kj}),
  y_0 = y^0,  p_N = φ′(y_N).


Partitioned Runge-Kutta (PRK) schemes

Given the partitioned Cauchy problem

  ẏ_t = g^♯(y_t, p_t),
  ṗ_t = g^♭(y_t, p_t),

define the partitioned Runge-Kutta (PRK) scheme with coefficients (a, b, â, b̂) as

  y_{k+1} = y_k + h_k Σ_{i=1}^s b_i g^♯(y_{ki}, p_{ki}),
  y_{ki} = y_k + h_k Σ_{j=1}^s a_ij g^♯(y_{kj}, p_{kj}),
  p_{k+1} = p_k + h_k Σ_{i=1}^s b̂_i g^♭(y_{ki}, p_{ki}),
  p_{ki} = p_k + h_k Σ_{j=1}^s â_ij g^♭(y_{kj}, p_{kj}),
  y_0 = y^0,  p_N = φ′(y_N).


Discretize + Optimize do commute!

              discretization
  (P)  ─────────────────────→  (DP)
   │                             │
   │ optimality                  │ optimality
   │ conditions                  │ conditions
   ↓                             ↓
  (OC) ─────────────────────→ (DOC)                         (D)
              discretization

Error orders: notation EO(a,b), EO(a,b,â,b̂). By Hager (2000):

  2 ≤ EO(a,b,â,b̂) ≤ EO(a,b)  iff  EO(a,b) ≥ 2.

If EO(a,b) ≥ 3, we may have EO(a,b,â,b̂) < EO(a,b).
Equality holds for the Gauss discretization, and for fourth-order explicit schemes...


Conditions for orders 1 to 3, with d_j := Σ_i b_i a_ij
(in the slides each condition is indexed by a rooted tree; the graphs are omitted here)

Table 1: Order 1
  Σ_i b_i = 1

Table 2: Order 2
  Σ_j d_j = 1/2

Table 3: Order 3
  Σ_j c_j d_j = 1/6
  Σ_i b_i c_i^2 = 1/3
  Σ_k d_k^2 / b_k = 1/3


Conditions for order 4

Table 4: Order 4
  Σ_{l,k} a_lk d_k d_l / b_k = 1/8
  Σ_{j,k} a_jk d_j c_k = 1/24
  Σ_{i,k} (b_i/b_k) a_ik c_i d_k = 5/24
  Σ_{i,j} b_i a_ij c_i c_j = 1/8
  Σ_j c_j^2 d_j = 1/12
  Σ_i b_i c_i^3 = 1/4
  Σ_k c_k d_k^2 / b_k = 1/12
  Σ_l d_l^3 / b_l^2 = 1/4

Orders 5 to 7: FB & J. Laurent-Varin (2006), making the link to the theory of partitioned RK schemes and bicolored rooted trees.
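
The conditions of Tables 1-4 are easy to check numerically for a given scheme. The sketch below does it for the classical RK4 array (an illustrative choice, consistent with the remark on fourth-order explicit schemes), with d_j := Σ_i b_i a_ij; all residuals should vanish to machine precision.

```python
import numpy as np

# Classical RK4 Butcher array (assumed test case).
a = np.array([[0, 0, 0, 0],
              [1/2, 0, 0, 0],
              [0, 1/2, 0, 0],
              [0, 0, 1, 0]], dtype=float)
b = np.array([1/6, 1/3, 1/3, 1/6])
c = a.sum(axis=1)
d = b @ a                                  # d_j = sum_i b_i a_ij

checks = {
    "sum b_i = 1":                  b.sum() - 1,
    "sum d_j = 1/2":                d.sum() - 1/2,
    "sum c_j d_j = 1/6":            (c * d).sum() - 1/6,
    "sum b_i c_i^2 = 1/3":          (b * c**2).sum() - 1/3,
    "sum d_k^2/b_k = 1/3":          (d**2 / b).sum() - 1/3,
    "sum b_i c_i^3 = 1/4":          (b * c**3).sum() - 1/4,
    "sum c_j^2 d_j = 1/12":         (c**2 * d).sum() - 1/12,
    "sum c_k d_k^2/b_k = 1/12":     (c * d**2 / b).sum() - 1/12,
    "sum d_l^3/b_l^2 = 1/4":        (d**3 / b**2).sum() - 1/4,
    "sum b_i a_ij c_i c_j = 1/8":   np.einsum('i,ij,i,j->', b, a, c, c) - 1/8,
    "sum a_jk d_j c_k = 1/24":      np.einsum('jk,j,k->', a, d, c) - 1/24,
    "sum (b_i/b_k) a_ik c_i d_k = 5/24":
        np.einsum('i,ik,k->', b * c, a, d / b) - 5/24,
    "sum a_lk d_k d_l/b_k = 1/8":   np.einsum('lk,k,l->', a, d / b, d) - 1/8,
}
for name, err in checks.items():
    print(f"{name:35s} residual = {err:+.1e}")
```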


Conditions for order 5

Table 5: Order 5 (first half)

[Each order-5 condition is paired with its bicolored rooted tree; the two-column layout of this table was lost in extraction. The conditions are sums of products of a_ij, b_i, c_i and d_j with rational right-hand sides (3/40, 1/120, 11/120, 2/15, 1/30, 1/15, 1/10, 1/20, 1/60, 3/20, ...); see FB & J. Laurent-Varin (2006) for the complete list.]


Table 5: Order 5 (continued)

[Second half of the order-5 conditions; the layout was likewise lost in extraction. Among the recoverable entries: Σ_i b_i c_i^4 = 1/5, Σ_j c_j^3 d_j = 1/20, Σ_l c_l d_l^3 / b_l^2 = 1/20, Σ_m d_m^4 / b_m^3 = 1/5.]


Number of conditions for each order

Table 6: Number of order conditions

  Order         1  2  3   4   5    6    7
  Simple        1  1  2   4   9    20   48
  "Symplectic"  1  1  3   8   27   91   350
  Partitioned   2  4  14  52  214  916  4116

Above, by symplectic schemes we mean those for which b̂ = b and â is obtained as when deriving optimality systems.

We can say more about that!


Symplectic schemes

Consider the 2n×2n matrix

  J := [ 0  I ; −I  0 ].

Given a smooth H : R^n × R^n → R, the associated Hamiltonian system

  ṗ = −H_q(p, q);  q̇ = H_p(p, q)                        (4)

can be written as (note that J^{−1} = −J)

  (d/dt) (p, q) = J^{−1} DH(p, q),                       (5)

and the variational equation (linearization) may be written as

  (d/dt) (Z_p, Z_q) = J^{−1} D^2 H(p, q) (Z_p, Z_q).     (6)


Definition 1. (i) A linear mapping A : R^{2n} → R^{2n} is called symplectic if it satisfies A^⊤ J A = J.
(ii) A differentiable function ϕ : R^{2n} → R^{2n} is called symplectic at (p,q) ∈ R^{2n} if its Jacobian matrix is symplectic, i.e., if

  ϕ′(p,q)^⊤ J ϕ′(p,q) = J.                               (7)

We say that ϕ is symplectic if it is symplectic at all points.


Theorem 1 (Poincaré). Let H : R^{2n} → R be of class C^2. Then the associated Hamiltonian flow is symplectic.

Proof. Denote the flow by ϕ_t. The quantity Ψ_t := ∂ϕ_t(y_0)/∂y_0 is a solution of the variational equation, and hence, skipping the arguments of H:

  (d/dt)(Ψ_t^⊤ J Ψ_t) = Ψ̇_t^⊤ J Ψ_t + Ψ_t^⊤ J Ψ̇_t = Ψ_t^⊤ D^2 H J^{−⊤} J Ψ_t + Ψ_t^⊤ D^2 H Ψ_t = 0,

since J^{−⊤} J = −I. Therefore Ψ_t^⊤ J Ψ_t is invariant along the trajectory, and hence equal to its initial value J, as was to be proved.  □

Theorem 2 (Bochev-Scovel 1994). The partitioned Runge-Kutta schemes derived from the optimality system are symplectic.
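
Theorem 1 can be illustrated numerically. The sketch below approximates the time-1 flow of the pendulum Hamiltonian H(p,q) = p²/2 − cos q (an assumed test problem) with the implicit mid-point rule, itself a symplectic scheme, computes the Jacobian Ψ of the flow map by central differences, and checks Ψ^⊤JΨ ≈ J:

```python
import numpy as np

J = np.array([[0.0, 1.0], [-1.0, 0.0]])       # n = 1, w = (p, q)

def f(w):
    p, q = w
    return np.array([-np.sin(q), p])          # pdot = -H_q, qdot = H_p

def flow(w, t=1.0, N=200):
    """Implicit mid-point approximation of the time-t Hamiltonian flow."""
    w = np.array(w, dtype=float)
    h = t / N
    for _ in range(N):
        wn = w.copy()
        for _ in range(50):                   # fixed point on the mid-point
            wn = w + h * f(0.5 * (w + wn))
        w = wn
    return w

w0, eps = np.array([0.4, 0.8]), 1e-6
Psi = np.empty((2, 2))
for j in range(2):                            # Jacobian by central FD
    e = np.zeros(2); e[j] = eps
    Psi[:, j] = (flow(w0 + e) - flow(w0 - e)) / (2 * eps)

print("Psi^T J Psi =\n", Psi.T @ J @ Psi)     # should reproduce J
```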


Back to the mid-point rule

• The method is, in short:

  (y_{k+1} − y_k)/h_k = f(u_k, ½(y_k + y_{k+1}));       (MPR)

We have seen the equivalent formulation in the Runge-Kutta manner:

  y_{k1} = y_k + ½ h_k f(u_k, y_{k1});
  y_{k+1} = y_k + h_k f(u_k, y_{k1}).

The Butcher array is

  1/2 | 1/2
  ----+----
      | 1

so that Σ_{i=1}^s b_i c_i = 1/2.

• The scheme and the associated symplectic scheme are of second order.


Orientation

• The shooting approach gives a simple route to the error analysis of unconstrained problems.
• It easily extends to the case of initial-final state constraints, assuming strict complementarity, and to the case of parameterized controls.
• Possible extension without strict complementarity using weak hypotheses (FB, Appl. Math. Opt. 94), or strong regularity (Robinson 80).
• With a discontinuous control: is the expected O(h̄) error still valid?
• Case of control constraints: we might extend the notion of shooting function, but the latter is nonsmooth. What can we do?


Synthesis of part II: unconstrained optimization

  Min φ(y_T);
  ẏ_t = f(u_t, y_t), t ∈ [0,T],
  y_0 = y^0.

Discretization by Euler's method:

  Min φ(y_N);
  y_{k+1} = y_k + h_k f(u_k, y_k), k = 0,...,N−1,
  y_0 = y^0.


Synthesis of part II: unconstrained optimization (continued)

• For a continuous control ū satisfying the standard second-order sufficient conditions: max_k |u_k − ū_{t_k}| = O(h̄).
• For a RK scheme with associated symplectic scheme of order q: max_k |u_k − ū_{t_k}| = O(h̄^q).
• Technique based on homotopy on the shooting formulation.
• Refs.: Hager (2000), FB & J. Laurent-Varin (2006).


III: CONTROL CONSTRAINTS (Hager, Dontchev, Veliov 2000)

  Min φ(y_T);
  ẏ_t = f(u_t, y_t), t ∈ [0,T],
  c(u_t) ≤ 0, t ∈ [0,T],
  y_0 = y^0.

First-order optimality conditions + PMP:

  p_T = φ′(y_T),
  −ṗ_t = H_y[p_t](u_t, y_t), t ∈ [0,T],
  0 = H_u[p_t](u_t, y_t) + ν_t c′(u_t), t ∈ [0,T],
  c(u_t) ≤ 0;  ν_t ≥ 0;  ν_t c(u_t) = 0, t ∈ [0,T],

  H[p̄_t](ū_t, ȳ_t) ≤ H[p̄_t](u, ȳ_t), if c(u) ≤ 0.

In the sequel: ū is a continuous solution of (P).


A trivial example

• Here time is identified with a state variable:

  Min_u ½ ∫_0^2 (u_t − (1−t))^2 dt;  u_t ≥ 0 for a.a. t.

• Solution: ū_t = (1−t)_+, p̄ = 0, H = ½ (u − (1−t))^2,

  0 = H_u + ν_t c′(ū_t) = ū_t − (1−t) − ν_t,  ν_t = (t−1)_+.

• All kinds of strong hypotheses are satisfied.
• The control is continuous, with a discontinuous time derivative.
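
The discretized counterpart of this example is a separable bound-constrained problem, so any NLP solver recovers the truncated-affine solution; a quick check (solver and grid size are illustrative choices):

```python
import numpy as np
from scipy.optimize import minimize

# Discretized trivial example: (1/2) * sum_k h (u_k - (1 - t_k))^2, u_k >= 0.
T, N = 2.0, 100
h = T / N
t = np.arange(N) * h

cost = lambda u: 0.5 * h * np.sum((u - (1 - t))**2)
res = minimize(cost, np.ones(N), bounds=[(0, None)] * N)

# Compare with the exact solution u_t = (1 - t)_+
print("max |u_k - (1-t_k)_+| =",
      np.max(np.abs(res.x - np.maximum(1 - t, 0.0))))
```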


Active constraints

Denote the set of active control constraints by

  I(t) := {1 ≤ i ≤ n_c;  c_i(ū_t) = 0}.

Assume in the sequel the following qualification condition of uniform linear independence of gradients of active constraints (ULIGA): for some α_c > 0,

  |ξ Dc_{I(t)}(ū_t)| ≥ α_c |ξ|, for all ξ and a.a. t.

Then there exists a unique multiplier ν̄, which is continuous in view of the condition

  H_u[p̄_t](ū_t, ȳ_t) + ν̄_t c′(ū_t) = 0.


Reduction to linear constraints

• Apply the IFT to c_i(u) = a_i, i ∈ I(t), |I(t)| = q.
• We obtain that u is a smooth function of, say, v := (a_1,...,a_q, u_{q+1},...,u_m).
• Locally we can take v as a new control.
• Adding time as a state variable, we can glue the reparametrizations together.


Minimization of the Hamiltonian

• For t ∈ [0,T], ū_t is a solution of the nonlinear programming problem with Lipschitz data:

  Min_u H[p̄_t](u, ȳ_t);  c(u) ≤ 0.

• The Lagrangian function is the augmented Hamiltonian

  H^c[p,ν](u,y) := p f(u,y) + ν c(u).

• Enlarged critical cone: for ε > 0,

  C^ε_t(ū_t) := {v ∈ R^m;  c′_i(ū_t) v = 0, if ν_it > ε}.

• "Strong" second-order optimality conditions:

  H^c_uu[p̄_t](ū_t, ȳ_t)(v, v) ≥ α |v|^2, for all v ∈ C^ε_t(ū_t).

• Then (ū_t, ν̄_t) is a Lipschitz and directionally differentiable function, say Υ, of (ȳ_t, p̄_t).
• Bootstrapping: ȳ and p̄ are in W^{2,∞}, and not more.


Localization

We distinguish weakly and strongly ε_c-active constraints: I_{ε_c} = I^W_{ε_c} ∪ I^S_{ε_c}, with

  I^S_{ε_c}(t) := {1 ≤ i ≤ n_c;  ν̄_it > ε_c},
  I^W_{ε_c}(t) := {1 ≤ i ≤ n_c;  c_i(ū_t) > −ε_c;  ν̄_it ≤ ε_c}.

We next consider "localized" constraints:

  c_i(u_t) = 0, i ∈ I^S_{ε_c}(t), t ∈ [0,T],
  c_i(u_t) ≤ 0, i ∈ I^W_{ε_c}(t), t ∈ [0,T].            (8)

The idea is to forget the non-ε_c-active inequalities, and to change strongly active inequalities into equalities. The localized problem is, where ε := (ε_u, ε_c):

  Min_{u∈U} φ(y_T[u]);  s.t.  ẏ = f(u,y);  (8) and ‖u − ū‖_∞ ≤ ε_u.    (P_ε)


Second-order optimality conditions I: Hessian of the Lagrangian

• We recall the linearized state equation:

  ż_t = Df(ū_t, ȳ_t)(v_t, z_t), t ∈ (0,T), z_0 = 0.

• Hessian of the "unconstrained" Lagrangian, where z = z[v]:

  Ω(v) := ½ ∫_0^T H″[p̄_t](ū_t, ȳ_t)(v_t, z_t)^2 dt + ½ φ″(ȳ_T)(z_T, z_T).

• Hessian of the "control constrained" Lagrangian:

  Ω_c(v) := Ω(v) + ½ ∫_0^T ν̄_t D^2 c(ū_t)(v_t, v_t) dt.


Second-order optimality conditions II: critical cone

• Critical cone in U_2:

  C_2(ū) := {v ∈ U_2;  DJ(ū)v = 0;  Dc_i(ū_t)v_t ≤ 0, i ∈ I(t), t ∈ (0,T)}.

• Alternative expression based on the Lagrange multiplier:

  C_2(ū) = {v ∈ U_2;  Dc_i(ū_t)v_t ≤ 0,  ν̄_it Dc_i(ū_t)v_t = 0, i ∈ I_0(t), t ∈ (0,T)}.

• Strict complementarity hypothesis:

  ν̄_it > 0 if c_i(ū_t) = 0, for a.a. t, 1 ≤ i ≤ n_c.

• Enlarged critical cone:

  C_2^{ε_c}(ū) := {v ∈ U_2;  Dc_i(ū_t)v_t = 0, i ∈ I^S_{ε_c}(t), t ∈ (0,T)}.   (9)


Second-order optimality conditions III: main result

Theorem 3. If ū is a weak solution with qualified constraints, then

  Ω_c(v) ≥ 0, for all v ∈ C_2(ū).

Consider the "sufficient condition": for some α_Ω > 0,

  Ω_c(v) ≥ α_Ω ‖v‖_2^2, for all v ∈ C_2^{ε_c}(ū).        (10)

Theorem 4. If ū is a weak solution that satisfies the previous hypotheses, then for ε_u > 0 small enough, there exists α > 0 such that

  φ(y_T[ū]) + α ‖u − ū‖_2^2 ≤ φ(y_T[u]), if ‖u − ū‖_∞ ≤ ε_u.   (11)


Second-order optimality conditions IV: reduction

Under the above hypotheses, locally, the minimization problem

  Min_u { H[p](u, y);  c(u) ≤ 0 }                        (12)

has, for (y,p) close enough (uniformly in t) to (ȳ_t, p̄_t), a unique solution denoted by Υ(y,p), with Υ uniformly Lipschitz, and a multiplier ν(y,p). Reduced system:

  ẏ_t = f(Υ(y_t, p_t), y_t),  −ṗ_t = p_t f_y(Υ(y_t, p_t), y_t), t ∈ [0,T],
  p_T = φ′(y_T),  y_0 = y^0.                             (13)

Again we have a shooting formulation, and the shooting function is locally Lipschitz, but no longer differentiable:

  S(p_0) := p_T[p_0] − φ′(y_T[p_0]).


Euler discretization

  Min φ(y_N);
  (y_{k+1} − y_k)/h_k = f(u_k, y_k), k = 0,...,N−1,
  c(u_k) ≤ 0, k = 0,...,N−1,
  y_0 = y^0.

First-order optimality conditions (unlocalized form):

  (p_k − p_{k+1})/h_k = H_y[p_{k+1}](u_k, y_k), k = 0,...,N−1,
  p_N = φ′(y_N),
  0 = H_u[p_{k+1}](u_k, y_k) + ν_k c′(u_k), k = 0,...,N−1,
  c(u_k) ≤ 0;  ν_k ≥ 0;  ν_k c(u_k) = 0, k = 0,...,N−1.


Discretized shooting function

• Defined as: S^h(p_0) := p^h_T[p_0] − φ′(y^h_T[p_0]).
• (p^h_T[p_0], y^h_T[p_0]) results from the (well-posed) integration of the discretized scheme where u_k := Υ(y_k, p_{k+1}).
• Initial guess: p_0 = p̄_0.
• The error is the sum of two terms: O(h̄) (away from the junctions) + the nonsmoothness of Υ at the junctions.
• Nonsmoothness of Υ: estimated under hypotheses on the junction points.


Geometrical hypotheses: junction points

• Hyp.: I_i(t) is a finite union of intervals of positive measure.
• Junction points: boundaries of these intervals; only one entering or exiting constraint at a time.
• Transversality condition at junction points:

  (i)  c′_i(ū_t) (d/dt) Υ(ȳ_t, p̄_t) |_{t=a^−} > 0, if a is an entry point of the ith constraint;
  (ii) c′_i(ū_t) (d/dt) Υ(ȳ_t, p̄_t) |_{t=b^+} < 0, if b is an exit point of the ith constraint.

• Claim: this implies ν̇_i > 0 (at a single entry point) or ν̇_i < 0 (at a single exit point).


Proof

• Differentiating H^c_u = p f_u(u,y) + ν c′(u) = 0, we get

  u̇^⊤ H^c_uu + ν̇ c′ + Ξ = 0, with Ξ continuous.

• Denoting by [·] the jump at a junction point, we get

  [u̇^⊤] H^c_uu + [ν̇] c′ = 0.

• Let I_−, I_+ be the active sets at times τ_±.
  Multiplying by [u̇] and using c′_i [u̇] = 0 when i ∈ I_− ∩ I_+, we get

  [u̇^⊤] H^c_uu [u̇] + Σ_{i∈I_Δ} [ν̇_i] c′_i [u̇] = 0, where I_Δ := I_− Δ I_+.

• Entry point, I_Δ = {i_0}: if [ν̇_{i_0}] = 0, we get [u̇] = 0 (by the SOSC for the problem of minimizing the Hamiltonian): contradiction.


Junction points for the discretized problems

Finite differences of the stationarity of the augmented Hamiltonian:

  Δ_1 = p_{k+1} f_u(u_k, y_k) − p_k f_u(u_{k−1}, y_{k−1})
      = p_{k+1} ( f_u(u_k, y_k) − f_u(u_{k−1}, y_{k−1}) ) + (p_{k+1} − p_k) f_u(u_k, y_k),
  Δ_2 = ν_k c′(u_k) − ν_{k−1} c′(u_{k−1})
      = (ν_k − ν_{k−1}) c′(u_k) + ν_{k−1} ( c′(u_k) − c′(u_{k−1}) ).

Since Δ_1 + Δ_2 = 0, we deduce with the costate equation that, for some H̃^c_uu close to H^c_uu:

  0 = H̃^c_uu (u_k − u_{k−1}) + (ν_k − ν_{k−1}) c′(u_k) + O(h̄).

The analysis is similar to the one for the continuous problem: well-posed junction points for the discretized problem, "discontinuity" of u_k − u_{k−1} and ν_k − ν_{k−1}.


A homotopy argument

• By the previous slide: nonsmoothness occurs at a number of time steps equal to the number of junction points.
• Local integration error at junctions: O(h_k^2), and so |S^h(p̄_0)| = O(h̄).
• Argument based on homotopy: set δ^h := S^h(p̄_0) and solve

  S^h(p^θ_0) = (1−θ) δ^h.

  For θ = 0, the solution is p̄_0.
• Key Result: for some ε > 0, if p^θ_0 ∈ B(p̄_0, ε), then θ ↦ p^θ_0 is locally well-defined with local Lipschitz constant O(h̄).
• Pasting compact neighborhoods, we deduce the existence of p^h_0 = p^1_0 such that S^h(p^h_0) = 0.
• The O(h̄) error estimate follows, provided that we prove the Key Result.


Connection with stability analysis in optimization I

• Consider the discrete control spaces

  U^h := {u = (u_0,...,u_{N−1}) ∈ (R^m)^N;  ‖u‖_h := max_k |u_k|}.

• Denote this space by U^h_2 when endowed with the Hilbert norm

  ‖u‖_{h,2} := ( Σ_{k=0}^{N−1} h_k |u_k|^2 )^{1/2}.

• Let y^h[u] denote the solution of the discretized state equation, and set J^h(u) := φ(y^h_N[u]). Then p^θ_0 is the initial costate associated with the optimization problem

  Min_{u∈U^h} J^h(u) − θ δ^h y_N[u];  c(u) ≤ 0.          (P^h_θ)


Connection with stability analysis in optimization II

• If we can prove that θ ↦ u^θ is locally Lipschitz with constant O(h̄), the same holds for p^θ.
• By the theory of perturbed optimization (nonlinear programming), thanks to SOSC + qualification, the solution is indeed locally Lipschitz.
• The Lipschitz constant is majorized by the supremum of the norms of the (right) directional derivatives.


Connection with stability analysis in optimization II (continued)

• If LIGA + strict complementarity + SOSC hold, the derivative of the solution of an NLP is obtained by applying the IFT to the optimality system.
• The latter may be interpreted as the optimality system of a convex QP (quadratic problem) of the type (with obvious notations):

  Min_{v∈U^h} Ξ(v) := Ω^h_c(v) − δ^h z_N[v];
  c′_i(u^θ_k) v_k ≤ 0, if i ∈ I^θ_k,
  c′_i(u^θ_k) v_k = 0, if ν^θ_ki > 0.

• Without strict complementarity: it is still true that the directional derivative of the control is a solution of the above problem.


Connection with stability analysis in optimization III

• SOSC: unique solution v, and Ξ(v) ≤ Ξ(0) = 0, which gives the L^2 estimate

  α ‖v‖_{h,2}^2 ≤ Ω^h_c(v) ≤ δ^h z_N[v] ≤ c h̄ ‖v‖_{h,2}.

Therefore, as was to be proved (here q is the linearized costate):

  ‖v‖_{h,2} ≤ c′ h̄ ⇒ ‖z‖_∞ ≤ c″ h̄ ⇒ ‖q‖_∞ ≤ c″ h̄ ⇒ ‖v‖_h ≤ c‴ h̄.

• Note that we obtain an L^2 estimate first, then an L^∞ one.
• Ref. for the computation of the directional derivative: Jittorntrum (1984). See also the book by Bonnans and Shapiro (2000).


High-order RK schemes and control constraints

• Essentially the same analysis holds.
• We obtain that the discrete solution u^h satisfies (with obvious notations)

  max_k |u^h_k − ū_{t_k}| = O(|S^h(p̄_0)|).

• If the RK scheme has an associated symplectic scheme of (global) order q, then, denoting by ĥ the biggest step size at the discrete junctions, and since the local error is always of order at least two, we have

  |S^h(p̄_0)| = O( h̄^q + ĥ^2 ).


High-order RK schemes and control constraints (continued)

• Error:

  |S^h(p̄_0)| = O( h̄^q + ĥ^2 ).

• Therefore, when q > 2, we may expect the error to be concentrated near the junction points.
• Open question: how to refine the discretization (with as few points as possible) in order to reduce the step size at the junction points.
• Possible switch to a shooting algorithm once the structure of the junction points is identified.


Abstract stability result (in view of state constrained problems)

Consider the optimization problems, for i = 1, 2:

  Min_{x∈X} F_i(x);  Ax + b_i ∈ K,                       (14)

where X is a Hilbert space, Y is a Banach space, A ∈ L(X,Y), b_i ∈ Y, K is a convex subset of Y, and the F_i are differentiable, with F_1 "strongly convex over its feasible set" with modulus α > 0.

Lemma 1. Let x_i, i = 1, 2, be solutions of the above problems. Let x̂ ∈ X be such that Ax̂ = b_2 − b_1. Then

  ‖x_2 − x_1‖ ≤ (1/α) ‖DF_1(x_2 − x̂) − DF_2(x_2)‖.      (15)


Proof. We have that x̃ := x_2 − x̂ satisfies

  F_2(x̃ + x̂) ≤ F_2(x + x̂), for all x ∈ X such that Ax + b_1 ∈ K.   (16)

By the first-order optimality condition,

  DF_2(x̃ + x̂)(x_1 − x̃) ≥ 0.                             (17)

Since x̃ is feasible for the first problem, we have that

  DF_1(x_1)(x̃ − x_1) ≥ 0.                                (18)

Summing these inequalities and using x_2 = x̃ + x̂, we get that

  (DF_1(x̃) − DF_1(x_1))(x̃ − x_1) ≤ (DF_1(x̃) − DF_2(x_2))(x̃ − x_1).   (19)


Since F_1 is strongly convex, it follows that

  α ‖x̃ − x_1‖^2 ≤ ‖DF_1(x̃) − DF_2(x_2)‖ ‖x̃ − x_1‖,     (20)

which after simplification gives (15).  □


STATE CONSTRAINED PROBLEMS (Dontchev, Hager 2001)

• General state-constrained optimal control problem:

  Min φ(y_T); s.t.
  ẏ_t = f(u_t, y_t), t ∈ [0,T];
  g(y_t) ≤ 0, t ∈ [0,T];  y_0 = y^0.                     (P)

• Total derivatives of the state constraint along a trajectory (ū, ȳ):

  g^{(1)}(ȳ_t) := (d/dt) g(ȳ_t) = g′(ȳ_t) f(ū_t, ȳ_t),
  g^{(j)}(ȳ_t) := (d/dt) g^{(j−1)}(ȳ_t) = (g^{(j−1)})′(ȳ_t) f(ū_t, ȳ_t),

  as long as g^{(j−1)} does not depend on ū_t.


Examples: high orders are natural

• Control of the velocity: first-order state constraint on the position

  ẋ_t = u_t;  x ≥ 0.

• Control of the acceleration: second-order state constraint on the position

  ẍ_t = u_t;  x ≥ 0.

• Acceleration provided by an electric device with second-order dynamics: fourth-order state constraint on the position

  x_t^{(4)} = u_t;  x ≥ 0.
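
The order of a state constraint can be computed mechanically by differentiating g along the dynamics until the control shows up. A small symbolic sketch (SymPy assumed available) for the second example, ẋ₁ = x₂, ẋ₂ = u, with g = −x₁:

```python
import sympy as sp

t = sp.symbols('t')
u = sp.Function('u')(t)
x1 = sp.Function('x1')(t)      # position
x2 = sp.Function('x2')(t)      # velocity

# Dynamics: control of the acceleration.
dyn = {sp.Derivative(x1, t): x2, sp.Derivative(x2, t): u}

g = -x1                        # state constraint g(y) <= 0, i.e. x1 >= 0
for j in range(1, 5):
    g = g.diff(t).subs(dyn)    # g^{(j)} along the dynamics
    print(f"g^({j}) =", sp.simplify(g))
    if g.has(u):
        print("control appears: the constraint has order", j)
        break
```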


Important features

• We will not discuss the case of a Hamiltonian linear w.r.t. the control, and we restrict the study to the case of a continuous control.
• Interior and boundary arcs; entry and exit points; isolated contact points.
• Scalar state constraint: the junction behavior strongly depends on the order of the state constraint.
• Orders 1 and 2: we expect a discontinuous derivative of the constraint at junction points.
• Order 3 and more: no natural example of a junction between an interior and a boundary arc (cf. the Robbins example).


Academic example: first-order state constraint

  Min ∫_0^1 ( ½ u(t)^2 + g(t) y(t) ) dt
  s.t. ẏ(t) = u(t),  y(0) = y(1) = 0,  y(t) ≥ h,

with

  g(t) := (c − sin(αt)) g_0,  c > 0, α > 0.

Time is viewed as a second state variable (τ̇ = 1).
μ = (h − h_0)/(h_1 − h_0): homotopy parameter;
h_0 = min_t ȳ(t), where ȳ is the solution of the unconstrained problem;
h_1 = h: target value. The numerical values are

  g_0 := 10,  α = 10π,  c = 0.1,  h_1 = −0.001.
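
Since the dynamics are linear, a direct transcription of this example reduces to a QP in the discrete control. A sketch on a uniform grid (the constraint level −0.02 is an assumed intermediate value on the homotopy path, not the slide's h₁), eliminating the state via y = L u:

```python
import numpy as np
from scipy.optimize import minimize, LinearConstraint

# Uniform Euler grid; y_k = dt * sum_{j<k} u_j, i.e. y = L u.
N, dt = 100, 1.0 / 100
t = np.arange(N) * dt
g0, alpha, c, level = 10.0, 10 * np.pi, 0.1, -0.02    # level: assumption
g = (c - np.sin(alpha * t)) * g0

L = dt * np.tril(np.ones((N, N)), -1)    # strictly lower triangular

cost = lambda u: dt * np.sum(0.5 * u**2 + g * (L @ u))
cons = [
    LinearConstraint(L, level, np.inf),             # y_k >= level
    LinearConstraint(dt * np.ones((1, N)), 0, 0),   # y(1) = 0
]
res = minimize(cost, np.zeros(N), method="trust-constr", constraints=cons)
y = L @ res.x
print("min y =", y.min(), " active points:", np.sum(y <= level + 1e-6))
```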


Academic example: unconstrained solution

[Figure: optimal state of the unconstrained problem (k = 0).]


Academic example: first boundary arc

[Figure: optimal state with one boundary arc (k = 1).]


Academic example: two boundary arcs

[Figure: optimal state with two boundary arcs (k = 2).]


Academic example: three boundary arcs

[Figure: optimal state with three boundary arcs (k = 3).]


Three basic examples:

• Min ∫_0^T u_t^2 dt;  x_t^{(i)} = u_t, i = 1 to 3.
• The next three drawings are taken from Audrey Hermant's thesis:
  http://tel.archives-ouvertes.fr/tel-00348227


First-order state constraint: Min ∫_0^T u_t^2 dt;  ẋ_t = u_t.

[Figure: optimal state, first-order constraint.]


Second-order state constraint: Min ∫_0^T u_t^2 dt;  ẍ_t = u_t.

[Figure: optimal state, second-order constraint.]


Third-order state constraint: Min ∫_0^T u_t^2 dt;  x_t^{(3)} = u_t.

[Figure: optimal state, third-order constraint.]


Optimality conditions

Format:

  Min φ(y_T); s.t.
  ẏ_t = f(u_t, y_t), t ∈ [0,T];
  g(y_t) ≤ 0, t ∈ [0,T];  y_0 = y^0.                     (P)

Costate equation along the trajectory (ū, ȳ):

  −dp̄_t = p̄_t f_y(ū_t, ȳ_t) dt + Σ_i g′_i(ȳ_t) dμ_it.

PMP: with H[p](u,y) = p f(u,y),

  H[p̄_t](ū_t, ȳ_t) ≤ H[p̄_t](u, ȳ_t), for all u ∈ R^m.

When the control is continuous, the jumps of p̄ and μ are linked by

  −[p̄_t] = Σ_i g′_i(ȳ_t) [μ_it].


In the sequel: we discuss only first-order state constraints

• We assume that H_uu[p̄_t](ū_t, ȳ_t) is uniformly positive definite.
• We assume the qualification condition, where I(t) is the set of active constraints:

  {g′_i(ȳ_t) f_u(ū_t, ȳ_t)}_{i∈I(t)} is linearly independent.

• Then μ and p̄ are Lipschitz (Hager 1979; extensions: M. de Pinho, Shvartsman, Vinter; second order: Hermant 2009; higher order: FB 2010).


Optimality system: we set ν = μ̇

Assuming the state constraint to be inactive at time T:

  ẏ̄_t = f(ū_t, ȳ_t), t ∈ [0,T];
  −ṗ̄_t = p̄_t f_y(ū_t, ȳ_t) + Σ_i g′_i(ȳ_t) ν_it;
  0 = p̄_t f_u(ū_t, ȳ_t);
  g(ȳ_t) ≤ 0,  ν_it ≥ 0,  ν_it g_i(ȳ_t) = 0 a.e.;
  ȳ_0 = y^0;  p̄_T = φ′(ȳ_T).


Condition at junction points

Case of a scalar state constraint:

  0 = −ṗ̄_t f_u + H_uu u̇ + continuous terms,

or equivalently

  0 = ν_t g′(ȳ_t) f_u + H_uu u̇ + continuous terms,

and so

  0 = [ν_t] g′(ȳ_t) f_u + H_uu [u̇].

We assume finitely many boundary arcs only, [ν_t] = 0 at each junction time, and ν_t positive on the boundary arcs.


Discretized problem

  Min φ(y_N) s.t.
  y_{k+1} = y_k + h_k f(u_k, y_k), k = 0,...,N−1,
  g(y_k) ≤ 0, k = 0,...,N−1,
  y_0 = y^0.

Optimality system:

  p_k = p_{k+1} + h_k p_{k+1} f_y(u_k, y_k) + ν_k g′(y_k),
  0 = H_u[p_{k+1}](u_k, y_k),
  g(y_k) ≤ 0,  ν_k ≥ 0,  ν_k g(y_k) = 0,
  y_0 = y^0;  p_N = φ′(y_N).


Homotopy path

For θ ∈ [0,1]: the above discretized optimality system with perturbed data, the state and costate equations receiving the additional terms θ δf_k and θ δH^k_y respectively (θ = 0: discretized problem; the perturbations δf_k, δH^k_y are quantified on the next slide). [Display lost in extraction.]


Structure of the perturbation

  max_k ( |δf_k| + |δH^k_y| ) = O(h̄).

We assume in the sequel that h̄/h_k is uniformly bounded, together with the SOSC on an extended cone (as in the case of control constraints).

Lemma 2. There exists M > 0 such that, for ε > 0 and h̄ small enough, if (u^θ, y^θ, p^θ, ν^θ) is a solution of the above optimality system in an L^∞ neighborhood of (û, ŷ, p̂) of size ε, then¹ Lip(u^θ) + ‖ν^θ‖_∞ ≤ M.

¹ The Lipschitz constant is, as expected, max{ |u^θ_k − u^θ_{k−1}|/h_{k−1};  1 ≤ k ≤ N }.


Proof. We have that

  0 = [ p^θ_{k+1} f_u(u^θ_k, y^θ_k) + θ δH^k_u ] − [ p^θ_k f_u(u^θ_{k−1}, y^θ_{k−1}) + θ δH^{k−1}_u ]
    = (p^θ_{k+1} − p^θ_k) f_u(u^θ_k, y^θ_k) + p^θ_k ( f_u(u^θ_k, y^θ_k) − f_u(u^θ_{k−1}, y^θ_{k−1}) ) + θ ( δH^k_u − δH^{k−1}_u ).

Dividing by h_k and using the costate equation, we deduce that

  ν^θ_k g′(y^θ_k) f_u(u^θ_k, y^θ_k) = p^θ_k ( f_u(u^θ_k, y^θ_k) − f_u(u^θ_{k−1}, y^θ_{k−1}) ) / h_k + O(1).   (21)-(22)

Using |y^θ_k − y^θ_{k−1}| = O(h_{k−1}) and the mean value theorem, we deduce that

  p^θ_k f_u(u^θ_{k−1}, y^θ_{k−1}) = p^θ_k f_u(u^θ_{k−1}, y^θ_k) + O(h_{k−1})
                                  = p^θ_k f_u(u^θ_k, y^θ_k) + (u^θ_{k−1} − u^θ_k)^⊤ F^θ_k + O(h_{k−1}),      (23)


where

  |F^θ_k − H_uu[p^θ_k](u^θ_k, y^θ_k)| = O(ε).            (24)

Combining with (22), we get

  ν^θ_k g′(y^θ_k) f_u(u^θ_k, y^θ_k) = ( (u^θ_k − u^θ_{k−1})^⊤ / h_k ) F^θ_k + O(1).   (25)

For small enough ε > 0, F^θ_k is uniformly invertible. We deduce that

  u^θ_k − u^θ_{k−1} = h_k (F^θ_k)^{−1} f_u(u^θ_k, y^θ_k)^⊤ g′(y^θ_k)^⊤ (ν^θ_k)^⊤ + O(h_k).   (26)

On the other hand, if ν^θ_ki ≠ 0, then g_i(y^θ_k) = 0 and so

  0 ≥ g_i(y^θ_{k+1}) − g_i(y^θ_k) = h_k g′_i(y^θ_k) f(u^θ_k, y^θ_k) + O(h_k^2),   (27)


and similarly

  0 ≥ g_i(y^θ_{k−1}) − g_i(y^θ_k) = −h_{k−1} g′_i(y^θ_k) f(u^θ_{k−1}, y^θ_{k−1}) + O(h_{k−1}^2).   (28)

Dividing these relations by h_k and h_{k−1} respectively, and adding them, we get

  0 ≥ g′_i(y^θ_k) ( f(u^θ_k, y^θ_k) − f(u^θ_{k−1}, y^θ_{k−1}) ) + O(h_k + h_{k−1}).   (29)

Since |y^θ_k − y^θ_{k−1}| = O(h_{k−1}), it follows that

  g′_i(y^θ_k) ( f(u^θ_k, y^θ_k) − f(u^θ_{k−1}, y^θ_k) ) ≤ O(h_k + h_{k−1}) = O(h_k).   (30)

By the mean value theorem, for some u^{θ,i}_k ∈ [u^θ_k, u^θ_{k−1}]:

  g′_i(y^θ_k) f_u(u^{θ,i}_k, y^θ_k) (u^θ_k − u^θ_{k−1}) ≤ O(h_k).   (31)


Using (26) again, we deduce that

  g′_i(y^θ_k) f_u(u^{θ,i}_k, y^θ_k) (F^θ_k)^{−1} f_u(u^θ_k, y^θ_k)^⊤ g′(y^θ_k)^⊤ (ν^θ_k)^⊤ ≤ O(1).   (32)

Let I_k = {i;  g_i(y_k) = 0} be the set of active constraints at step k, denote by ν̄^θ_k the restriction of ν^θ_k to I_k, and set

  M_k := g′_{I_k}(y^θ_k) f_u(u^θ_k, y^θ_k) ( H_uu[p^θ_k](u^θ_k, y^θ_k) )^{−1} f_u(u^θ_k, y^θ_k)^⊤ g′_{I_k}(y^θ_k)^⊤,
  M′_k := g′_{I_k}(y^θ_k) f_u(u^{θ,i}_k, y^θ_k) (F^θ_k)^{−1} f_u(u^θ_k, y^θ_k)^⊤ g′_{I_k}(y^θ_k)^⊤.   (33)

Since ν^θ_ki = 0 if i ∉ I_k, it follows from (32) that M′_k (ν̄^θ_k)^⊤ ≤ O(1) (componentwise), and hence ν̄^θ_k M′_k (ν̄^θ_k)^⊤ ≤ O(|ν̄^θ_k|). Since M_k − M′_k = O(ε) uniformly over k, and ν̄^θ_k M_k (ν̄^θ_k)^⊤ ≥ β |ν̄^θ_k|^2, we deduce that

  β |ν̄^θ_k|^2 ≤ ν̄^θ_k M′_k (ν̄^θ_k)^⊤ + O(ε |ν̄^θ_k|^2) ≤ O( |ν̄^θ_k| + ε |ν̄^θ_k|^2 ).   (34)

For ε > 0 small enough, we deduce that ν̄^θ_k is uniformly bounded, and we conclude with (26).  □


Sensitivity analysis along the path

Ω^h: Hessian of the discrete Lagrangian.

  Min_v Ω^h_δ(v) := Ω^h(v) − Σ_k h_k δH^k_y z_k,
  z_{k+1} = z_k + h_k f′_k (v_k, z_k) − h_k δf_k,
  g′_i(y_k) z_k ≤ 0, i ∈ I^θ_k,
  g′_i(y_k) z_k = 0, if ν^θ_ki > 0.

Aim: prove that (η being the derivative of ν)

  max_k ( |v_k| + |z_k| + |p_k| + |η_k| ) = O(h̄).

With geometrical hypotheses: feasibility of a v̂ such that ‖v̂‖_∞ = O(h̄).
Then Ω^h_δ(v) ≤ Ω^h_δ(v̂) gives ‖v‖_∞ = O(h̄), implying ‖z‖_∞ = O(h̄).
Qualification: Σ_k h_k η_k = O(h̄), and hence ‖p‖_∞ = O(h̄).


OPEN PROBLEMS

• First-order state constraints: higher-order schemes?
  Problem: order loss in differential-algebraic systems.
  Is it possible to design a second-order scheme?
• High-order state constraints.
  Typically discontinuous multiplier and costate.
  Sophisticated shooting approaches based on the "alternative optimality system": K. Malanowski, H. Maurer, A. Hermant, FB.
  Numerical analysis of the direct approach: totally open.
• Singular control: singular arcs, state constraints.
  For problems with bound constraints on the control, see the theory of second-order optimality conditions and the shooting approach in S. Aronna's PhD thesis.
  Numerical analysis of the direct approach: totally open, even in the "unconstrained" case.


• Problems with distributed delays:

  y_t = y_0 + ∫_{t−τ}^{t} f(t, s, y_s, u_s) ds.

  Ubiquitous in economics, biology, ...
  Numerical analysis of the direct approach: totally open, even in the "unconstrained" case.

• CONCLUSION
  There are plenty of open problems of great generality.
  The theory of numerical analysis of optimal control problems is still in its infancy.


Sources

• J.F. Bonnans, J. Laurent-Varin: Computation of order conditions for symplectic partitioned Runge-Kutta schemes with application to optimal control. Numer. Math. 103 (2006), 1-10.
• J.F. Bonnans, A. Shapiro: Perturbation analysis of optimization problems. Springer, New York, 2000.
• A.L. Dontchev, W.W. Hager: The Euler approximation in state constrained optimal control. Math. Comp. 70 (2001), 173-203.
• A.L. Dontchev, W.W. Hager, K. Malanowski: Error bounds for Euler approximation of a state and control constrained optimal control problem. Numer. Funct. Anal. Optim. 21 (2000), 653-682.
• A.L. Dontchev, W.W. Hager, V.M. Veliov: Second-order Runge-Kutta approximations in control constrained optimal control. SIAM J. Numer. Anal. 38 (2000), 202-226.
• W. Hager: Lipschitz continuity for constrained processes. SIAM J. Control Optim. 17 (1979), 321-338.
• W. Hager: Runge-Kutta methods in optimal control and the transformed adjoint system. Numer. Math. 87 (2000), 247-282.
• K. Jittorntrum: Solution point differentiability without strict complementarity in nonlinear programming. Math. Programming 21 (1984), 127-138.
• A.A. Milyutin, N.P. Osmolovskiĭ: Calculus of Variations and Optimal Control. A.M.S., 1998.
• V.M. Veliov: Error analysis of discrete approximations to bang-bang optimal control problems: the linear case. Control Cyb. 34 (2005), 967-982.


The End!
