Adaptive high-gain extended Kalman filter and applications
tel-00559107, version 1 - 24 Jan 2011
Ph.D.-FSTC-10-2010
Faculté des Sciences, de la Technologie et de la Communication
UFR 907 Sciences et Techniques
Le2i, Laboratoire Electronique, Informatique et Image, UMR CNRS 5158

THESIS
defended on 30/04/2010 in Luxembourg for the academic degrees of
DOCTEUR DE L'UNIVERSITÉ DU LUXEMBOURG EN INFORMATIQUE
and
DOCTEUR DE L'UNIVERSITÉ DE BOURGOGNE EN INFORMATIQUE
Speciality: Control (Automatique)
by
Nicolas BOIZOT
born on 14 July 1977 in Dijon, France

ADAPTIVE HIGH-GAIN EXTENDED KALMAN FILTER AND APPLICATIONS

Jury
Dr Jean-Paul Gauthier, President (Professor, Université de Toulon)
Dr Eric Busvelle, Thesis supervisor (Professor, Université de Bourgogne)
Dr Jürgen Sachau, Thesis supervisor (Professor, Université du Luxembourg)
Dr Jean-Claude Vivalda, Reviewer (Directeur de Recherches, INRIA Metz)
Dr Gildas Besançon, Reviewer (Maître de conférences HDR, INP Grenoble)
Dr Martin Schlichenmaier, Examiner (Professor, Université du Luxembourg)
What has come over us, to die like fools,
we who nevertheless knew how to live†?
to Richard Boizot

That day, I truly believed I had grasped something, and that my life
would be changed by it.
But nothing of that kind is ever definitively acquired.
Like water, the world passes through you and, for a time,
lends you its colours.
Then it withdraws, and sets you back before that void one carries within oneself,
before that kind of central insufficiency of the soul
which one must learn to live alongside, to fight,
and which, paradoxically, is perhaps
our surest engine‡.

† Frédéric Dard on Michel Audiard, Libération, 30 July 1985.
‡ Nicolas Bouvier, L'usage du monde.

to Joëlle Boizot
Acknowledgements

I first thank Mr Jean-Paul Gauthier for agreeing to chair this thesis jury, for his extremely pertinent intuitions, and for having passed on to me, a few years ago now, the taste for control theory.

Mr Gildas Besançon accepted the role of reviewer of this manuscript. Presenting this work to him mattered particularly to me. The exchanges we had, when I was only at the very beginning, encouraged me to believe in the adaptive approach and to explore the real-time aspect of the problem.

Mr Jean-Claude Vivalda did me the honour of reviewing my work. I greatly appreciated the sharpness of his reading and of his questions about that lemma I fought with during the writing, naively thinking it would be a trifle.

I'm thankful to Mr Martin Schlichenmaier for being part of the jury. I remember that it took us ages to organize a talk at the doctorands seminar of the Mathematics Research Unit: although I don't know if the audience did, I learned a lot that day.

I thank Mr Jürgen Sachau for co-supervising this work, for welcoming me into his team, for granting me a great deal of autonomy and, above all, for supporting my choices.

I am grateful to you, Eric, for... many things, in fact. First of all, for offering me this fine subject, for instilling in me a little of your scientific qualities, for your humanity, and for your way of motivating the troops. And for the Poisson-scorpion.

Thanks to Mr Ulrich Sorger for following the evolution of this work, and to Pascal Bouvry, Nicolas Guelfy and Thomas Engel for kindly welcoming me into their unit (CSC) and respective laboratories (LASSY and Com.Sys).

Mr Massimo Malvetti was exceptionally kind and contributed greatly to setting up the joint supervision under good conditions.

Thanks to Mr Jean-Régis Hadji-Minaglou for wondering what on earth I could be doing in the lab, and for listening to the (long) answer. Thanks also for our lunchtime discussions and for the Bonne Adresse.

Our secretaries show great patience in managing the artistic blur that so often characterizes a university, and I must admit I had my share of it. Thanks to Ragga, Fabienne, Denise, Danièle, Steffie, Virginie, and especially to Mireille Kies, witness of the period that saw me practise the violin after office hours.

Not everyone is lucky enough to converse with an incarnation of Nick Tosches. It never fails: I leave Alexandre Ewen with a furious urge to read again Héros oubliés du rock'n roll or Confessions d'un chasseur d'opium.

Friends and family also gave me a great deal of support in this story; let me thank them here.

Honour to whom honour is due.

Mulțumesc Laurici și Kenneth. You somehow set things in motion, travelled the road with me, and even helped handle the oars.

My trinity decomposes into M'sieur Alex, M'sieur Alain and M'sieur Davy. Thanks for your wise advice when my thoughts were pacing a Möbius strip.

I smile to my sisters Hélène and Marie-Pierre. I do not forget the lovely Emma, her jailers and her venerable ancestors Monique and Christian, as well as Serge, who traded the sun of the Drôme for the north wind of Luxembourg for a weekend.

Speaking of family, I immediately think of the Arpendians. Without being a devotee of the series, I see no choice but to follow a Friends-style format. Thanks to
– The one who, from perfidious Albion, smuggled in articles on chemical kinetics – The one who taught relaxation theory by means of cats – The one who could prove that life is always more chaotic than one thinks – The one who would talk until two or three in the morning even though the train to Auxerre leaves at eight – The one who spoke little, but hosted like a prince, under the gaze of a Lion – The one whose humour was devastating – The one basking in the sun –

Thank you, Céline, for talking to the guys who juggle at bus stops, and for your friendship.

Paul, Dylan and Prisque Busvelle form a welcoming committee of legend-forging quality.

Thanks to the dancer with aquamarine eyes, to the singers, to the happy living room band, to the violinistic mentors, home distillers of enthusiasm.

Only an authentic Spritz can erase all memory of a disastrous simulation campaign. Thank you, Maria.

Last but so far from least. Anne-Marie, lady of summer with jet-black glints, Amandine, lady of winter with golden glints, and Pierrick, the modéhen giant, form the gang of the rue de Neudorf. A team whose existence one may well doubt, knowing that it is asked to live with someone who spends part of his time among the stars, does not listen when spoken to, and usually scrapes away at the fiddle between ten in the evening and midnight.
Will you believe me?
They exist!
Abstract

Keywords: nonlinear observers, nonlinear systems, extended Kalman filter, adaptive high-gain observer, Riccati equation, continuous-discrete observer, DC motor, real-time implementation.

This work addresses the observability problem, i.e. the reconstruction of a dynamic process's full state from a partially measured state, for nonlinear dynamic systems. The extended Kalman filter (EKF) is a widely used observer for such nonlinear systems. However, it suffers from a lack of theoretical justification and displays poor performance when the estimated state is far from the real state, e.g. due to large perturbations or a poor initial state estimate.

We propose a solution to these problems: the adaptive high-gain EKF.

Observability theory reveals the existence of special representations characterizing nonlinear systems that have the observability property. Such representations are called observability normal forms. An EKF variant based on a single scalar parameter, combined with an observability normal form, leads to an observer, the high-gain EKF, with improved performance when the estimated state is far from the actual state. Its convergence for any initial estimated state is proven. Unfortunately, and contrary to the EKF, this latter observer is very sensitive to measurement noise.

Our observer combines the behaviors of the EKF and of the high-gain EKF. Our aim is to take advantage of both efficient noise smoothing and reactivity to large estimation errors. To achieve this, the parameter at the heart of the high-gain technique is made adaptive, which yields the adaptive high-gain EKF.

A measure of the quality of the estimation is needed in order to drive the adaptation. We propose such an index and prove the relevance of its usage. We provide a proof of convergence for the resulting observer, and the final algorithm is demonstrated via both simulations and a real-time implementation. Finally, extensions to multiple-output and to continuous-discrete systems are given.
Résumé

Keywords: nonlinear observers, nonlinear systems, extended Kalman filter, adaptive high-gain observer, Riccati equation, continuous-discrete observer, DC motor, real-time implementation.

This work addresses the problem of system observation: the reconstruction of the full state of a dynamic system from a partial measurement of that state. We specifically consider nonlinear systems. The extended Kalman filter (EKF) is one of the most widely used observers for this purpose. It suffers, however, from degraded performance when the estimated state is not in a neighborhood of the real state, and its convergence in that case is not proven.

We propose a solution to this problem: the adaptive high-gain EKF.

Observability theory reveals the existence of representations characterizing so-called observable systems: the observability normal form. The high-gain EKF is a variant of the EKF built upon a scalar parameter. Its convergence, for a system in observability normal form, is proven for any initial estimation error. However, contrary to the EKF, this algorithm is very sensitive to measurement noise.

Our objective is to combine the efficiency of the EKF in terms of noise smoothing with the reactivity of the high-gain EKF in the face of estimation errors. To obtain this result, we make the central parameter of the high-gain method adaptive, which constitutes the adaptive high-gain EKF.

The adaptation process must be driven by a measure of the quality of the estimation. We propose such an index and prove its relevance. We establish a proof of convergence for our observer, then illustrate it with a series of simulations as well as a hard real-time implementation. Finally, we propose extensions of the initial result to multiple-output systems and to the continuous-discrete case.
Extended Summary

In this work we address the problem of observer design for nonlinear systems.

A system of this type is defined by a set of equations of the form:

$$\begin{cases} \dfrac{dx(t)}{dt} = f(x(t), u(t), t) \\[4pt] y(t) = h(x(t), u(t), t) \end{cases}$$

where
− t is the time variable,
− x(t) is the state variable, of dimension n,
− u(t) is the system input, or control variable, of dimension n_u,
− y(t) is the output, or measurement, of dimension n_y,
− f and h are mappings, at least one of which is nonlinear.

An observer is an algorithm that reconstructs, or estimates, the variable x(t) from partial data: the measurement y(t) and the input u(t). This estimated state can then be used for control or supervision purposes, as illustrated in Figure 1, where z(t) denotes the estimated state.
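As a concrete illustration (not taken from the thesis), the generic model above can be simulated with a simple forward-Euler loop; the maps f and h and the dimensions below are arbitrary examples of a nonlinear system with a partial measurement:

```python
import numpy as np

def f(x, u, t):
    # example nonlinear drift: a controlled pendulum-like system (n = 2)
    return np.array([x[1], -np.sin(x[0]) + u[0]])

def h(x, u, t):
    # partial measurement: only the first state component is observed (n_y = 1)
    return np.array([x[0]])

def simulate(x0, u_of_t, T=1.0, dt=1e-3):
    """Forward-Euler integration of dx/dt = f(x,u,t), recording y = h(x,u,t)."""
    x, ts = np.array(x0, dtype=float), np.arange(0.0, T, dt)
    ys = []
    for t in ts:
        u = u_of_t(t)
        ys.append(h(x, u, t))
        x = x + dt * f(x, u, t)
    return x, np.array(ys)

x_final, ys = simulate([0.5, 0.0], lambda t: np.array([0.0]))
```

The observer's task is precisely the inverse of this loop: recover the full trajectory x(t) from the recorded ys and the input alone.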
Figure 1: Control loop (setpoint and controller feeding u(t) into the system with state x(t) and output y(t); the observer produces the estimated state z(t)).
The design and study of observers can be decomposed into three sub-problems:

− The observability problem examines the equations constituting the system: does this model allow the effective reconstruction of the state from its measurements?
− The convergence problem focuses on the observer itself: does the estimated state converge to the real state, and at what speed (e.g. exponential, polynomial)?
− The closed-loop problem addresses the stability of a control algorithm when it relies on the estimate provided by an observer.

These three problems, well understood for linear systems, are the subject of intense research activity for nonlinear systems. The text we present concerns the design of an observer and the resolution of the second of the problems stated above for this observer.
Motivations

Our prototype observer is the extended Kalman filter (EKF), for which [43, 58, 68] are good starting points. Working with the EKF is particularly interesting because
− it is widespread among engineers,
− its practical implementation is very natural,
− it is particularly well suited to real-time use,
− it has good measurement-noise smoothing performance [101].

A known problem of this filter is its lack of stability guarantees. Its convergence¹ is proven only locally, that is, when the estimated state lies in a neighborhood of the real state.

A global convergence guarantee can be obtained by resorting to the high-gain methodology proposed by Gauthier et al. [47, 54, 57]. This approach relies on two components:
1. the use of a representation of the nonlinear system under consideration that characterizes its observability²,
2. a modification of the EKF algorithm based on a single scalar parameter, usually denoted θ.

The resulting observer is called the high-gain extended Kalman filter (HGEKF). Provided the parameter θ is fixed at a sufficiently large value, this algorithm converges regardless of the initial estimation error. It is thereby robust to large perturbations the system might undergo (for instance a sudden change of the state variable due to an operating fault, as in [64]). This parameter moreover sets the convergence speed: the larger it is chosen, the faster the convergence. However, since θ multiplies the output variable y, it amplifies the measurement noise accordingly. For a strongly noisy output signal it may therefore be impossible to use the HGEKF effectively.

¹ By convergence of the observer we mean that the estimated state tends asymptotically to the real state.
² Observability is an (intrinsic) property of a system indicating that two distinct states can be distinguished from the corresponding outputs.
Our objective is to propose a Kalman-filter-type observer combining the advantages of the EKF and of the HGEKF, calling on the appropriate structure according to our needs. To do so, we must:
1. propose a way of estimating when the observer configuration needs to change,
2. propose an adaptation mechanism,
3. prove the global convergence of the observer, and show that it can be achieved in an arbitrarily short time.
Multiple-input, single-output observability normal form

We place our approach within the framework of the deterministic observability theory of Gauthier et al., presented in [57] and [40]. A quick review of the main definitions and results of this theory is given in Chapter 2 of this thesis. Here we content ourselves with defining the observability normal form for multiple-input, single-output (MISO) systems; this choice is made so that the exposition keeps its clarity.

The observer, as well as the notions we present, remain valid for multiple-output (MIMO) systems provided the algorithm is adapted accordingly. Indeed, contrary to MISO systems, for which the normal form is essentially unique, there exist several MIMO observability forms. The description of the observer for a multiple-output representation is given in Chapter 5 of this text.
We consider systems of the form:

$$\begin{cases} \dfrac{dx}{dt} = A(u)\,x + b(x, u) \\[4pt] y = C(u)\,x \end{cases} \quad (1)$$

where x(t) ∈ X ⊂ ℝⁿ with X compact, y(t) ∈ ℝ, and u(t) ∈ U_adm ⊂ ℝ^{n_u} is bounded for all t ≥ 0.
The matrices A(u) and C(u) are defined by:

$$A(u) = \begin{pmatrix} 0 & a_2(u) & 0 & \cdots & 0 \\ \vdots & 0 & a_3(u) & \ddots & \vdots \\ & & \ddots & \ddots & 0 \\ & & & 0 & a_n(u) \\ 0 & \cdots & & & 0 \end{pmatrix}, \qquad C(u) = \begin{pmatrix} a_1(u) & 0 & \cdots & 0 \end{pmatrix},$$

where the functions a_i(u) are bounded and bounded away from zero (0 < a_m ≤ a_i(u) ≤ a_M).
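As an illustrative sketch (not code from the thesis), the pair A(u), C(u) of the MISO normal form can be assembled from the values a_1(u), ..., a_n(u) evaluated at the current input:

```python
import numpy as np

def normal_form_matrices(a):
    """Build A(u) and C(u) of the MISO observability normal form.

    `a` is the vector [a_1(u), ..., a_n(u)] at the current input u:
    A carries a_2(u)..a_n(u) on its superdiagonal and zeros elsewhere,
    C is the row vector (a_1(u), 0, ..., 0).
    """
    n = len(a)
    A = np.zeros((n, n))
    for i in range(n - 1):
        A[i, i + 1] = a[i + 1]   # superdiagonal entries a_2(u)..a_n(u)
    C = np.zeros((1, n))
    C[0, 0] = a[0]               # only the first coordinate is measured
    return A, C

# example with n = 3 and a_i(u) bounded away from zero
A, C = normal_form_matrices([1.0, 2.0, 3.0])
```

This chain structure (each state drives the next, only the first is measured) is exactly what makes the high-gain rescaling by powers of θ possible.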
We define the matrices

$$\Delta = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & \frac{1}{\theta} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \frac{1}{\theta^{n-1}} \end{pmatrix},$$

and

$$Q_\theta = \theta\,\Delta^{-1} Q\,\Delta^{-1}, \qquad R_\theta = \theta^{-1} R.$$
The adaptive high-gain extended Kalman filter is the system:

$$\begin{cases} \dfrac{dz}{dt} = A(u)z + b(z, u) - S^{-1} C' R_\theta^{-1} \left( Cz - y(t) \right) \\[4pt] \dfrac{dS}{dt} = -\left(A(u) + b^*(z, u)\right)' S - S \left(A(u) + b^*(z, u)\right) + C' R_\theta^{-1} C - S Q_\theta S \\[4pt] \dfrac{d\theta}{dt} = \mathcal{F}(\theta, \mathcal{I}_d(t)) \end{cases} \quad (2)$$

where b*(z, u) denotes the Jacobian of b with respect to z, with initial conditions z(0) ∈ χ, S(0) symmetric positive definite, and θ(0) = 1. The function F constitutes the adaptation mechanism of the observer. We define it through its properties, summarized in Lemma V.
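The three coupled equations of (2) can be sketched as a single vector field, suitable for any ODE integrator. This is only an illustrative sketch: the maps A(u), b, b* (the Jacobian of b), C(u) and the adaptation function F are placeholders to be supplied for a concrete system:

```python
import numpy as np

def ahgekf_rhs(z, S, theta, u, y, A_of_u, b, b_star, C_of_u, Q, R, F, innov):
    """Right-hand side of the adaptive high-gain EKF (2).

    z: state estimate, S: Riccati matrix, theta: high-gain parameter.
    Q and R are rescaled by powers of theta through Delta, as defined above.
    """
    n = len(z)
    A, C = A_of_u(u), C_of_u(u)
    Delta_inv = np.diag(theta ** np.arange(n))   # Delta^{-1} = diag(1, theta, ..., theta^{n-1})
    Q_theta = theta * Delta_inv @ Q @ Delta_inv
    R_theta_inv = theta * np.linalg.inv(R)       # (theta^{-1} R)^{-1}
    Ab = A + b_star(z, u)
    dz = A @ z + b(z, u) - np.linalg.solve(S, C.T @ R_theta_inv @ (C @ z - y))
    dS = -Ab.T @ S - S @ Ab + C.T @ R_theta_inv @ C - S @ Q_theta @ S
    dtheta = F(theta, innov)
    return dz, dS, dtheta
```

With F ≡ 0 and θ = 1 the equations reduce to a continuous-time EKF on the normal form; with θ fixed large they reduce to the HGEKF.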
I_d serves to estimate the quality of the estimation delivered by the observer. This quantity is computed as follows: for a "horizon length" d > 0, the innovation is

$$\mathcal{I}_d(t) = \int_{t-d}^{t} \left\| y\left(t-d, x(t-d), \tau\right) - y\left(t-d, z(t-d), \tau\right) \right\|^2 d\tau \quad (3)$$

where y(t_0, x_0, τ) is the output of system (1) at time τ with initial condition x(t_0) = x_0.

Consequently, y(t−d, x(t−d), τ) is none other than y(τ), the output of system (1). We stress that y(t−d, z(t−d), τ) is not the output of the observer.

The importance of this quantity is made explicit by the following lemma. It also appears in the convergence proof of Theorem IV (see the proof of Theorem 36 in Chapter 3) that this result is the cornerstone of the adaptation mechanism.
Lemme II
Let x_1^0, x_2^0 ∈ ℝⁿ and u ∈ U_adm, and let y(0, x_1^0, ·) and y(0, x_2^0, ·) be the output trajectories of system (1) with initial conditions x_1^0 and x_2^0, respectively. Then the following property (called "persistent observability") holds: for all d > 0, there exists λ_d^0 > 0 such that for all u ∈ L_b^1(U_adm),
$$\left\| x_1^0 - x_2^0 \right\|^2 \le \frac{1}{\lambda_d^0} \int_0^d \left\| y\left(0, x_1^0, \tau\right) - y\left(0, x_2^0, \tau\right) \right\|^2 d\tau. \quad (4)$$

Remarque III
1. If we take x_1^0 = z(t−d) and x_2^0 = x(t−d), then Lemma II tells us that

$$\left\| z(t-d) - x(t-d) \right\|^2 \le \frac{1}{\lambda_d^0} \int_{t-d}^{t} \left\| y(\tau) - y\left(t-d, z(t-d), \tau\right) \right\|^2 d\tau,$$

or, equivalently,

$$\left\| z(t-d) - x(t-d) \right\|^2 \le \frac{1}{\lambda_d^0}\, \mathcal{I}_d(t),$$

that is, up to multiplication by a constant parameter, the innovation at time t is an upper bound on the estimation error at time t−d.

2. The innovation is defined in more detail in Section 3.3 of Chapter 3. Its implementation is explained in Chapter 4.
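Definition (3) can be approximated numerically: re-simulate system (1) from the past estimate z(t−d) over the window [t−d, t] and integrate the squared output mismatch. The Euler quadrature below is an illustrative sketch, not the implementation of Chapter 4:

```python
import numpy as np

def innovation(z_past, y_measured, u_window, f, h, dt):
    """Approximate I_d(t) of equation (3) on a window of length d = len(y_measured)*dt.

    z_past: the estimate z(t-d); y_measured: the measured outputs y(tau) over the
    window; u_window: the inputs over the window; f, h: the dynamics and output
    maps of system (1). Note: we re-simulate the SYSTEM from z(t-d); this is not
    the observer's own output trajectory.
    """
    x = np.array(z_past, dtype=float)
    total = 0.0
    for y_tau, u_tau in zip(y_measured, u_window):
        y_pred = h(x, u_tau)                 # y(t-d, z(t-d), tau)
        total += np.sum((y_tau - y_pred) ** 2) * dt
        x = x + dt * f(x, u_tau)             # propagate the re-simulation
    return total
```

By Lemma II, the returned value (divided by λ_d^0) bounds the squared estimation error at the start of the window, which is what justifies using it to drive θ.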
The following theorem is the central result of this thesis: the previous lemma serves it, and the various applications and extensions follow from it.

Théorème IV
For any arbitrary time T* > 0 and any ε* > 0, there exist 0 < d < T* and an adaptation function F (as in Lemma V below) such that the estimation error of observer (2) satisfies ‖z(t) − x(t)‖² ≤ ε* for all t ≥ T*.

Lemme V
For any ΔT > 0, there exists a constant M(ΔT) such that:
− for any θ1 > 1, and
− for any pair γ1 ≥ γ0 > 0,
there exists a function F(θ, I) such that the equation

$$\dot\theta = \mathcal{F}\left(\theta, \mathcal{I}(t)\right), \quad (5)$$

where 1 ≤ θ(0) < 2θ1 and I(t) is a positive, measurable function, has the following properties:
1. there exists a unique solution θ(t), satisfying 1 ≤ θ(t) < 2θ1, for all t ≥ 0,
2. |F(θ, I)/θ²| ≤ M,
3. if I(t) ≥ γ1 for t ∈ [τ, τ + ΔT], then θ(τ + ΔT) ≥ θ1,
4. as long as I(t) ≤ γ0, θ(t) decreases towards 1.
Simulation results and real-time implementation

The practical implementation of the observer for a series-connected DC motor is described in detail in Chapter 4. The model is obtained by writing, on the one hand, the electrical balance (i.e. from the equivalent-circuit representation, see Figure 2) and, on the other hand, the mechanical balance (i.e. Newton's law). The resulting system has a state of dimension 2, and input and output of dimension 1:

$$\begin{cases} \begin{pmatrix} L\dot I \\ J\dot\omega_r \end{pmatrix} = \begin{pmatrix} u - RI - L_{af}\,\omega_r I \\ L_{af} I^2 - B\omega_r - T_l \end{pmatrix} \\ y = I \end{cases} \quad (6)$$

where
− I and ω_r are the state variables: the current intensity and the rotation speed, respectively,
− u(t) is the input voltage,
− R is the sum of the circuit resistances,
− L is the sum of the circuit inductances,
− L_{af} is the mutual inductance,
− J is the inertia of the system,
− B is the friction coefficient of the motor shaft, and
− T_l is the load torque.
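Model (6) can be checked in simulation. The numerical parameter values below are illustrative placeholders, not the values identified in Chapter 4:

```python
import numpy as np

# illustrative parameters (not the identified values from the thesis)
R, L, Laf, J, B, Tl = 1.0, 0.1, 0.05, 0.02, 0.01, 0.1

def motor_rhs(state, u):
    """Series DC motor (6): state = (I, omega_r), input voltage u, output y = I."""
    I, w = state
    dI = (u - R * I - Laf * w * I) / L        # electrical balance
    dw = (Laf * I**2 - B * w - Tl) / J        # mechanical balance (Newton's law)
    return np.array([dI, dw])

def simulate(u=10.0, T=2.0, dt=1e-4):
    """Forward-Euler simulation from standstill under a constant input voltage."""
    state = np.array([0.0, 0.0])
    for _ in range(int(T / dt)):
        state = state + dt * motor_rhs(state, u)
    return state

I_ss, w_ss = simulate()
```

Only I is measured; ω_r (and, after the extension below, T_l) is what the observer must reconstruct.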
Figure 2: DC motor: equivalent circuit.
Implementing the observer requires three steps:
− the observability analysis and the derivation of the canonical form,
− the definition of the adaptation function,
− the tuning of the various parameters.
Observability

The observability analysis via the differentiation method (cf. Section 4.1.2) shows that if I does not vanish, then this system is observable³. It is moreover possible, while preserving observability, to estimate T_l, which is an unmeasured perturbation. To do so, the model is extended to three equations using the trivial model: Ṫ_l = 0.

³ Note that if I is zero, there is no current and the motor is either stopped or stopping.

Figure 3: Influence of β and m on the shape of the sigmoid.
The change of coordinates

$$\begin{array}{ccc} \mathbb{R}^{*+} \times \mathbb{R} \times \mathbb{R} & \to & \mathbb{R}^{*+} \times \mathbb{R} \times \mathbb{R} \\ (I, \omega_r, T_l) & \mapsto & (x_1, x_2, x_3) = (I,\; I\omega_r,\; IT_l) \end{array}$$

transforms the system into the observability normal form

$$\begin{pmatrix} \dot x_1 \\ \dot x_2 \\ \dot x_3 \end{pmatrix} = \begin{pmatrix} 0 & -\frac{L_{af}}{L} & 0 \\ 0 & 0 & -\frac{1}{J} \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} + \begin{pmatrix} \frac{1}{L}\left(u(t) - Rx_1\right) \\ -\frac{B}{J}x_2 + \frac{L_{af}}{J}x_1^3 - \frac{L_{af}}{L}\frac{x_2^2}{x_1} + \frac{u(t)}{L}\frac{x_2}{x_1} - \frac{R}{L}x_2 \\ -\frac{L_{af}}{L}\frac{x_2 x_3}{x_1} + \frac{u(t)}{L}\frac{x_3}{x_1} - \frac{R}{L}x_3 \end{pmatrix}. \quad (7)$$

The adaptation function

The observer presented in the previous section is completed by stating its adaptation function explicitly:

$$\mathcal{F}(\theta, \mathcal{I}_d) = \mu(\mathcal{I}_d)\,F_0(\theta) + \left(1 - \mu(\mathcal{I}_d)\right)\lambda(1 - \theta) \quad (8)$$

where

$$F_0(\theta) = \begin{cases} \frac{1}{\Delta T}\,\theta^2 & \text{for } \theta \le \theta_1 \\[2pt] \frac{1}{\Delta T}\,(\theta - 2\theta_1)^2 & \text{for } \theta > \theta_1 \end{cases}$$

and μ(I) = (1 + e^{−β(I−m)})^{−1} is a sigmoid function parametrized by β and m (cf. Figure 3).
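Equation (8) translates directly into code. This is a sketch; the numerical values of θ1, ΔT, λ, β and m below are arbitrary examples of the tuning parameters discussed next:

```python
import math

def make_adaptation_function(theta1=10.0, dT=1.0, lam=0.5, beta=50.0, m=0.2):
    """Adaptation function (8): a sigmoid mu blends the rise term F0 with decay to 1."""
    def F0(theta):
        # quadratic rise up to theta1, then mirrored so that 2*theta1 is never reached
        return theta**2 / dT if theta <= theta1 else (theta - 2 * theta1) ** 2 / dT

    def mu(innov):
        # sigmoid centered at m with slope beta
        return 1.0 / (1.0 + math.exp(-beta * (innov - m)))

    def F(theta, innov):
        # large innovation -> mu ~ 1 -> theta rises (high-gain mode);
        # small innovation -> mu ~ 0 -> theta decays towards 1 (EKF mode)
        return mu(innov) * F0(theta) + (1.0 - mu(innov)) * lam * (1.0 - theta)

    return F

F = make_adaptation_function()
```

One can check the qualitative properties of Lemma V on this F: near θ = 1 with small innovation it is essentially zero, a large innovation makes θ grow, and a small innovation with θ large drives θ back down.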
Parameter tuning

The adaptation procedure involves numerous tuning parameters, and the first impression is one of confusion in front of the choices to be made. It is in fact possible to tune all these parameters one by one, following a logical order, illustrated in Figure 4. The complete procedure is detailed in Section 4.2.3; we give an overview below.

The set of parameters splits into two groups: the parameters relative to the performance of each of the observer's two modes, and those relative to the adaptation procedure.
Figure 4: Tuning order of the parameters. Performance (non-adaptation) parameters: R, Q, θ1; adaptation-procedure parameters: β, λ, ΔT, d, m, δ. Bold: crucial parameters.
The whole interest of the adaptive approach we develop is to decouple the influence on observer performance of Q and R on one side, and of θ on the other. In a high-gain extended Kalman filter, the matrices Q and R lose their primary meaning because they are permanently altered by their multiplication by θ. When the observer is not in high-gain mode, these two matrices recover their classical meaning. They must therefore be tuned so that the measurement noise is filtered optimally, for which classical methods are available [58].

The same holds for θ, the high-gain parameter: only its convergence performance matters. θ is chosen as is usual for high-gain observers, taking care, however, to limit overshoots as much as possible.
Adaptation parameters

Among the adaptation parameters, only d, the length of the innovation window, and m, the second parameter of the sigmoid, need to be modified from one observer to another. The others can be fixed once and for all, whatever the process under study.

One must be aware of the influence of measurement noise on the computation of the innovation. Indeed, when the observer estimates the state of the system perfectly, the innovation computation reduces to the integration of the measurement noise alone. Consequently, if the adaptation procedure were triggered as soon as the innovation is nonzero, θ would always be large. The sigmoid is therefore shifted to the right by means of the parameter m. We propose to compute m from an estimate of the standard deviation of the measurement noise; an efficient computation method is given in Section 4.2.3.
If d is too small, the information contained in the innovation is drowned in the noise. If, on the contrary, d is too large, the computation time becomes prohibitive.
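As a rough sketch of the idea behind choosing m (our own approximation, not the formula of Section 4.2.3): for a perfect estimate, the innovation integrates only noise, so its expected level over a window of length d is about d·σ², and m can be placed above that noise floor:

```python
import numpy as np

def suggest_m(y_measured, d, safety=2.0):
    """Place the sigmoid shift m above the noise floor of the innovation.

    sigma is estimated from first differences of the measured output (for white
    noise on a slowly varying signal, std(diff)/sqrt(2) ~ sigma); the noise-only
    innovation level is then roughly d * sigma**2, and `safety` > 1 shifts m
    further to the right.
    """
    y = np.asarray(y_measured, dtype=float).ravel()
    sigma = np.std(np.diff(y)) / np.sqrt(2.0)
    return safety * d * sigma**2
```

The first-difference trick is an assumption (it requires the true output to vary slowly compared to the sampling rate); with a known noise level, σ can simply be set directly.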
Performance demonstration

Figures 5, 6 and 7 give an overview of the observer's performance on a simple scenario. These figures are commented in Section 4.2.4, where the results of a more complex scenario are also shown, together with a series of curves illustrating the effects of a poor tuning of the parameter m.

Note that the intended hybrid behavior is achieved: noise smoothing equivalent to that of an extended Kalman filter (dark blue curve) and convergence speed comparable to that of a high-gain extended Kalman filter (light blue curve).
Figure 5: Estimation of the rotation speed (legend: estimation via HGEKF, real signal, estimation via EKF, estimation via AEKF; horizontal axis: time).
Figure 6: Estimation of the load torque (real signal vs. estimates via HGEKF, EKF, and AEKF).

Figure 7: High gain and innovation (evolution of θ and of the innovation over time).

The second half of Chapter 4 is devoted to a detailed description of the implementation of the observer in a hard real-time environment. Such an operating system forces the user to program the algorithm so as to respect the computation-time constraints imposed on it, i.e. unfinished computations are interrupted. We demonstrate there that the observer can be used in such a constraining environment, on a real motor: it performs nicely at a sampling rate of 100 Hz.
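The hard real-time constraint can be illustrated with a minimal fixed-rate loop. This is a sketch, not the real-time implementation used in the thesis: here an overrun is merely counted, whereas a hard real-time kernel would interrupt the unfinished computation; `run_periodic` and its parameters are illustrative names.

```python
import time

def run_periodic(task, period_s, n_steps):
    """Run `task` at a fixed rate and count deadline misses: a step whose
    computation exceeds the sampling period is flagged as an overrun."""
    overruns = 0
    for _ in range(n_steps):
        start = time.perf_counter()
        task()
        elapsed = time.perf_counter() - start
        if elapsed > period_s:
            overruns += 1                    # deadline missed
        else:
            time.sleep(period_s - elapsed)   # wait for the next tick
    return overruns

# At a 100 Hz sampling rate, one observer step must fit within 10 ms.
overruns = run_periodic(lambda: None, period_s=0.01, n_steps=5)
```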
Extensions
To conclude, we propose two extensions of the main theorem.
The first one concerns systems with multiple outputs. There is no unique observability form for this type of system, so the observer has to be adapted to each new form encountered. We chose a generalization of system 1, for which we describe in detail the modifications to be made. These modifications are fully consistent with the original definition of the observer, in the sense that if the output reduces to a single variable, the configuration becomes the one given initially. We show in detail the points of the proof that lead to a modification of the algorithm, so as to inspire new structures for multiple-output systems.
The second extension deals with continuous-discrete systems. Such systems constitute a description of the process that is closer to reality, in the sense that the dynamics are described in continuous time while the measurements are modeled by a discrete process. The sensing devices are thus simulated realistically. We propose an adapted observer and show that one of the hypotheses on the adaptation function can be relaxed. This allows us to consider the use of algebraic functions instead of differential ones, thereby removing the issue of the delay caused by the rise time of a high-gain parameter with continuous dynamics. In order to carry out the convergence proof for this observer, we had to develop a proof of the existence of bounds for the Riccati equation, which was until now absent from the literature.
By way of conclusion, let us mention that this work has led to the writing of:
− a journal article, accepted for publication in Automatica [22],
− a book chapter, written on the occasion of the 2007 Grenoble summer school on automatic control [21],
− communications at three international conferences [23–25],
− a talk at a Linux Realtime workshop [26].
Contents

List of Figures xxvi
List of Tables xxvii
Nomenclature xxx

1 Introduction 1
1.1 Context 2
1.2 Motivations 4
1.3 Contributions 7
1.4 Dissertation Outline and Comments to the Reader 8

2 Observability and High-gain Observers 10
2.1 Systems and Notations 11
2.2 The Several Definitions of Observability 12
2.3 Observability Normal Forms 15
2.3.1 First case: ny > nu 15
2.3.2 Second case: ny ≤ nu 16
2.4 Single Output Normal Form 18
2.5 High-gain Observers 20
2.6 On Adaptive High-gain Observers 21
2.6.1 E. Bullinger and F. Allgöwer 23
2.6.2 L. Praly, P. Krishnamurthy and coworkers 24
2.6.3 Ahrens and Khalil 27
2.6.4 Adaptive Kalman Filters 29
2.6.5 Boutayeb, Darouach and coworkers 32
2.6.6 E. Busvelle and J-P. Gauthier 35

3 Adaptive High-gain Observer: Definition and Convergence 37
3.1 Systems Under Consideration 39
3.2 Observer Definition 39
3.3 Innovation 40
3.4 Main Result 44
3.5 Preparation for the Proof 44
3.6 Boundedness of the Riccati Matrix 47
3.7 Technical Lemmas 49
3.8 Proof of the Theorem 52
3.9 Conclusion 54

4 Illustrative Example and Hard Real-time Implementation 56
4.1 Modeling of the Series-connected DC Machine and Observability Normal Form 57
4.1.1 Mathematical Model 57
4.1.2 Observability Canonical Form 59
4.2 Simulation 60
4.2.1 Full Observer Definition 60
4.2.2 Implementation Considerations 61
4.2.3 Simulation Parameters and Observer Tuning 62
4.2.4 Simulation Results 70
4.3 Real-time Implementation 76
4.3.1 Software 76
4.3.2 Hardware 79
4.3.3 Modeling 80
4.3.4 Implementation Issues 84
4.3.5 Experimental Results 87
4.4 Conclusions 89

5 Complements 92
5.1 Multiple Inputs, Multiple Outputs Case 93
5.1.1 System Under Consideration 93
5.1.2 Definition of the Observer 94
5.1.3 Convergence and Proof 95
5.1.3.1 Lemma on Innovation 96
5.1.3.2 Preparation for the Proof 98
5.1.3.3 Intermediary Lemmas 100
5.1.3.4 Proof of the Theorem 101
5.2 Continuous-discrete Framework 103
5.2.1 System Definition 104
5.2.2 Observer Definition 105
5.2.3 Convergence Result 106
5.2.4 Innovation 107
5.2.5 Preparation for the Proof 108
5.2.6 Proof of the Theorem 110

6 Conclusion and Perspectives 115

Appendices

A Mathematics Reminder 121
A.1 Resolvent of a System 122
A.2 Weak-∗ Topology 123
A.3 Uniform Continuity of the Resolvent 125
A.4 Bounds on a Gram Matrix 128

B Proof of Lemmas 130
B.1 Bounds on the Riccati Equation 131
B.1.1 Part One: the Upper Bound 133
B.1.2 Part Two: the Lower Bound 142
B.2 Proofs of the Technical Lemmas 151

C Source Code for Realtime Implementation 153
C.1 Replacement Code for the File: rtai4 comedi datain.sci 154
C.2 Computational Function for the Simulation of the DC Machine 155
C.3 AEKF Computational Functions C Code 157
C.4 Innovation Computational Functions C Code 159
C.5 Ornstein-Uhlenbeck Process 162

References 165
List of Figures

1 Schematic control loop ix
2 DC motor: equivalent circuit xv
3 Graph of a sigmoid xvi
4 Ordering of the parameters xvii
5 Estimation of the rotation speed xviii
6 Estimation of the load torque xix
7 High gain and innovation xix
1.1 Control Loop 3
1.2 Estimation Demonstration 4
1.3 Estimation with noise 8
2.1 Observability Equivalence Diagram 17
2.2 A Switching Strategy 29
3.1 The Computation of Innovation 41
4.1 Series-connected DC Motor: equivalent circuit representation 58
4.2 Observer Structure 61
4.3 Observer Main Equations 62
4.4 Computation of Innovation 63
4.5 Simulation Parameters 63
4.6 Observer Parameters 64
4.7 Tuning Q and R 66
4.8 Scenario 1 66
4.9 Tuning of θ 67
4.10 Graph of the Sigmoid 69
4.11 Simulation Scenarios 70
4.12 Output Signal 71
4.13 Scenario 2: Speed Estimation 72
4.14 Scenario 2: Torque Estimation 72
4.15 Scenario 3: Speed Estimation 73
4.16 Scenario 3: Torque Estimation 74
4.17 Scenario 2: Innovation 74
4.18 Scenario 3: Innovation 75
4.19 Effect of Parameter m2: Estimation 75
4.20 Effect of Parameter m2: Innovation 76
4.21 Graphical Implementation of a real-time task 78
4.22 Compilation of a real-time task 80
4.23 Xrtailab 81
4.24 Connections Diagram 82
4.25 The Testbed 82
4.26 AEKF: Implementation Diagram 87
4.27 Luenberger Observer in Real-time 90
4.28 Kalman filter in Real-time 90
4.29 AEKF in Real-time 91
4.30 High-gain Parameter in Real-time 91
5.1 A Subdivision 113
C.1 Colored Noise Simulation 163
List of Tables

4.1 Set of Parameters 70
4.2 Parameters for Real-time Experiment 89
C.1 Arguments of the DC Simulation 155
C.2 Arguments of the Main Function 157
C.3 Arguments of the Innovation Function 160
Nomenclature

χ a compact subset of the state space
δt sampling time
ẋ(t) the time derivative of the time variable x(t): dx/dt(t)
F adaptation function
Id(t) innovation for a window of length d, at time t (continuous time framework)
Id,k innovation for a window of length dδt, at epoch k (discrete time framework)
θ, θ(t), θk high-gain parameter
A′ transpose of the matrix A
diag(v) a dim(v) × dim(v) matrix, such that (diag(v))i,j = δij vi
L1b(Uadm) the set of integrable (thus measurable), bounded functions having their values in Uadm
n dimension of the state space
nu dimension of the input space
ny dimension of the output space
P(t) S⁻¹(t)
S(t) Riccati matrix
t time variable
u(t) input (control) variables
X an n-dimensional analytic differentiable manifold
x(t) state variables
y(t) output variables
z(t) estimated state (continuous time framework)
zk estimated state (discrete time framework)
AEKF adaptive high-gain extended Kalman filter
EKF extended Kalman filter
HGEKF high-gain extended Kalman filter
MIMO multiple input multiple output
MISO multiple input single output
SISO single input single output
w.r.t. with respect to
Chapter 1

Introduction

Juggling. Balls fly in thin air, tracing parabolas. The juggler sees, and reacts. Eyes move, muscles follow, and the improbable act of using only two hands to keep three, four, five or even more balls in the air continues, non-stop.
It goes without saying that the juggler has both eyes open. Why, though? With one eye closed, juggling becomes harder, as depth perception becomes difficult and the field of view more limited. Juggling with one eye closed can be done, but only at the cost of training and perseverance.
What about juggling blindfolded? Only the best jugglers can do this, by inferring the position of each ball from their hands and the accuracy of their throws. This is because they know what the parabola should look like.
When we rely on partial measurements to reconstruct the state of a physical process, we are like blindfolded jugglers. We only have limited information to tell us the difference between what we think the state is and what it actually is. In process control, the tool dedicated to the state reconstruction task is known as an observer.
An observer is a mathematical algorithm that estimates the state of a system. What makes the observer special is that it does not guess. It infers, based on a prediction of what it expects to measure and a correction driven by the difference between the predicted and the measured.
The concept of an observer is introduced in this chapter with little mathematical formalism. We provide an idea of the major issues of the field and put forward the motivations underlying the present work.
1.1 Context

We can model any physical process by considering it as a system. In our framework, a set of relevant variables describes the evolution of the state of the system with time. These are called the state variables. The system interacts with the outside world in three different ways:
− input, or control, variables are quantities that have an effect on the system behavior and that can be set externally; they are denoted u(t),
− output variables, or measurements, are quantities that are monitored; generally they are a subset or a transformation of the state variables, and we denote them by y(t),
− perturbations are variables that have an effect on the system behavior and that cannot be controlled; most of the time they cannot be measured 1.
The state variables are represented by a multidimensional vector, denoted x(t). The evolution of x(t) with time is accounted for by an ordinary differential equation 2:

    dx(t)/dt = f(x(t), u(t), t).

The relation between state and output variables, i.e. the measurement step, is described by a map:

    y(t) = h(x(t), u(t), t).

A system is therefore defined as a set of two equations of the form:

    dx(t)/dt = f(x(t), u(t), t)
    y(t) = h(x(t), u(t), t)
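Such a system can be encoded directly as the pair (f, h). The sketch below, with an illustrative falling-ball model in the spirit of the juggling example, simulates the state equation with a forward Euler scheme; the function names and the model are assumptions for illustration, not taken from the thesis.

```python
import numpy as np

def f(x, u, t):
    """Toy dynamics: a ball subject to gravity and a vertical control
    acceleration u; the state is [height, vertical speed]."""
    g = 9.81
    return np.array([x[1], -g + u])

def h(x, u, t):
    """Measurement step: only the height is observed."""
    return x[0]

def simulate(x0, u_of_t, t_end, dt=1e-3):
    """Forward Euler integration of dx/dt = f(x, u, t), recording y(t)."""
    x, t, ys = np.array(x0, dtype=float), 0.0, []
    while t < t_end:
        x = x + dt * f(x, u_of_t(t), t)
        t += dt
        ys.append(h(x, u_of_t(t), t))
    return x, ys

# Free fall (u = 0) from a height of 1 m during 0.1 s.
x_final, ys = simulate([1.0, 0.0], lambda t: 0.0, t_end=0.1)
```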
The state estimate rendered by the observer can be used for monitoring purposes, by a control algorithm as schematized in Figure 1.1, or processed off-line (e.g. in prototyping assessment as in [21], section 3.6).
For example, in the juggling experiment, the state variables are the 3D positions and speeds of the balls. The input variables are the impulses the hands apply to the balls.
With eyes wide open, the output variables are the balls' positions. In the one-eyed juggling experiment, the output variables are an imperfect knowledge of the balls' positions, e.g. because of the hindered depth perception. In the blindfolded juggling experiment, tactile signals are the output variables.
In all three cases the model is the same: that of a falling apple.
1 From an observer's point of view, measured perturbations and controls are both inputs. The controller's point of view is different, since it uses the controls to stabilize the system and reject perturbations.
2 We do not consider either partial differential equations or algebraic differential equations.
Figure 1.1: A control loop. (Block diagram: the set points and the controller produce the input u(t); the system, with state x(t), delivers the output y(t), which feeds the observer producing the estimated state z(t).)

The study of observers is divided into three subproblems:
1. the observability problem is the study of a given mathematical model in order to determine to which extent it can be used to estimate the state variables,
2. the convergence problem focuses on the observer algorithm; the diminution of the error between the real and estimated states is studied and the convergence to zero assessed (see Figure 1.2),
3. the loop closure problem addresses the stability of closed control loops, when the state estimated by an observer is used by a controller 3 (see Figure 1.1).
The juggler solves the first two problems when he is capable of catching the balls with one or both eyes closed. He solves the third problem when he goes on juggling.
For systems that are described by linear equations, the situation is theoretically well known and answers to all those problems have already been provided (see for example [74]). We therefore concentrate on the nonlinear case.
A large variety of observers is available for nonlinear systems, derived from practical and/or theoretical considerations:
− extended observers are adaptations of classic linear observers to the nonlinear case (e.g. [58]),
− adaptive observers estimate both the state of the system and some of the model parameters; a linear model can become nonlinear in this configuration (e.g. [98]),
− moving horizon observers see the estimation procedure as an optimization problem (e.g. [12]),
3 A controller calculates input values that stabilize the physical process around a user-defined set point or trajectory. The model used in the observer and the possible model used in the controller may not be the same.
− interval observers address models with uncertain parameters. Unlike adaptive observers, they do not estimate the model parameters; they use the fact that any parameter p lives in an interval of the form [pmin, pmax] (e.g. [91, 105]).
The present work focuses on high-gain observers, which are a refinement of extended observers, allowing us to prove that the estimation error converges to zero in a global sense. The design of such observers relies on a general theory of observability for nonlinear systems [57]. The convergence is proven to be exponential: the estimation error is bounded from above by a decreasing exponential of time, whose rate of convergence can be set by the user. The loop closure problem is studied in [57], Chapter 7, and a weak separation principle is stated in [113] (see also [21]).
Our prototype observer is the Kalman filter. In order to further develop our motivations, we provide a review of high-gain observer algorithms in the next section. A complete discussion is the object of Section 2.5.
Figure 1.2: Estimation of a non-measured variable: the initial estimated state is wrong. With time, the observer reduces the estimation error. (The plot compares the real signal with the estimates via an extended Kalman filter and via a high-gain extended Kalman filter, both starting from a wrong initial guess.)
1.2 Motivations

The extended Kalman filter is a practical answer to the observation problem for nonlinear systems. Although global convergence is not theoretically proven, this observer works well in practice. High-gain observers are based on extended observers, and result from the grouping of two main ingredients:
1. the use of a special representation of the nonlinear system, provided by the observability theory (see Chapter 2), and
2. the use of a variant of extended observers.
A story is best told when started from the beginning. Thus we begin by introducing the Kalman filter in the linear case. A linear system is given by:

    dx(t)/dt = Ax(t) + Bu(t),
    y(t) = Cx(t),

where A, B and C are matrices (having the appropriate dimensions) that may or may not depend on time. The two archetypal observers for such a system were proposed in the 1960s by D. G. Luenberger [86], and by R. E. Kalman and B. S. Bucy [75, 76]. They are known as the Luenberger observer and the Kalman-Bucy filter respectively.
The leading mechanism in those two algorithms is the prediction-correction scheme. The new estimated state is obtained by means of:
− a prediction based on the model and the old estimated state, and
− the correction of the obtained values by the measurement error weighted by a gain matrix.
We denote the estimated state by z(t). The corresponding equation for the estimated state is:

    dz(t)/dt = Az(t) + Bu(t) − K (Cz(t) − y(t)).
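A minimal numerical sketch of this prediction-correction mechanism, on an illustrative double-integrator system (the matrices and the gain K are assumptions, chosen so that A − KC is stable):

```python
import numpy as np

# Double integrator: x = [position, velocity], only the position is measured.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
K = np.array([[3.0], [2.0]])   # the eigenvalues of A - KC are -1 and -2

def step(x, z, u, dt):
    """One Euler step of the system and of the observer: the estimate is
    corrected by the output error Cz - y, weighted by the gain K."""
    dx = A @ x + B @ u
    dz = A @ z + B @ u - K @ (C @ z - C @ x)
    return x + dt * dx, z + dt * dz

x = np.array([[1.0], [0.0]])   # true state
z = np.array([[0.0], [0.0]])   # wrong initial estimate
u = np.array([[0.0]])
for _ in range(20000):          # 20 s of simulated time
    x, z = step(x, z, u, dt=1e-3)
error = np.linalg.norm(x - z)   # the estimation error has decayed
```

Because the eigenvalues of A − KC have strictly negative real parts, the estimation error decays exponentially regardless of the wrong initial guess.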
In the Luenberger observer, the matrix K is computed once and for all. The real parts of all the eigenvalues of (A − KC) have to be strictly negative.
In the Kalman filter, the gain matrix is defined as K(t) = S⁻¹(t)C′R⁻¹, where S is the solution of the differential equation:

    dS(t)/dt = −A′S(t) − S(t)A − S(t)QS(t) + C′R⁻¹C.
This is a matrix Riccati equation, referred to simply as the Riccati equation. The matrices A, B and C may be time dependent (i.e. A(t), B(t) and C(t)); otherwise, the Kalman filter is equivalent to the Luenberger observer.
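The Riccati equation can be integrated numerically alongside the filter. The sketch below uses an illustrative double-integrator system with unit weights (not an example from the thesis) and lets the gain K(t) = S⁻¹(t)C′R⁻¹ settle to its stationary value:

```python
import numpy as np

A = np.array([[0.0, 1.0], [0.0, 0.0]])
C = np.array([[1.0, 0.0]])
Q = np.eye(2)                 # weighting matrices, symmetric positive definite
R = np.array([[1.0]])
Rinv = np.linalg.inv(R)

def riccati_rhs(S):
    """dS/dt = -A'S - SA - SQS + C'R^(-1)C, as in the Kalman filter."""
    return -A.T @ S - S @ A - S @ Q @ S + C.T @ Rinv @ C

S = np.eye(2)                 # any symmetric positive definite initialization
dt = 1e-3
for _ in range(20000):        # forward Euler integration until S settles
    S = S + dt * riccati_rhs(S)

K = np.linalg.inv(S) @ C.T @ Rinv   # gain K(t) = S^(-1)(t) C' R^(-1)
```

For this particular system the stationary gain works out analytically to (√3, 1)′, which the integration recovers.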
The Kalman filter is the solution to an optimization problem in which the matrices Q and R play the role of weighting coefficients (details can be found in [75]). These matrices must be symmetric positive definite. According to this definition, the Kalman filter is an optimal solution to the observation problem. We will see below that the Kalman filter has a stochastic interpretation that gives a meaning to these Q and R matrices.
Those two observers are well-known algorithms whose convergence can be proven. However, when the matrices A, B and C are time dependent, the Kalman filter has to be used, since the gain matrix is constantly being updated.
Observability in the linear case is characterized by a simple criterion. The loop closure problem has an elegant solution called the separation principle: controller and observer can be designed independently and the overall loop remains stable. Interesting and detailed exposés on the Kalman filter can be found in [43, 45, 58].
Not all processes can be represented by linear relationships. Let us consider nonlinear systems of the form:

    dx(t)/dt = f(x(t), u(t), t)
    y(t) = h(x(t), u(t), t),

where f and h are nonlinear maps. The prediction-correction strategy is applied as in the linear case. Since the correction gain is computed linearly via matrices, a linearization of the system is used.
This mathematical operation has to be done at some point in the state space. Since the real state of the system is unknown, it is difficult to pick such a point. The estimated state is therefore used. Consequently, the correction gain matrix has to be constantly updated.
The Kalman filter equations are considered 4. Let us denote:
− z, as the estimated state,
− A, as the partial derivative of f with respect to x, at the point (z, u, t), and
− C, as the partial derivative of h with respect to x, at the point (z, u, t).
This observer is then defined as:

    dz(t)/dt = f(z, u, t) − S⁻¹C′R⁻¹ (Cz − y)
    dS/dt = −A′S − SA − SQS + C′R⁻¹C.

This algorithm is called the extended Kalman filter.
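A compact sketch of one integration step of these equations follows. The pendulum model and all function names are illustrative assumptions; the point is that the linearizations A and C are re-evaluated at the current estimate z, as described above.

```python
import numpy as np

def ekf_step(z, S, y, f, A_of, C_of, Q, Rinv, dt):
    """One Euler step of the continuous-time extended Kalman filter:
    prediction by the model f plus correction by the output error,
    with the Riccati equation driven by the linearizations at z."""
    A = A_of(z)
    C = C_of(z)
    dz = f(z) - np.linalg.inv(S) @ C.T @ Rinv @ (C @ z - y)
    dS = -A.T @ S - S @ A - S @ Q @ S + C.T @ Rinv @ C
    return z + dt * dz, S + dt * dS

# Illustrative nonlinear system: a pendulum whose angle is measured.
f_ = lambda z: np.array([[z[1, 0]], [-np.sin(z[0, 0])]])
A_ = lambda z: np.array([[0.0, 1.0], [-np.cos(z[0, 0]), 0.0]])
C_ = lambda z: np.array([[1.0, 0.0]])
Q, Rinv = np.eye(2), np.array([[1.0]])

x = np.array([[0.5], [0.0]])   # true state
z = np.array([[0.0], [0.0]])   # wrong initial estimate
S = np.eye(2)
dt = 1e-3
for _ in range(20000):
    y = C_(x) @ x               # noise-free measurement of the angle
    z, S = ekf_step(z, S, y, f_, A_, C_, Q, Rinv, dt)
    x = x + dt * f_(x)
err = float(np.linalg.norm(x - z))
```

With noise-free measurements and a small initial error, the estimate tracks the true state; the lack of a global convergence proof discussed next concerns large initial errors.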
The extended Kalman filter is an algorithm widely used in the engineering sciences [58]. It works well in practice, which explains its popularity. There is, however, a lack of theoretical justification for its effectiveness. Indeed, except for small initial errors, there is no analytical proof of convergence for this observer. This is due to the linearization, which is performed along the estimated trajectory. The resulting matrices used in the Riccati equation are not accurate outside a neighborhood of the real trajectory. The system is poorly approximated and convergence cannot be proved.

A modification of this observer, the high-gain extended Kalman filter, solves the problem. Provided that the system is in a representation specific to the observability property^5, the observer converges exponentially. In other words, the estimation error is bounded above by a decreasing exponential function of time.
The main results of observability theory for nonlinear systems are introduced in Chapter 2. The definitions of several high-gain observers are also included there.

Corruption of signals by noise is a major issue in engineering. Signals processed by an observer are derived from sensors, which inherently implies the presence of noise. This influence appears both on the state and on the output variables. In the stochastic setting, systems are then represented by equations of the form:

\[
\begin{cases}
dX(t) = f(X(t), u(t), t)\,dt + Q^{\frac{1}{2}}\,dW(t) \\
dY(t) = h(X(t), u(t), t)\,dt + R^{\frac{1}{2}}\,dV(t)
\end{cases}
\]
^4 The Luenberger observer can be used for nonlinear systems, but necessitates a special representation of the nonlinear system that will be introduced in the next chapter.
^5 This observability property is related to the first problem, i.e. the observability problem.
where:
− X(0) is a random variable,
− X(t) and Y(t) are random processes,
− V(t) and W(t) are two independent Wiener processes, also independent from X(0) (refer to Appendix C.5).
In this context, Q is the covariance matrix of the state noise, and R is the covariance matrix of the measurement noise.
We consider again the linear case, together with two assumptions:
1. X(0) is a Gaussian random variable, and
2. the state and output noises are Gaussian white noises.
In this setting the Kalman filter is an estimator of the conditional expectation of the state given the measurements available so far^6. When the Q and R matrices of the observer defined above are set to the Q and R covariance matrices of the system, the noise is optimally filtered (refer to [43, 45]).

In practice, noise characteristics are not precisely known. Therefore the matrices Q and R of the observer are regarded as tuning parameters. They are adjusted in simulation.
In the nonlinear case, the analysis in the stochastic setting consists of the computation of the conditional density of the random process X(t). A solution to this problem is given by an equation known as the Duncan-Mortensen-Zakaï equation [80, 108]. It is rather complicated and cannot be solved analytically; numerically, several approximations are used, one of which is called particle filtering (see e.g. [21, 42] for details). If we set aside the stochastic part of the problem, we obtain an observer with an elegant formulation: the extended Kalman filter. Let us focus on its stochastic properties.

As demonstrated in [101], the extended Kalman filter displays excellent noise-rejection properties. However, proof of the convergence of the estimation error can be obtained only when the estimated state comes sufficiently close to the real state.

On the other hand, the high-gain extended Kalman filter is proven to converge globally (in a sense explained in subsequent chapters). However, it behaves poorly from the noise-rejection point of view. Indeed, it has the tendency to amplify noise, resulting in a useless reconstructed signal. This problem is illustrated^7 in Figure 1.3.
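This trade-off can be reproduced on a toy problem. The sketch below uses a Luenberger-style observer for a harmonic oscillator with the classical high-gain scaling (2θ, θ²); the system, noise level and gains are our own illustrative choices, not the observers of the figure. A large θ gives a fast transient but a noisy steady state:

```python
import numpy as np

rng = np.random.default_rng(0)

def run_observer(theta, steps=20000, dt=1e-3):
    """Luenberger-style observer for a harmonic oscillator, with the
    classical high-gain scaling K_theta = (2*theta, theta**2)."""
    x = np.array([1.0, 0.0])        # true state
    z = np.zeros(2)                 # estimate, wrong initial condition
    errs = np.empty(steps)
    for k in range(steps):
        x = x + dt * np.array([x[1], -x[0]])        # plant: x1'' = -x1
        y = x[0] + 0.05 * rng.standard_normal()     # noisy measurement
        dz = np.array([z[1] + 2 * theta * (y - z[0]),
                       -z[0] + theta**2 * (y - z[0])])
        z = z + dt * dz
        errs[k] = np.linalg.norm(x - z)
    return errs

low, high = run_observer(theta=2.0), run_observer(theta=30.0)
# Around t = 1 s the high-gain observer has already converged while the
# low-gain one has not; at steady state the situation is reversed:
print(high[900:1100].mean() < low[900:1100].mean())   # True
print(high[-2000:].mean() > low[-2000:].mean())       # True
```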
1.3 Contributions

This dissertation concentrates on the fusion of the extended Kalman filter and the high-gain extended Kalman filter by means of an adaptation strategy. Our purpose is to merge the advantages of both structures:
^6 z(t) = E[x(t) | F_t], where F_t = σ(y(s), s ∈ [0, t]), i.e. the σ-algebra generated by the output variables for s ∈ [0, t].
^7 These two curves are obtained with the same observers as in Figure 1.2; the values of the tuning parameters are unchanged. The level of noise added to the output signal is the same in the two simulations.
[Figure: two plots of a signal versus time. Top: real signal and estimation via an extended Kalman filter. Bottom: real signal and estimation via a high-gain extended Kalman filter.]

Figure 1.3: Estimation with noise.
− the noise-smoothing properties of the extended Kalman filter when the estimated state is sufficiently close to the real state;
− the global convergence property of the high-gain extended Kalman filter when the estimated state is far from the real state.

We propose an adaptation strategy in order to navigate between high-gain and non-high-gain modes. The adaptation requires knowledge of a quantity that reflects the quality of the estimation. Such a quality measurement is proposed, and its usage is justified by an important lemma, i.e. Lemma 33 in Chapter 3. The convergence of the resulting observer is analytically proven. Extensions to the initial result are proposed.

Practical aspects of the implementation of the algorithm are also considered. The use of an adaptation strategy implies the introduction of several new parameters. Guidelines for the choice of those parameters are given and a methodology is proposed.
1.4 Dissertation Outline and Comments to the Reader

The remainder of the manuscript is organized as follows:
− Chapter 2 contains a summary of the theory of observability in the deterministic setting. Observability normal forms are introduced and high-gain observers defined. A review of nonlinear observers having adaptive strategies for their correction gain closes the chapter.
− The adaptive high-gain extended Kalman filter is defined in Chapter 3. We define our strategy and justify it with an important lemma. The complete proof of convergence is
detailed here.
− Chapter 4 is dedicated to the implementation of the observer. An example is used to progress through the different steps. An experiment on a real process illustrates the behavior of the observer in a real-time environment.
− Finally, Chapter 5 includes extensions to the original result.
Complementary material is included in the appendices.
Readers interested in a practical approach to the adaptive high-gain extended Kalman filter...
...should read Chapter 4.
...should take a look at Section 3.1 (definition of the MISO normal form) and at Definition 30 (of the observer). If they are not familiar with high-gain observers, Definition 15 (high-gain EKF) may be of help.
...will find the main theoretical results in Lemma 33 (lemma on innovation) and Theorem 36 (convergence of the observer).
Readers interested only in the definition of the observer and its proof of convergence...
...can read Chapter 3 directly.
...shall read Chapter 2 in order to understand the motivations for the use of the observability form.

Readers who want to explore a bit further than the initial continuous MISO normal form...
...should go to Chapter 5. Some of the information in this chapter is also useful for standard high-gain EKFs: dealing with multiple-output systems, understanding the proof of the properties of the continuous-discrete Riccati equation, etc.
Chapter 2

Observability and High-gain Observers

Contents
2.1 Systems and Notations
2.2 The Several Definitions of Observability
2.3 Observability Normal Forms
    2.3.1 First case: ny > nu
    2.3.2 Second case: ny ≤ nu
2.4 Single Output Normal Form
2.5 High-gain Observers
2.6 On Adaptive High-gain Observers
    2.6.1 E. Bullinger and F. Allgöwer
    2.6.2 L. Praly, P. Krishnamurthy and coworkers
    2.6.3 Ahrens and Khalil
    2.6.4 Adaptive Kalman Filters
    2.6.5 Boutayeb, Darouach and coworkers
    2.6.6 E. Busvelle and J-P. Gauthier
The work presented in this dissertation follows the framework of the theory of deterministic observation developed by J-P. Gauthier, I. Kupka, and coworkers. This theory has a long history, the beginnings of which can be found in the articles of R. Hermann and A. J. Krener [65], and of H. J. Sussmann [110]. In the book [57], which is in itself a summary of several papers [53–56, 70], J-P. Gauthier and I. Kupka set out well-established definitions of observability and several important subsequent theorems.

An important result of the theory is that there exist representations of nonlinear systems that characterize observability (these are called observability normal forms in the literature). The often-referenced article [52] by J-P. Gauthier and G. Bornard contains early results.

The construction of high-gain observers, either of the Luenberger (J-P. Gauthier, H. Hammouri and S. Othman [54]) or of the Kalman (F. Deza, E. Busvelle et al. [47]) style, rests upon this theory: the normal forms are crucial in order to establish the convergence of such observers. Here convergence means that the estimation error decreases to zero. As we will see in Chapter 3, and in Theorems 14, 16 and 29, the estimation error decays at an exponential rate; such observers are also called exponential observers.

In the present chapter, the main concepts of observability theory in the deterministic setting are presented, together with more recent results such as those of E. Busvelle and J-P. Gauthier [38–40].

The main contribution of this work is the construction, and the proof of convergence, of a high-gain observer algorithm whose high-gain parameter is time varying. Hence a review of such adaptive-gain observers is proposed: F. Allgöwer and E. Bullinger (1997), M. Boutayeb et al. (1999), E. Busvelle and J-P. Gauthier (2002), L. Praly et al. (2004 and 2009) and a recent paper by H. K. Khalil (2009).
2.1 Systems and Notations

A system in the state-space representation is composed of two time-dependent equations^1:

\[
(\Sigma)\quad
\begin{cases}
\dfrac{dx(t)}{dt} = f(x(t), u(t), t) \\[4pt]
y(t) = h(x(t), u(t), t) \\
x(0) = x_0
\end{cases}
\]
where
− x(t) denotes the state of the system, belonging to R^n or, more generally, to an n-dimensional analytic differentiable manifold X,
− u(t) is the control (or input) variable, with u(t) ∈ U_adm ⊂ R^{nu},
− y(t) denotes the measurements (or outputs) and takes values in a subset of R^{ny},
− f is a u-parameterized smooth nonlinear vector field, and
− the observation mapping h : X × U_adm → R^{ny} is considered to be smooth and possibly nonlinear.
^1 Later on, the time dependency will be omitted for notational simplicity.
The inputs are functions of time defined on open intervals of the form [0, T[ (with the possibility that T = +∞). The functions u(.) are assumed to be measurable and bounded almost everywhere on any compact sub-interval of [0, T[. The corresponding function set is L^∞(U_adm).

The outputs are also functions of time, defined on open intervals of the form [0, T(u)[. This notation takes into account that, for a given input function defined for a maximum time T, the system might become unstable (i.e. explode toward infinity). The explosion time is likely to be less than T; therefore we have T(u) ≤ T. Output functions are also measurable and bounded almost everywhere on any compact sub-interval of [0, T(u)[. The corresponding function set is L^∞(R^{ny}).

The set of all systems of the form (Σ) is denoted S = {Σ = (f, h)}. The genericity (or non-genericity) property of observable systems is considered with respect to the set S.

The topologies associated with those sets are:
− the C^∞ Whitney topology for the set S (see, e.g., [66])^2. Two important features of that topology are
  1. it is not metrizable and,
  2. it has the Baire property^3,
− either the topology of uniform convergence or the weak-∗ topology for the sets L^∞(U_adm) and L^∞(R^{ny}),
− the topology of the Euclidean norm when dealing with subspaces of R^q, q = n, nu or ny.
2.2 The Several Definitions of Observability

Observability is the notion that expresses the ability of a system to permit the reconstruction of the full state vector from the knowledge of the input and output variables. In other words: considering any input function u(.), can any two distinct initial states x_0^1, x_0^2 be distinguished from one another?

Definition 1
− The state-output mapping of the system (Σ) is the application:

\[
\begin{aligned}
P_{X\Sigma,u} : X &\to L^\infty(\mathbb{R}^{n_y}) \\
x_0 &\mapsto y(.)
\end{aligned}
\]

− A system (Σ) is uniformly observable (or just observable) w.r.t. a class C of inputs if, for each u(.) ∈ C, the associated state-output mapping is injective.
^2 A basic neighborhood of a system Σ = (f, h) in the C^j Whitney topology is determined by a set of functions ε(z) > 0, and formed by the systems Σ̃ = (f̃, h̃) ∈ S such that the derivatives, up to order j, of (f − f̃, h − h̃) w.r.t. all the variables have their norm at the point z = (x, u) less than ε(z).
^3 Baire property: a countable intersection of open dense subsets is dense.
This definition appears as the most natural one^4, but, as injectivity is not a stable property, observability is difficult to manipulate from a topological point of view^5. In order to render the notion of observability more tractable, Definition 1 is modified by considering the first-order approximation of the state-output mapping. Let us begin with the definition of the first-order approximation of a system:
Definition 2
− The first (state) variation of (Σ) (or lift of (Σ) on TX) is given by:

\[
(T_X\Sigma)\quad
\begin{cases}
\dfrac{dx(t)}{dt} = f(x, u) \\[4pt]
\dfrac{d\xi(t)}{dt} = D_x f(x, u)\,\xi \\[4pt]
\hat y = d_x h(x, u)\,\xi
\end{cases}
\tag{2.1}
\]

where
  – (x, ξ) ∈ TX (or R^n × R^n) is the state of (T_XΣ),
  – d_x h is the differential of h w.r.t. x, and
  – D_x f is the tangent mapping to f (d_x h and D_x f are represented by the Jacobian matrices of h and f w.r.t. x).
− The state-output mapping of T_XΣ is denoted P_{T_XΣ,u}. This mapping is in fact the differential of P_{XΣ,u} w.r.t. x_0 (i.e. its first-order approximation, TP_{XΣ,u}|_{x_0}).
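Numerically, the lifted system (2.1) is just the original dynamics integrated together with its variational equation. The sketch below (the pendulum-like system and the initial data are illustrative choices of ours) propagates a tangent vector ξ and records the lifted output ŷ = d_x h ξ; injectivity of the tangent mapping means that no nonzero ξ keeps ŷ identically zero:

```python
import numpy as np

# Illustrative system: f(x) = (x2, -sin(x1)), h(x) = x1 (no control).
f   = lambda x: np.array([x[1], -np.sin(x[0])])
Dxf = lambda x: np.array([[0.0, 1.0], [-np.cos(x[0]), 0.0]])
dxh = lambda x: np.array([1.0, 0.0])

def lift_trajectory(x0, xi0, dt=1e-3, steps=3000):
    """Integrate the lift (T_X Sigma) along the trajectory from x0 and
    record the output of the lifted system, yhat = dxh(x) . xi."""
    x, xi, yhat = x0.copy(), xi0.copy(), []
    for _ in range(steps):
        yhat.append(dxh(x) @ xi)
        xi = xi + dt * (Dxf(x) @ xi)   # variational equation
        x = x + dt * f(x)              # original dynamics
    return np.array(yhat)

# A perturbation of the (unmeasured) velocity eventually shows up in yhat:
yhat = lift_trajectory(np.array([0.5, 0.0]), np.array([0.0, 1.0]))
print(abs(yhat[0]))            # 0.0: invisible at time 0
print(abs(yhat).max() > 0.1)   # True: visible along the trajectory
```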
The second part of Definition 1 is adapted to this new state-output mapping in a very natural way.

Definition 3
The system (Σ) is said to be uniformly infinitesimally observable^6 w.r.t. a class C of inputs if, for each u(.) ∈ C and each x_0 ∈ X, all the (x_0-parameterized) tangent mappings TP_{XΣ,u}|_{x_0} are injective.

Since the state-output mapping considered in this definition is linear, the injectivity property has been topologically stabilized. Finally, a third definition of observability has been proposed, using the notion of k-jets^7.
^4 An equivalent definition, based on the notion of indistinguishability, can be found in [19] (Definitions 2 and 3).
^5 E.g. (x ↦ x^3) is injective, but for all ε > 0, (x ↦ x^3 − εx) isn't.
^6 Infinitesimal observability can also be considered "at a point (u, x) ∈ L^∞ × X", or only "at a point u ∈ L^∞".
^7 The k-jet j^k u of a smooth function u at the point t = 0 is defined as
   j^k u = (u(0), u̇(0), ..., u^{(k−1)}(0)).
Then, for a smooth function u and for each x_0 ∈ X, the k-jet j^k y = (y(0), ẏ(0), ..., y^{(k−1)}(0)) is well defined: this is the k-jet extension of the state-output mapping.
See [9] for details. The book is, though, not so easy to find; R. Abraham's webpage can be of help.
Definition 4
− (Σ) is said to be differentially observable of order k if, for all j^k u, the extension-to-k-jets mapping

\[
\begin{aligned}
\Phi^\Sigma_k : X &\to \mathbb{R}^{k n_y} \\
x_0 &\mapsto j^k y
\end{aligned}
\]

is injective.
− (Σ) is said to be strongly differentially observable of order k if, for all j^k u, the extension-to-k-jets mapping

\[
\begin{aligned}
\Phi^\Sigma_{k, j^k u} : X &\to \mathbb{R}^{k n_y} \\
x_0 &\mapsto j^k y
\end{aligned}
\]

is an injective immersion^8.
From Definition 4, strong differential observability implies differential observability. The first component of the application Φ^Σ_{k, j^k u} corresponds to the state-output mapping (i.e. Definition 1). When the control variable belongs to C^∞, (strong) differential observability implies C^∞ observability. It may seem that little is gained by adding those extra definitions. In fact, the notion of differential observability can be checked in a very practical way, as explained at the end of this section.

In order to prove differential observability, we need the controls to be differentiable sufficiently many times so that the required differentiations can be performed. But since in practice the control variable is not likely to be differentiable, we ultimately aim at L^∞ observability. We now consider the following theorem:
Theorem 5 ([57], 4.4.6 page 56)
For an analytic system (Σ) (i.e. when f and h are analytic, or C^ω, functions) the following properties are equivalent:
1. (Σ) is observable for all C^ω inputs,
2. (Σ) is observable for all L^∞ inputs.

This means that, for an analytic system, both strong differential observability and differential observability imply L^∞ observability (i.e. observability for the class of L^∞(U_adm) inputs).

Another consequence of the theory (as explained in [40, 57], Chapters 3.1 and 4.4) is that, for analytic systems, uniform infinitesimal observability implies observability of the systems (Σ) restricted to small open subsets of X, the union of which is dense in X.
As we will see with the theorems of the next paragraph, although observability or uniform observability is not an easy property to establish in itself, a good method is to find a coordinate transformation that puts the system into an observability form.

Another method comes from the definition of differential observability:

^8 Immersion: all the tangent mappings T_{x_0}Φ^Σ_{k, j^k u} to the map Φ^Σ_{k, j^k u} have full rank n at each point.
− Consider that the output y(t) is known for all times t ≥ 0.
− Compute the successive time derivatives of the outputs until there exists a k > 0 such that the state x(t) can be uniquely computed, for all times, from the equations for (ẏ(t), ÿ(t), ...), where y(t) and u(t) are known time-varying parameters.
− We have then found a k-jet extension of (Σ) that is injective, and therefore (Σ) is at least differentially observable.
− Assuming that (Σ) is analytic, it is then L^∞ observable.

An illustration of this method is given in Chapter 4, Section 4.1.2. The process under consideration is a series-connected DC machine (see also [21], part 3.5).
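The derivative-computing recipe above can be automated with a computer algebra system. The toy system below is our own illustration (not the DC machine of Chapter 4):

```python
import sympy as sp

# Illustrative single-output system: x1' = x2, x2' = -x1**3, y = x1.
x1, x2, y0, y1 = sp.symbols('x1 x2 y0 y1')
x = sp.Matrix([x1, x2])
f = sp.Matrix([x2, -x1**3])

# Successive output derivatives are Lie derivatives of h = x1 along f.
h = sp.Matrix([x1])
jets = [h, h.jacobian(x) * f]          # (y, y') = (x1, x2)
jet_map = sp.Matrix.vstack(*jets)

# The jet map x -> (y, y') is globally injective: the system is
# differentially observable of order 2 (and analytic, hence
# L^infinity observable).
sols = sp.solve(list(jet_map - sp.Matrix([y0, y1])), [x1, x2], dict=True)
print(jet_map.T)   # Matrix([[x1, x2]])
print(sols)        # [{x1: y0, x2: y1}]: a unique state for given jets
```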
2.3 Observability Normal Forms

The main significance of the theory is the existence of two distinct situations, which depend on the number of outputs relative to the number of inputs. In one case observability is a generic property^9; in the other case it is not. These two situations are explained in the subsections below.

2.3.1 First case: ny > nu

The first case occurs when the number of outputs is greater than the number of inputs, i.e. ny > nu. The situation is described by two theorems: the first states the genericity of the set of observable systems in S; the second introduces the (generic) observability normal form.
Theorem 6 ([57], 4.2.2 and 4.2.4 page 40)
1. The set of systems that are strongly differentially observable of order 2n + 1 is residual in S.
2. The set of analytic strongly differentially observable systems (of order 2n + 1) that are moreover L^∞-observable is dense in S.

Theorem 7 ([40])
The following is a generic property on S. Set k = 2n + 1. For all sufficiently smooth u(.), denote j^k u(t) = (u(t), u̇(t), ..., u^{(k−1)}(t)). Choose an arbitrarily large, relatively compact^10 open subset Γ of X. Consider also an arbitrary bound on the control and its first k derivatives (i.e. u, u̇, ..., u^{(k)}). Then the mappings

\[
\begin{aligned}
\Phi^\Sigma_{k, j^k u} : X &\to \mathbb{R}^{k n_y} \\
x(t) &\mapsto \left(y(t), \dot y(t), ..., y^{(k-1)}(t)\right)
\end{aligned}
\]

^9 A subset is said to be generic if it contains a residual subset. A subset is said to be residual if it is a countable intersection of open dense subsets.
^10 A subset is said to be relatively compact if its closure is compact.
are smooth injective immersions that map the trajectories of the system (Σ) (restricted to Γ) to the trajectories of the system:

\[
\begin{cases}
y = z_1 \\
\dot z_1 = z_2 \\
\quad\vdots \\
\dot z_{k-1} = z_k \\
\dot z_k = \phi_k\!\left(z_1, z_2, ..., z_k, u, \dot u, ..., u^{(k)}\right)
\end{cases}
\tag{2.2}
\]
The form (2.2) is called a phase-variable representation, where all the components z_i of the state vector are of dimension ny. Since k = 2n + 1, the total dimension of the state space of the phase-variable representation is (2n + 1)ny.

Let v = (u(t), u̇(t), ..., u^{(k)}(t)) denote the controls of the phase-variable representation. Since all the mappings Φ^Σ_{k, j^k u} are smooth injective immersions when restricted to the subset Γ of Theorem 7, systems of the form (2.2) are observable, strongly differentially observable and uniformly infinitesimally observable. That is to say, the set of systems that can be embedded in a phase-variable representation is contained in the set of observable systems.

Therefore, in the case ny > nu, observable systems are generic in S and can generically be embedded into a phase-variable representation.
2.3.2 Second case: ny ≤ nu

In the previous section we depicted observability as a generic property when there are more outputs than inputs (ny > nu). This is no longer the case when ny ≤ nu. We restrict the exposition here to single-output systems, i.e. ny = 1, and to analytic systems within the set S. The study of observability in this case relies on a tool called the canonical flag of distributions, defined below: as we will see in the theorems to come, observability is studied by establishing the existence or non-existence of such a flag.
Definition 8
− Consider a system Σ = (f, h) ∈ S. We define the canonical flag of distributions D(u) as

\[
D(u) =
\begin{cases}
\left(D^0(u) \supset D^1(u) \supset ... \supset D^{n-1}(u)\right) \\
D^0(u) = \operatorname{Ker}(d_x h) \\
D^{k+1}(u) = D^k(u) \cap \operatorname{Ker}\!\left(d_x L_f^{k+1} h\right)
\end{cases}
\tag{2.3}
\]

where L_f h is the Lie derivative of h with respect to the vector field f, and Ker denotes the kernel of an application. The control u(t) is considered fixed.
− If the distributions D^i(u) have constant rank n − i − 1, and are independent of u(.), then D(u) is called a uniform canonical flag.
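Since D^k is the common kernel of d_x h, d_x L_f h, ..., d_x L_f^k h, the rank condition can be checked by stacking these differentials: the flag is uniform when the stacked matrix has rank k + 1 for every k, independently of u. A symbolic sketch on an illustrative three-dimensional triangular system of our own:

```python
import sympy as sp

# Illustrative system: y = x1, x1' = x2 + u*x1, x2' = x3, x3' = -x1.
x1, x2, x3, u = sp.symbols('x1 x2 x3 u')
x = sp.Matrix([x1, x2, x3])
f = sp.Matrix([x2 + u * x1, x3, -x1])

# Differentials d_x(L_f^k h) of the iterated Lie derivatives of h = x1.
Lh, rows = sp.Matrix([x1]), []
for k in range(3):
    rows.append(Lh.jacobian(x))
    Lh = Lh.jacobian(x) * f

ranks = []
for k in range(3):
    stacked = sp.Matrix.vstack(*rows[:k + 1])
    ranks.append(stacked.rank())      # D^k = common kernel of these rows
print(ranks)   # [1, 2, 3]: each D^k has constant dimension n - k - 1,
               # independently of u, so the flag is uniform
```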
The property "(Σ) has a uniform canonical flag" is highly non-generic (it has co-dimension ∞, see [57]). The non-genericity of observability is established by the two following theorems.
Theorem 9 ([38], Pg 22)
The system (Σ) has a uniform canonical flag if and only if, for all x^0 ∈ X, there is a coordinate neighborhood of x^0, (V_{x^0}, x), such that in those coordinates the restriction of (Σ) to V_{x^0} can be written as

\[
\begin{cases}
y = h(x_1, u) \\
\dot x_1 = f_1(x_1, x_2, u) \\
\dot x_2 = f_2(x_1, x_2, x_3, u) \\
\quad\vdots \\
\dot x_{n-1} = f_{n-1}(x_1, x_2, ..., x_n, u) \\
\dot x_n = f_n(x_1, x_2, ..., x_n, u)
\end{cases}
\tag{2.4}
\]

where \(\frac{\partial h}{\partial x_1}\) and \(\frac{\partial f_i}{\partial x_{i+1}}\), i = 1, ..., n − 1, never vanish on V_{x^0} × U_adm.
As was the case for the form (2.2) of the previous section, a system in the form (2.4) is infinitesimally observable, observable and differentially observable of order n. Therefore, if a system (Σ) has a uniform canonical flag, then it is observable when restricted to neighborhoods of the form V_{x^0} × U_adm. The situation can be summarized in the diagram of Figure 2.1.

[Figure 2.1 relates: existence of a uniform canonical flag; existence of a normal form (restricted to a neighborhood); observability (i.e. uniform infinitesimal observability).]

Figure 2.1: Observability Equivalence Diagram.
Infinitesimal observability still needs to be related to the normal form (2.4) in order to obtain a complete equivalence diagram. This is done with a second theorem.

Theorem 10 ([38], Pg 24-25)
If (Σ) is uniformly infinitesimally observable then, on the complement of a sub-analytic subset of X of co-dimension 1, (Σ) has a uniform canonical flag.

The combination of those two theorems closes Diagram 2.1, which means that the normal form (2.4) characterizes uniform infinitesimal observability.
In the control-affine case, the situation above can be restated in a stronger way. We first recall that such a control-affine system has the form

\[
\begin{cases}
\dot x = f(x) + \displaystyle\sum_{i=1}^{p} g_i(x)\,u_i \\
y = h(x).
\end{cases}
\tag{2.5}
\]
We then define the function

\[
\begin{aligned}
\Phi : X &\to \mathbb{R}^n \\
x &\mapsto \left(h(x), L_f h(x), ..., L_f^{n-1} h(x)\right)
\end{aligned}
\]

for which we can establish Lemma 11 below.

Lemma 11
(Σ) is observable ⇒ Φ is of maximal rank (i.e. n) on an open dense subset V of X.

Let us choose any subset W ⊂ X such that the restriction of Φ to W is, as allowed by the lemma, a diffeomorphism. Then the theorems above are modified into:
Theorem 12 ([38], Pg 26-27)
Assume that (Σ) is observable. Then the restriction Φ|_W maps (Σ) into a system of the form:

\[
\begin{cases}
y = x_1 \\
\dot x_1 = x_2 + \displaystyle\sum_{i=1}^{p} g_{1,i}(x_1)\,u_i \\
\dot x_2 = x_3 + \displaystyle\sum_{i=1}^{p} g_{2,i}(x_1, x_2)\,u_i \\
\quad\vdots \\
\dot x_{n-1} = x_n + \displaystyle\sum_{i=1}^{p} g_{n-1,i}(x_1, x_2, ..., x_{n-1})\,u_i \\
\dot x_n = \psi(x) + \displaystyle\sum_{i=1}^{p} g_{n,i}(x_1, x_2, ..., x_n)\,u_i.
\end{cases}
\tag{2.6}
\]

Conversely, if a system is in the form (2.6) on an open subset Ω ⊂ R^n, then it is observable.
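The rank condition of Lemma 11 on Φ = (h, L_f h, ..., L_f^{n-1} h) can also be checked symbolically. In the illustrative example below (our own choice of f and h), det DΦ vanishes only on a curve, so Φ has maximal rank on an open dense subset, as the lemma asserts:

```python
import sympy as sp

# Illustrative drift and output: f(x) = (x2, -x1 - x2**3), h(x) = x1 + x2.
x1, x2 = sp.symbols('x1 x2')
x = sp.Matrix([x1, x2])
f = sp.Matrix([x2, -x1 - x2**3])
h = sp.Matrix([x1 + x2])

# Phi = (h, L_f h); here n = 2.
Lfh = h.jacobian(x) * f
Phi = sp.Matrix.vstack(h, Lfh)
J = Phi.jacobian(x)

# det J = 2 - 3*x2**2 vanishes only where x2**2 == 2/3: the complement
# of these two curves is the open dense subset V of Lemma 11.
print(sp.expand(J.det()))
```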
In conclusion, we want to emphasize the representations (2.2), (2.4) and (2.6), as they translate observability. Their general form is ẋ = Ax + b(x, u), where A has nonzero entries only on its superdiagonal and b(x, u) is a triangular vector field. As a consequence, the proof of convergence of high-gain observers is carried out for such normal forms. In addition, as we will see in the proof of Chapter 3, both the structure of the matrix A and that of the vector field b(x, u) will be very useful.
A generalization of the normal form (2.6) to multiple output systems is used in Chapter 5 in order to extend the definition of the observer to systems with more than one output variable. This form is distinct from the phase variable representation given in equation (2.2).
2.4 Single Output Normal Form<br />
The normal form introduced in this section is the one we use in order to define <strong>and</strong> prove<br />
the convergence of the observer for multiple-input, single-output (MISO) systems. We use<br />
the single-output assumption for simplicity and clarity of exposition only. Up to a few modifications, the theorems can be proven in the multiple output case. Indeed, there is no unique normal form when ny > 1; therefore the definition of the observer has to be adapted to each specific case. In Chapter 5 a blockwise generalization of the MISO normal form is considered, and the differences between the single output and the multiple output cases are explained.
As usual the system is represented by a set of two equations:
− an ordinary differential equation that drives the evolution of the state,
− a map that models the sensor measurements.
Those two equations are of the form:<br />
�<br />
dx<br />
dt<br />
y<br />
=<br />
=<br />
A (u) x + b (x, u)<br />
C (u) x,<br />
where<br />
− x (t) ∈ X ⊂ R n , X compact,<br />
− y (t) ∈ R,<br />
− u(t) ∈ Uadm ⊂ R nu bounded.<br />
The matrices A(u) and C(u) are defined by:

$$A(u) = \begin{pmatrix} 0 & a_2(u) & 0 & \cdots & 0 \\ \vdots & 0 & a_3(u) & \ddots & \vdots \\ & & \ddots & \ddots & 0 \\ \vdots & & & 0 & a_n(u) \\ 0 & \cdots & & \cdots & 0 \end{pmatrix} \qquad C(u) = \begin{pmatrix} a_1(u) & 0 & \cdots & 0 \end{pmatrix} \qquad (2.7)$$

with 0 < a_m ≤ a_i(u) ≤ a_M for some constants a_m, a_M and all admissible u.
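As a small illustration (not from the thesis), the matrices A(u) and C(u) of (2.7) can be assembled programmatically; the coefficient functions below are hypothetical placeholders, chosen only so that the bounds 0 < a_m ≤ a_i(u) ≤ a_M hold.

```python
import math

def normal_form_matrices(a, u):
    """a is a list of n callables [a1, ..., an]; returns (A(u), C(u)).

    A(u) is n x n with a_2(u), ..., a_n(u) on the superdiagonal and zeros
    elsewhere; C(u) is the row vector (a_1(u), 0, ..., 0).
    """
    n = len(a)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = a[i + 1](u)   # superdiagonal entries a_{i+1}(u)
    C = [a[0](u)] + [0.0] * (n - 1)
    return A, C

# Example with n = 3 and coefficients bounded between 1 and 2:
a = [lambda u: 1.0,
     lambda u: 1.5 + 0.5 * math.sin(u),
     lambda u: 2.0]
A, C = normal_form_matrices(a, 0.0)
```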
2.5 High-gain Observers
Early descriptions of high-gain observers can be found in multiple references, including J-P. Gauthier et al. [54], F. Deza et al. [46, 47], A. Tornambé [111], and H. K. Khalil et al. [50].
The first high-gain observer construction that we present in this section is the Luenberger style prototype algorithm of [54]. Afterwards we define the high-gain extended Kalman filter. Its structure is quite similar to that of the extended Kalman filter, which is well known and widely used in engineering [58, 60]. In our case, it has been adapted to the observability normal form.
High-gain observers are built around a fixed scalar parameter (denoted θ) which allows us to prove that the estimation error decays to zero exponentially, at a rate that depends on the value of θ. When the high-gain parameter is taken equal to 1, high-gain observers reduce to their initial nonlinear version.
Definition 13
We suppose that all the a_i(u) coefficients of the normal form (2.7) are equal to 1. The classical (or Luenberger) high-gain observer is defined by the equation

$$\frac{dz}{dt} = Az + b(z, u) - K_\theta\,(Cz - y(t)) \qquad (2.8)$$

where K_θ = ΔK, with¹¹ Δ = diag(θ, θ², ..., θⁿ) and K such that (A − KC) is Hurwitz stable.
It is in fact not important whether the a_i(u) coefficients equal 1 or any other nonzero constant, since a change of coordinates brings us back to the situation where a_i = 1.
On the other hand, it is of the utmost importance that the coefficients depend on neither u(t) nor time:
− if the correction gain K is computed off-line (i.e. for u = u*), then (A(u) − KC(u)) may not remain stable for u ≠ u*, and the convergence of the observer is no longer guaranteed;
− in full generality, (A(t) − K(t)C(t) stable for all t > 0) ⇏ (the system is stable).
The convergence of the high-gain Luenberger observer is expressed in the following theorem.
Theorem 14
For any a > 0, there is a large enough θ > 1 such that for all (x₀, z₀) ∈ (χ × χ) we have

$$\|z(t) - x(t)\|^2 \le k(a)\, e^{-at}\, \|z_0 - x_0\|^2$$

for some polynomial k of degree n.
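A minimal sketch (not from the thesis) of the observer (2.8) for n = 2, under the assumptions of a toy globally Lipschitz nonlinearity φ(x₁) = −sin(x₁) and forward-Euler integration. With K = (2, 1)′, the matrix A − KC has characteristic polynomial (s + 1)², so it is Hurwitz, and the gain K_θ = ΔK drives the estimation error to zero.

```python
import math

theta = 5.0                                  # high-gain parameter
K = (2.0, 1.0)                               # (A - KC) has char. poly (s + 1)^2
Ktheta = (theta * K[0], theta ** 2 * K[1])   # ΔK with Δ = diag(θ, θ²)

def phi(x1):
    # toy globally Lipschitz nonlinearity: last component of b(x)
    return -math.sin(x1)

dt, steps = 1e-3, 5000
x = [1.0, 0.0]                               # true state
z = [0.0, 0.0]                               # observer state (estimate)
e0 = math.hypot(z[0] - x[0], z[1] - x[1])
for _ in range(steps):
    innov = z[0] - x[0]                      # Cz - y, with C = (1, 0)
    x = [x[0] + dt * x[1],
         x[1] + dt * phi(x[0])]
    z = [z[0] + dt * (z[1] - Ktheta[0] * innov),
         z[1] + dt * (phi(z[0]) - Ktheta[1] * innov)]
eT = math.hypot(z[0] - x[0], z[1] - x[1])
```

On this toy system the estimation error shrinks by many orders of magnitude over the simulated horizon, consistent with the exponential bound of Theorem 14.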
In Kalman style observers, the correction gain is computed at the same time as the estimated state, and is therefore constantly updated. It is the solution of a Riccati equation
11 Here, diag denotes the square matrix filled with zeros except for the diagonal, which is composed of the adequate vector.
(of matrices). The corresponding matrix (denoted S or P) is called the Riccati matrix. It is a symmetric positive definite matrix¹². Since S is an (n × n) symmetric matrix, we only need to compute its upper or lower triangular part (i.e. there are n(n+1)/2 equations to solve).
Definition 15
The high-gain extended Kalman filter is defined by the two equations below:

$$\begin{cases} \dfrac{dz}{dt} = A(u)z + b(z, u) - S^{-1}C'R^{-1}(Cz - y) \\ \dfrac{dS}{dt} = -(A(u) + b^*(z, u))'\,S - S\,(A(u) + b^*(z, u)) + C'R^{-1}C - S Q_\theta S. \end{cases} \qquad (2.9)$$
The matrices Q and R are originally the covariance matrices of the state and output noise respectively, and are therefore expected to be symmetric and positive definite. Since this observer is developed within the framework of deterministic observation theory, those two matrices will be used as tuning parameters. Q_θ is defined as Q_θ = θ²Δ⁻¹QΔ⁻¹, where θ > 1 is a fixed parameter and
$$\Delta = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & \frac{1}{\theta} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \frac{1}{\theta^{n-1}} \end{pmatrix}.$$
In most cases, normal form representations are not used when implementing an extended Kalman filter (case θ = 1). The Jacobian matrices of f and h (computed with respect to the variable x) are used in the Riccati equation:

$$\frac{dS}{dt} = -\left(\frac{\partial f}{\partial x}\Big|_{x=z}\right)'S - S\left(\frac{\partial f}{\partial x}\Big|_{x=z}\right) + \left(\frac{\partial h}{\partial x}\Big|_{x=z}\right)'R^{-1}\left(\frac{\partial h}{\partial x}\Big|_{x=z}\right) - S Q_\theta S.$$
Refer to Chapter 1 for a more detailed explanation.<br />
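To make the high-gain weighting concrete, here is a sketch (not from the thesis) of the matrix Q_θ = θ²Δ⁻¹QΔ⁻¹ and one forward-Euler step of the Riccati equation. Since Δ = diag(1, 1/θ, ..., 1/θ^{n−1}), one has entrywise (Q_θ)_{ij} = θ^{2+i+j} Q_{ij} with 0-based indices. The matrices A, C, Q and the initial S below are illustrative, and the Jacobian term b*(z, u) is omitted for brevity.

```python
def q_theta(Q, theta):
    """Entrywise form of θ² Δ⁻¹ Q Δ⁻¹ (0-based indices i, j)."""
    n = len(Q)
    return [[theta ** (2 + i + j) * Q[i][j] for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def riccati_step(S, A, C, Rinv, Qt, dt):
    """S <- S + dt * (-A'S - SA + C' R⁻¹ C - S Qt S), forward Euler."""
    n = len(S)
    At = [[A[j][i] for j in range(n)] for i in range(n)]
    AtS, SA = matmul(At, S), matmul(S, A)
    CtRC = [[C[i] * Rinv * C[j] for j in range(n)] for i in range(n)]
    SQS = matmul(matmul(S, Qt), S)
    return [[S[i][j] + dt * (-AtS[i][j] - SA[i][j] + CtRC[i][j] - SQS[i][j])
             for j in range(n)] for i in range(n)]

theta = 2.0
Q = [[1.0, 0.0], [0.0, 1.0]]
Qt = q_theta(Q, theta)            # here: diag(θ², θ⁴) = diag(4, 16)
A = [[0.0, 1.0], [0.0, 0.0]]      # normal-form drift matrix (n = 2)
C = [1.0, 0.0]
S = [[1.0, 0.0], [0.0, 1.0]]
S = riccati_step(S, A, C, 1.0, Qt, 1e-3)
```

Note that the Euler step preserves the symmetry of S, consistent with the remark above that only n(n+1)/2 scalar equations need to be integrated.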
Theorem 16 ([47, 57])
For θ large enough and for all T > 0, the high-gain extended Kalman filter (2.9) satisfies the inequality below for all t > T/θ:

$$\|z(t) - x(t)\|^2 \le \theta^{n-1} k(T)\, \left\|z\!\left(\tfrac{T}{\theta}\right) - x\!\left(\tfrac{T}{\theta}\right)\right\|^2 e^{-(\theta\omega(T) - \mu(T))\left(t - \frac{T}{\theta}\right)}$$

for some positive continuous functions k(T), ω(T) and µ(T).
2.6 On Adaptive High-gain Observers
In this section, we review a few adaptation strategies for the high-gain parameter that can be found in the literature. Strategies based on Luenberger-like¹³ observers are
12 The fact that the solution of the Riccati equation of the observer (2.9) remains positive definite is not obvious and must be proven. Such a proof can be found in [57], Lemma 6.2.15 and Appendix B, in the continuous-discrete case.
13 Here Luenberger-like has to be understood in a broad sense, that is: observers with a correction gain matrix that is not computed online and, most of the time, computed following a pole placement-like scheme.
presented in Subsections 2.6.1, 2.6.2 and 2.6.3. In the case of Kalman-like observers, recall that the high-gain modification is used to provide a structure to the Q and R matrices. Therefore adaptation of the high-gain parameter can be understood as an adaptation of the covariance matrices of the state and measurement noise. This topic has been the object of quite extensive studies and a large bibliography on the subject is available, both in the linear and nonlinear cases. Subsection 2.6.4 provides references to such works while Subsections 2.6.5 and 2.6.6 focus on high-gain constructions specifically.
The adaptation of the high-gain parameter is, as said in Chapter 1, motivated by the need to combine the antagonistic behaviors of an observer that filters noise efficiently and of an observer that is able to converge quickly when large perturbations or jumps in the state are detected. We want to emphasize the usefulness of this approach with three examples:
1. In [64] S. Haugwitz and coworkers describe the model of a chemical reactor coupled with a highly efficient heat exchanger. The reactor is expected to be used to process a highly exothermic chemical reaction, and the temperatures measured at specific spots constitute the multidimensional output variable. Although the inlet concentrations are supposed to be known and fixed, it may happen that the apparatus meant to blend the reactants fails. The inlet concentration is then no longer the one expected, thus provoking a large unmeasured perturbation. The authors use, quite successfully, an extended Kalman filter that takes this possibility of failure into account, in the same spirit as when we estimate the load torque of the DC machine in Chapter 4. An adaptive high-gain extended Kalman filter can be very effective for this kind of application, by increasing the performance of the observer at perturbation times.
2. In vehicle navigation, data from an inertial navigation system (INS, a 3-axis accelerometer coupled with a gyroscope) and data from a global navigation satellite system (GPS, GALILEO, GLONASS) are fused. The first type of sensor is very precise at time 0 but has an error domain that grows with time. Sensors of the second type have an error domain larger than that of the INS at time 0, but one that is stable. The purpose here is to know the position as precisely as possible, with an error domain as small as possible. An observer that filters the measurement noise is needed, but the estimation error may increase with time because of sudden changes of direction, sudden changes in the topology of the road, or the loss of the GPS signal in tunnels or urban canyons. The estimation of the covariance matrices Q and R of Kalman-like filters is the subject of the articles [96, 115] (linear case) or [37] (nonlinear case). The book of M. S. Grewal [60] provides a solid introduction to this topic.
3. In a refinery, changes of the processed crude oil are perturbations. Starting from the atmospheric column, the disturbance propagates along the refinery. The speed of propagation of the disturbance front depends on many parameters (the several processes have low time constants, crude oil can be retained, ...), and is not accurately known. An EKF¹⁴-like observer is useful when there are no such changes, and an adaptive high-gain observer would be of use in order to detect the feed change [38, 47, 113].
Finally we want to cite a few techniques used to render an observer's gain adaptive that we have not considered:
− statistical methods [115],
− genetic algorithm based observers [97],
− neural network based observers [109],
− fuzzy logic approaches [72].
Those observers are based on empirical methods. As a result, very little can be demonstrated or proven with respect to their convergence properties.
14 Extended Kalman filter.
2.6.1 E. Bullinger and F. Allgöwer
In a 1997 paper, E. Bullinger and F. Allgöwer proposed a high-gain observer having a varying high-gain parameter. Their observer is inspired by the structure proposed by A. Tornambè in [111]. It is a Luenberger-like observer for a system of the form (2.7), except that a_i(u) = 1 for all i ∈ {1, ..., n}. Only the last component of the vector field b(x, u) is not equal to zero (see definition below). The control variable u may be of the form u = (u, u̇, u⁽²⁾, ..., u⁽ⁿ⁾), which is not one of the assumptions of system (2.7). As will appear below, the main difference between this observer and a classic high-gain one is that the influence of the vector field on the model dynamics is neglected.
Definition 17
Consider a single input, single output system of the form

$$\begin{cases} \dot x_1 = x_2 \\ \dot x_2 = x_3 \\ \quad\vdots \\ \dot x_{n-1} = x_n \\ \dot x_n = \varphi(x, u) \\ y = x_1 \end{cases} \qquad (2.10)$$

Then define a high-gain observer

$$\dot z = Az - \Delta K (z_1 - y) \qquad (2.11)$$

with ΔK defined in the same manner as for the observer (2.8):
− Δ = diag(θ, θ², ..., θⁿ), and
− A − K(1 0 ... 0) is Hurwitz stable.
Then:<br />
− select a strictly increasing sequence of elements of R: {1, θ1, θ2, ...},<br />
− choose λ > 0, a small positive scalar, <strong>and</strong><br />
− consider the adaptation function:

$$\dot s = \begin{cases} \gamma\,|y(t) - z_1(t)|^2 & \text{for } |y(t) - z_1(t)| > \lambda \\ 0 & \text{for } |y(t) - z_1(t)| \le \lambda. \end{cases} \qquad (2.12)$$

The parameter θ is adapted according to the rule:
− when t = 0, set θ = 1,
− when s(t) = θ_i, set θ = θ_i (in other words, θ = sup{θ_i ≤ s(t)}).
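The adaptation rule can be sketched in a few lines (a toy illustration, not the authors' code): the variable s integrates the squared output error while it exceeds λ, and θ is snapped to the largest grid value θ_i not exceeding s(t). The grid, λ, γ and the piecewise-constant error trace below are hypothetical.

```python
lam, gamma, dt = 0.1, 5.0, 0.01
grid = [1.0, 2.0, 4.0, 8.0]          # strictly increasing sequence {1, θ1, θ2, ...}

def theta_from(s):
    return max(t for t in grid if t <= s)   # θ = sup{θ_i ≤ s(t)}

s, theta = 0.0, grid[0]              # at t = 0, θ = 1
for k in range(300):
    err = 1.0 if k < 100 else 0.0    # output error |y - z1|: large, then zero
    if abs(err) > lam:
        s += dt * gamma * err ** 2   # ṡ = γ |y - z1|²
    theta = max(theta, theta_from(max(s, grid[0])))   # θ never decreases
```

The monotonicity of θ is visible in the last line: once a grid value is reached, the gain stays there, which is exactly why this scheme solves a tuning problem rather than the noise reduction problem discussed next.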
It is clear from the definition of the adaptation procedure that the parameter θ cannot decrease, which makes this observer a solution to a tuning problem rather than to the noise reduction problem. The convergence of the observer is established in the following theorem.
This observer comes from a paper published in 1997 (i.e. [35]); we therefore checked for updates of the strategy in more recent papers. In [36], the authors address λ-tracking problems¹⁵. The observer they use is the one described in this section.
Theorem 18 ([35])
Consider the system (2.10) above, together with the assumptions
1. the system exhibits no finite escape time, and
2. the nonlinearity φ(x, u) is bounded.
Then for any λ > 0, γ > 0, β > 0 and any s₀:
the total length of time for which the observer output error is larger than λ is finite. That is to say:

$$\exists\, T_{max} < \infty : \int_{\mathcal T} dt < T_{max} \quad \text{where } \mathcal T = \{t \mid \|y(t) - z_1(t)\| \ge \lambda\}.$$
In another theorem of the same paper (i.e. Theorem 3 of [35]), an upper bound for the estimation error for large values of t is provided. But in the original article from which this observer is inspired, [111], the estimation error is not bounded by an exponentially decreasing function of time.
The main differences between these works and the work proposed here are, firstly, that our observer is a Kalman based observer, which implies taking into account the evolution of the Riccati matrix. The second distinction is that here we address the problem of noise reduction when the estimation is sufficiently good.
2.6.2 L. Praly, P. Krishnamurthy and coworkers
The second observer with a dynamically sized high-gain parameter that we will expose in this review is one from L. Praly, P. Krishnamurthy and coworkers. Descriptions of
15 As explained in the article, λ-tracking is used for processes for which we know for certain that asymptotic<br />
stabilization can be achieved only by approximation. The objective is therefore modified into one of stabilizing<br />
the state within a sphere of radius λ centered on the set point.<br />
their observer construction can be found in articles like [16, 78, 79, 103] 16 . In the following<br />
we present the observer defined in [16], as it complies with the latest updates of their theory.<br />
Definition 19
The system dynamics are:

$$\begin{cases} \dot x = \begin{pmatrix} 0 & a_2(y) & 0 & \dots & 0 \\ 0 & 0 & a_3(y) & \dots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & \dots & \dots & \dots & a_n(y) \\ 0 & \dots & \dots & \dots & 0 \end{pmatrix} x + \begin{pmatrix} f_1(u, y) \\ f_2(u, y, x_2) \\ f_3(u, y, x_2, x_3) \\ \vdots \\ f_n(u, y, x_2, \dots, x_n) \end{pmatrix} + \begin{pmatrix} \delta_1(t) \\ \delta_2(t) \\ \delta_3(t) \\ \vdots \\ \delta_n(t) \end{pmatrix} \\ y = x_1 + \delta_y(t). \end{cases} \qquad (2.13)$$

It is also denoted

$$\begin{cases} \dot x = A(y)x + b(y, x_2, \dots, x_n, u) + \Delta(t) \\ y = x_1 + \delta_y(t) \end{cases} \qquad (2.14)$$

where
− y is the measured output,
− the functions a_i are locally Lipschitz,
− u is a vector in R^{n_u} representing the known inputs and a finite number of their derivatives,
− the vector Δ(t) represents the unknown inputs, and
− δ_y is the measurement noise.
The construction of the observer is based on some additional knowledge concerning the<br />
system:<br />
1. a function α of the output y such that 0 < ρ
Those conditions, imposed on the vector fields f_i, appear to be very technical, but are in fact necessary in order to prove the convergence of the observer. Moreover, the importance of the function Γ(u, y) must not be underestimated, as it is central to the definition of the update law of the high-gain parameter.
Definition 20
The high-gain observer with updated gain is defined by the set of equations:

$$\begin{cases} \dot z = A(y)z + b(y, z_2, \dots, z_n, u) + \theta L A(y) K\!\left(\dfrac{z_1 - y}{\theta^b}\right) \\ \dot\theta = \theta\left(\varphi_1(\varphi_2 - \theta) + \varphi_3\,\Gamma(u, y)\left(1 + \displaystyle\sum_{j=2}^{n} |\hat x_j|^{v_j}\right)\right) \end{cases} \qquad (2.15)$$

where L = diag(L^b, L^{b+1}, ..., L^{b+n-1}).
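The behavior of the gain update law can be illustrated by integrating it with forward Euler (a sketch under illustrative parameter values, not the authors' implementation). When Γ ≡ 0, the law reduces to the logistic equation θ̇ = θφ₁(φ₂ − θ), whose stable equilibrium is φ₂, so the gain settles at φ₂ in the absence of excitation.

```python
phi1, phi2, phi3 = 1.0, 5.0, 0.5     # illustrative update-law parameters
dt = 1e-3

def theta_dot(theta, Gamma, xhat_terms=0.0):
    """θ̇ = θ(φ1(φ2 - θ) + φ3 Γ(u, y)(1 + Σ_j |x̂_j|^{v_j}))."""
    return theta * (phi1 * (phi2 - theta) + phi3 * Gamma * (1.0 + xhat_terms))

theta = 1.0
for _ in range(20000):               # 20 s of simulated time with Γ = 0
    theta += dt * theta_dot(theta, Gamma=0.0)
```

A nonzero Γ(u, y) term pushes θ above φ₂, which is how the model-dependent excitation raises the gain.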
Theorem 21
Consider the system (2.13) and the associated observer (2.15). It is then possible to choose the parameters φ1, φ2, φ3 (with φ2, φ3 high enough) such that for any L(0) ≥ φ2 the estimation error e(t) = z(t) − x(t) is bounded as follows:

$$\left|L^{-1}e(t)\right| \le \beta_1\!\left(L(0)^{-1}e(0),\, t\right) + \sup_{s\in[0,t[}\ \gamma_1\!\left(\left\|\begin{pmatrix} \frac{\delta(s)}{\varphi_2} \\ \alpha(y(s))\,\delta_y(s) \end{pmatrix}\right\|\right)$$

for all t ∈ [0, T_u[. Moreover L satisfies the relation:

$$L(t) \le 4\varphi_2 + \beta_2\!\left(\begin{pmatrix} e(0) \\ L(0) \end{pmatrix},\, t\right) + \sup_{s\in[0,t[}\ \gamma_2\!\left(\left\|\begin{pmatrix} \frac{\delta(s)}{\varphi_2} \\ \alpha(y(s))\,\delta_y(s) \\ \Gamma(u(s), y(s)) \\ x(s) \end{pmatrix}\right\|\right)$$

where β1 and β2 are KL functions, and γ1, γ2 are functions of class K.
This work uses a special form of the phase variable representation (2.2) of J-P. Gauthier et al. and therefore applies to observable systems in the sense of Section 2.2 above. The observer (2.15) has roughly the same structure as a Luenberger high-gain observer, except for the fact that the correction gain is given as a function of the output error. The update function is determined by a function that bounds the incremental rate of the vector field b(y, x, u) (this is the Γ-dependent term in (2.15)).
The strategy adopted here is not based on a globally Lipschitz vector field, which would imply the off-line search and tuning of the upper bound (or the value) of the high-gain parameter θ. Instead, the observer tunes itself as a consequence of the adaptation function. However, the function that drives the adaptation may not be that easy to find.
The idea is quite similar to that of E. Bullinger and F. Allgöwer [35], with the difference being that in their case the adaptation is driven by the output error, while in the case of L. Praly et al. the adaptation is model dependent.
Theorem 21 states¹⁷ that the observer (2.15), together with its adaptation function, gives, at least for bounded solutions, an estimation error converging to a ball centered at the origin with a radius that depends on the asymptotic L∞-norm of the disturbances δ and δ_y. This
17 The precise and complete theorem appears in the article [16], Theorem 1.
consequently means that, provided the disturbances vanish, the estimation error converges to 0. This observer is not based on any quality metric of the estimation, and the evolution of the high-gain does not depend on the quality of convergence. Therefore the situation may arise in which the observer has already converged while the high-gain is still high, which would amplify the noise.
The work herein, contrary to this section's observer, is set in the globally Lipschitz setting. Further, it aims to provide an observer for which the high-gain parameter decreases to 1 (or to the lowest value allowed by the user) when the local convergence¹⁸ of the algorithm can be used (i.e. the high-gain is not needed anymore).
2.6.3 Ahrens and Khalil
The paper [11] deals with a closed loop control strategy that comprises an observer having a high-gain switching scheme. We focus on the observer's definition together with the switching strategy used, and only give a simplified version of the system the authors consider; refer to [11] for details.
Definition 22
The simplified version of the system used in [11] is

$$\begin{cases} \dot x = Ax + B\varphi(x, d, u) \\ y = Cx + v \end{cases} \qquad (2.16)$$

where
− x ∈ Rⁿ is the state variable,
− y ∈ R is the output,
− d(t) ∈ R^p is a vector of exogenous signals,
− v(t) ∈ R is the measurement noise, and
− u(t) is the control variable.
The matrices A, B and C are:

$$A = \begin{pmatrix} 0 & 1 & 0 & \dots & 0 \\ \vdots & 0 & 1 & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & & & 0 & 1 \\ 0 & \dots & \dots & \dots & 0 \end{pmatrix} \qquad B = \begin{pmatrix} 0 \\ \vdots \\ \vdots \\ 0 \\ 1 \end{pmatrix} \qquad C = \begin{pmatrix} 1 & 0 & \dots & \dots & 0 \end{pmatrix}.$$

The set of assumptions for this system is:
18 The convergence of the <strong>extended</strong> <strong>Kalman</strong> <strong>filter</strong> can be proven for small initial errors only (see the proof<br />
of Chapter 3).<br />
− d(t) is continuously differentiable, and takes its values in a compact subset of R^p,
− both d(t) and ḋ(t) are bounded,
− v : R⁺ → R is a measurable function of t and is bounded (i.e. ∃µ > 0, |v(t)| ≤ µ), and
− φ is a locally Lipschitz function in x and u, uniformly in d, over the domain of interest. The Lipschitz constant is independent of d(t).
The observer proposed for such a system comes from earlier works such as [50].<br />
Definition 23
Let us denote by z the estimated state. The observer is

$$\dot z = Az + B\varphi(z, d, u) - H_i(Cz - y)$$

where for i = 1, 2

$$H_i' = \begin{pmatrix} \dfrac{\alpha_{1,i}}{\theta_i}, & \dfrac{\alpha_{2,i}}{\theta_i^2}, & \dots, & \dfrac{\alpha_{n,i}}{\theta_i^n} \end{pmatrix}.$$

The θ_i's are small positive parameters such that 0 < θ1 < θ2, and the α_i's are chosen in such a way that the roots of the polynomial

$$s^n + \alpha_{1,i}\, s^{n-1} + \alpha_{2,i}\, s^{n-2} + \dots + \alpha_{n,i} = 0$$

have negative real parts.
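Under the reading of the gain given above, H_i is easy to compute; the sketch below (hypothetical numbers, not from [11]) uses n = 2 with α-coefficients making s² + α₁s + α₂ = (s + 1)² Hurwitz, and powers of two for the θ_i so the arithmetic is exact.

```python
def gain(alphas, theta_i):
    """H_i' = (α_{1,i}/θ_i, α_{2,i}/θ_i², ..., α_{n,i}/θ_iⁿ)."""
    return [a / theta_i ** (k + 1) for k, a in enumerate(alphas)]

alphas = (2.0, 1.0)            # s² + 2s + 1 = (s + 1)²: roots in the open left half-plane
theta1, theta2 = 0.25, 0.5     # 0 < θ1 < θ2
H1 = gain(alphas, theta1)      # large gain: fast state reconstruction mode
H2 = gain(alphas, theta2)      # smaller gain: better noise filtering
```

The smaller θ₁ produces the larger gain H₁, matching the two modes described next.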
This observer is defined with two values for the high-gain parameter:
− θ1 corresponds to the fast state reconstruction mode,
− θ2 makes the observer much more efficient w.r.t. noise filtering.
The two sets of α_i parameters can be chosen to be distinct from one another.
The switching scheme between the two values of θ has two main requirements:
− the value of θ should change whenever an excessively large estimation error is detected, and
− the value of θ should not change because of overshoots. During an overshoot, the situation may arise where the estimated trajectory crosses the real trajectory but has not yet converged. Switching from θ1 to θ2 in this case is not desirable.
Definition 24 (Switching scheme)
Let us define δ > 0 and T_d > 0, two constant parameters, and D = [−δ, δ]. The value of θ is changed whenever¹⁹ (z_1 − y_1) exits or enters D. When an overshoot occurs, the estimation error may enter and exit the domain D quickly: convergence is not yet achieved. The large value of the high-gain parameter is still needed. Those situations are handled by the use of a delay timer. Priority is given to the high-gain mode (see Figure 2.2):
19 Matrix C gives us (Cz − y) = (z_1 − y_1).
− whenever |y_1 − z_1| > δ, then θ = θ1 and the delay timer is reset,
− when |y_1 − z_1| < δ, we do not know whether it is an overshoot or not: θ = θ1 and the delay timer is started,
− when |y_1 − z_1| < δ and the delay timer is equal to T_d, estimation is satisfactory and θ = θ2.
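The three rules above can be sketched as follows (a toy illustration with hypothetical numbers, not the authors' code); the error trace contains a brief zero-crossing that mimics an overshoot, which correctly fails to trigger the low-gain mode because the timer never reaches T_d.

```python
delta, Td, dt = 0.1, 0.5, 0.01

def switch(errors):
    """Return the θ-mode ('hg' or 'filter') at each step of an error trace."""
    timer, modes = 0.0, []
    for e in errors:
        if abs(e) > delta:
            timer = 0.0              # error left D: high gain, reset the timer
            modes.append('hg')
        elif timer < Td:
            timer += dt              # inside D but possibly an overshoot: wait
            modes.append('hg')
        else:
            modes.append('filter')   # inside D for Td: estimation satisfactory
    return modes

# error trace: large, brief zero-crossing (overshoot), large again, then settled
trace = [1.0] * 20 + [0.0] * 5 + [1.0] * 20 + [0.0] * 100
modes = switch(trace)
```

During the 5-step overshoot the timer only accumulates 0.05 s, well short of T_d = 0.5 s, so the high-gain mode persists; only the long final settled stretch switches to filtering.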
The authors consider a control strategy that stabilizes the system provided that the full state is known. They include a fixed high-gain observer and consider the closed loop system. They propose a set of assumptions such that the system-observer-controller ensemble is uniformly asymptotically stable²⁰ (see [11], Theorem 1). The last step is the demonstration that stability remains when the high-gain of the observer is switched between two well defined values (see [11], Theorem 2 and the example of Section 4).
The Luenberger observer does not have the same local properties as the extended Kalman filter, namely good filtering properties and analytically guaranteed convergence for small initial errors. We therefore expect a high-gain extended Kalman filter having a varying θ parameter to be more efficient with respect to the noise filtering issue.
Figure 2.2: Switching strategy to deal with peaking (overshoots cause no high-gain change; the gain value changes only after the delay T_d).
2.6.4 Adaptive Kalman Filters
We now consider the problem of the adaptation of the high-gain parameter of Kalman filters. Recall that the high-gain parameter is used to provide the Q and R matrices with a specific structure. Therefore, the adaptation of the high-gain parameter may be seen as the modification of those two matrices²¹. There is nonetheless a big difference when the high-gain structure is not considered: there is no proof of convergence of the extended Kalman filter when the estimated state does not lie in a neighborhood of the real state. The situation could be even worse: examples of systems for which the filter does not converge can be found in [100, 101].
The adaptation problem, when viewed as the adaptation of the Q and R matrices, has been the object of quite a few publications, both in the linear and nonlinear cases. In the linear
20 The theorem demonstrates that all the trajectories are bounded (ultimately with time) and that the trajectory under output feedback (i.e. using the observer) is close to that of the state feedback (i.e. the controller with full state knowledge).
21 Recall that when the system is modeled using stochastic differential equations, Q and R represent the covariances of the state and output noise, respectively.
29
tel-00559107, version 1 - 24 Jan 2011<br />
2.6 On Adaptive High-<strong>gain</strong> Observers<br />
part of the world, early references may be found in the book edited by A. Gelb[58], the two<br />
volumes book of P. S. Maybeck [89, 90], Chapter 10 of the second one in particular, the book<br />
from C. K. Chui <strong>and</strong> G. Chen [43]. Recent papers can be found in the INS/GPS community,<br />
such as [96, 115] or the book from M. S. Grewal [60]. The review article of R. K. Mehra<br />
[92] is warmly advised as an introduction to the topic. We do not expatiate on the subject<br />
since 1) most of the techniques developed in those papers are statistical methods, 2) the main<br />
bottleneck in the analysis is due to the linearization of the model in order to use the Riccati<br />
equation which is specific to the nonlinear case, 3) when switching based methods are used<br />
we have a better time explaining them directly in the nonlinear setting.<br />
In the nonlinear case, the vast majority of strategies are proposed for discrete-time systems. We describe a subset of those strategies below.
Definition 25
A discrete-time system is defined by a set of two equations of the form:
$$
\begin{cases}
x_{k+1} = f(x_k, u_k) \\
y_{k+1} = h(x_{k+1})
\end{cases}
\tag{2.17}
$$
with the usual notation $x_k = x(k\,\delta t)$, $\delta t > 0$ being the sample time. At least one of the two functions $f$ and $h$ is nonlinear.

The discrete extended Kalman filter associated with this system is given by the set of equations:
$$
\text{Prediction:} \quad
\begin{cases}
z_{k+1}^- = f(z_k, u_k) \\
P_{k+1}^- = A_k P_k A_k' + Q_k,
\end{cases}
$$
$$
\text{Correction:} \quad
\begin{cases}
z_{k+1} = z_{k+1}^- + L_{k+1}\big(y_{k+1} - h(z_{k+1}^-)\big) \\
P_{k+1}^+ = (Id - L_{k+1} C)\, P_{k+1}^-
\end{cases}
$$
where $L_{k+1} = P_{k+1}^- C' \big(C P_{k+1}^- C' + R\big)^{-1}$, and
− $z_k$ denotes the estimated state, and $P_0$ is a symmetric positive definite matrix,
− $A$ is the Jacobian matrix of $f$, computed along the estimated trajectory,
− $C$ is the Jacobian matrix of $h$, computed along the estimated trajectory.
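The recursion above can be sketched in a few lines. The following is a minimal numerical illustration (not the thesis' implementation), where the model functions and their Jacobians are supplied by the caller:

```python
import numpy as np

def dekf_step(z, P, u, y_next, f, h, jac_f, jac_h, Q, R):
    """One prediction/correction step of the discrete extended Kalman filter."""
    # Prediction: propagate the estimate and the Riccati matrix
    A = jac_f(z, u)                      # Jacobian of f along the estimated trajectory
    z_pred = f(z, u)
    P_pred = A @ P @ A.T + Q
    # Correction: Kalman gain and measurement update
    C = jac_h(z_pred)                    # Jacobian of h along the estimated trajectory
    S = C @ P_pred @ C.T + R
    L = P_pred @ C.T @ np.linalg.inv(S)
    z_new = z_pred + L @ (y_next - h(z_pred))
    P_new = (np.eye(len(z)) - L @ C) @ P_pred
    return z_new, P_new
```

On a linear system the Jacobians are constant and this reduces to the classical discrete Kalman filter.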
The first strategy we present was proposed by M. G. Pappas and J. E. Doss in their article [99] (1988). Although they do not explicitly consider the observability of the system, it is nonetheless an underlying concern of their work:
− when the system is at steady state, it is less observable and a slow observer is required to give an accurate estimate, noise smoothing being an additional derived benefit,
− when the state of the system changes (different operating point, modification of the physical characteristics of the process, ...), it becomes more observable. A faster observer can be considered, and is needed to efficiently track the rapidly changing state variables.
The implementation they propose consists of two observers running in parallel. Changes in the state of the system are seen as faults, detected via a modified fault detection algorithm [62]. The original algorithm is decomposed into three steps.
1. The fault input sequence is formed as
$$ w_k = \alpha_1 w_{k-1} + (z_k - z_{k-1}), $$
where $w_k$ is an estimate of the direction of the parameter change. At steady state, $(z_k - z_{k-1})$ corresponds to the measurement noise, and therefore its sign is expected to change frequently; depending on the value of $\alpha_1$, so does the sign of $w_k$.
2. The fault test sequence is
$$ s_k = \mathrm{sign}\big((z_k - z_{k-1})\, w_{k-1}\big). $$
− a negative $s_k$ indicates that the parameter estimate variation is changing direction: no fault occurs,
− a positive $s_k$ over many successive tests indicates that the variation is constant, and a fault is detected.
3. The fault sequence is filtered with the equation:
$$ r_k = \alpha_2 r_{k-1} + (1 - \alpha_2)\, s_k, $$
where $\alpha_2$ determines the speed of fault detection and the rate of false alarms²².

This strategy is modified in the three following ways:
1. in order to reduce the influence of small noise components in the signal, the number of significant figures used in the calculation of $(z_k - z_{k-1})$ is truncated,
2. in order to cope with the situation where $(z_k - z_{k-1}) = 0$, the sequence $s$ is modified as
$$ s_k = \mathrm{sign}\big({-10^{-8}} + (z_k - z_{k-1})\, w_{k-1}\big). $$
When the parameter estimate doesn't change, the sign function remains negative and no fault is detected,
3. the sequence $r(t)$ is clamped at a minimum value of $-0.5$, preventing excessively large drifts at steady state.

²² For high values of $\alpha_2$ we obtain a high-pass-filter-like behavior, and conversely for $\alpha_2$ small.
In this strategy, $\alpha_2$ is a very important parameter, as it is used to decide whether the sign is considered constant or constantly changing.
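The three sequences, together with the three modifications, can be sketched as follows (the values of α₁, α₂ and the truncation level are hypothetical tuning choices, not taken from [99]):

```python
import numpy as np

def fault_detector(z_seq, alpha1=0.5, alpha2=0.9, decimals=4):
    """Sketch of the three fault-detection sequences (alpha1, alpha2 and
    decimals are hypothetical tuning values)."""
    w, r = 0.0, 0.0
    r_seq = []
    for k in range(1, len(z_seq)):
        dz = np.round(z_seq[k] - z_seq[k - 1], decimals)  # truncate small noise
        s = np.sign(-1e-8 + dz * w)        # modified test: negative when dz == 0
        w = alpha1 * w + dz                # fault input sequence (uses old w above)
        r = max(alpha2 * r + (1 - alpha2) * s, -0.5)  # filtered, clamped at -0.5
        r_seq.append(r)
    return r_seq
```

A value of r that stays near +1 over many steps signals a fault (a persistent drift of the estimates), while at steady state r settles at its clamp value.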
The ideas behind the second strategy come directly from articles like the one of R. K. Mehra [92] or the book of P. S. Maybeck [90]. References for this method are [69] (sensor fusion in robotics), [83] (visual motion estimation via a camera sensor), and [37, 102] (in-flight orientation). This method aims at estimating Q, the process state noise covariance matrix, which is viewed as a measure of the uncertainty in the state dynamics between two consecutive updates of the observer. An observation of Q, denoted $Q^*$, is given by the equation (see [37]):
$$ Q^* = (z_{k+1} - z_{k+1}^-)(z_{k+1} - z_{k+1}^-)' + P_{k+1}^- - P_{k+1} - Q_k, $$
which can be rewritten
$$ Q^* = (z_{k+1} - z_{k+1}^-)(z_{k+1} - z_{k+1}^-)' - \big(P_{k+1} - (P_{k+1}^- - Q_k)\big) = (z_{k+1} - z_{k+1}^-)(z_{k+1} - z_{k+1}^-)' - \big(P_{k+1} - A_k P_k A_k'\big). $$
The new value of the matrix Q, denoted $\hat{Q}_{k+1}$, is obtained through a moving average (or low-pass filter) process:
$$ \hat{Q}_{k+1} = \hat{Q}_k + \frac{1}{L_Q}\big(Q^* - \hat{Q}_k\big). $$
$L_Q$ is the size of the window, i.e. the number of updates being averaged; in this procedure, $L_Q$ is a performance parameter that has to be tuned. The quantity $(z_{k+1} - z_{k+1}^-) = L_{k+1}\big(y_{k+1} - h(z_{k+1}^-)\big)$ plays a key role in this strategy. It is called the innovation, and contains specific information on the quality of the estimation. In our work, we use a modified definition of innovation, and prove that it is a quality measurement of the estimation error²³.
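The update of Q̂ can be sketched directly from the rewritten expression of Q* above (the window length L_Q is a hypothetical tuning value):

```python
import numpy as np

def update_Q(Q_hat, z_corr, z_pred, P_corr, A, P_prev, L_Q=25):
    """One adaptive update of the state-noise covariance estimate, following
    Q* = dz dz' - (P_{k+1} - A_k P_k A_k'). Sketch only; L_Q is hypothetical."""
    dz = (z_corr - z_pred).reshape(-1, 1)            # innovation-driven correction
    Q_star = dz @ dz.T - (P_corr - A @ P_prev @ A.T)  # observation of Q
    return Q_hat + (Q_star - Q_hat) / L_Q             # moving average / low-pass
```

When Q* is constant, the estimate converges geometrically toward it, with time constant set by L_Q.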
A third strategy consists of designing a set of nonlinear observers with different values for Q and R. They are run in parallel, and the final estimated state is chosen among all the available estimates. A selection criterion has to be defined: minimization of the innovation is the most straightforward choice (see the observer of Subsection 2.6.6). (We refer the reader to the algorithm of K. J. Bradshaw, I. D. Reid and D. W. Murray [32], and references therein.) Every estimate is associated with a probability density computed with a maximum-likelihood algorithm. The state estimate is then obtained as a combination of all the observers' outputs, weighted by their associated probabilities.

Notice that for nonlinear systems, E[u(x)] is distinct from u(E[x]). An interesting feature of this strategy is that the first quantity can be computed quite naturally.
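Schematically, the selection and the probability-weighted combination look as follows (a sketch of the two criteria mentioned above, not of the algorithm of [32]):

```python
import numpy as np

def select_estimate(estimates, innovations):
    """Pick, among parallel observers, the estimate with the smallest
    innovation norm (the straightforward selection criterion)."""
    best = int(np.argmin(innovations))
    return estimates[best], best

def combine_estimates(estimates, weights):
    """Alternative: probability-weighted combination of the observers'
    outputs, approximating E[x] rather than selecting a single estimate."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * np.asarray(zi, dtype=float) for wi, zi in zip(w, estimates))
```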
²³ Cf. Lemma 33 of Chapter 3.

2.6.5 Boutayeb, Darouach and coworkers

In their research, M. Boutayeb, M. Darouach and coworkers have proposed several types of observers, including extended Kalman filter based observers [30, 31], observers based on the differential mean value theorem [116, 117], H∞ filtering [13], or observers for systems facing bounded
disturbances [18]. This section deals with the observer described in [31], and in particular with the adaptive scheme proposed in [30]. The analysis they propose is set in the discrete-time setting.

First of all, note that the approach we described in the first part of this chapter and the approach followed in the articles cited above are different, in the sense that the systems considered are not expected to display the same observability property. Indeed, in the present case the authors only need the system to be N-locally uniformly observable, and do not perform any change of variables. This implies that the class of nonlinear systems for which the observer is proven to converge is larger than the one considered in the present work (see the numerical examples displayed in [30], for example). This observer can be used for systems that cannot be put into a canonical observability form. The drawback of this approach, then, is that the observer converges only locally and asymptotically (i.e. the state error is not upper bounded by an exponential term).
Definition 26
1. We consider a discrete, nonlinear system as in Definition 25, where $x_k \in \mathbb{R}^n$, $u_k \in \mathbb{R}^{n_u}$ and $y_k \in \mathbb{R}^{n_y}$. The maps $f$ and $h$ are assumed to be continuously differentiable with respect to the variable $x$.
2. The observer is defined as:
$$
\begin{cases}
z_{k+1/k} = f(z_k, u_k) \\
P_{k+1/k} = F_k P_k F_k' + Q_k
\end{cases}
\qquad
\begin{cases}
z_{k+1/k+1} = z_{k+1/k} - K_{k+1}\big(h(z_{k+1/k}) - y_{k+1}\big) \\
P_{k+1/k+1} = (I_n - K_{k+1} H_{k+1})\, P_{k+1/k}
\end{cases}
\tag{2.18}
$$
where
$$ K_{k+1} = P_{k+1/k} H_{k+1}' \big(H_{k+1} P_{k+1/k} H_{k+1}' + R_{k+1}\big)^{-1}, $$
and
$$ F_k = \left. \frac{\partial f(x, u_k)}{\partial x} \right|_{x = z_k}, \qquad H_{k+1} = \left. \frac{\partial h(x, u_k)}{\partial x} \right|_{x = z_{k+1/k}}. $$
This definition is that of a discrete extended Kalman filter. It is completed by a set of assumptions that appear in the statement of the convergence theorem. Since the extended Kalman filter is known to converge when the estimated state is very close to the real state, the following theorem increases the size of this convergence region.
Theorem 27 ([30])
We assume that:
1. the system defined in equation (2.17) is N-locally uniformly rank observable, that is to say that there exists an integer $N \ge 1$ such that
$$
\operatorname{rank}\, \frac{\partial}{\partial x}
\begin{pmatrix}
h(x, u_k) \\
h(\cdot, u_{k+1}) \circ f(x, u_k) \\
\vdots \\
h(\cdot, u_{k+N-1}) \circ f(\cdot, u_{k+N-2}) \circ \cdots \circ f(x, u_k)
\end{pmatrix}_{x = x_k} = n
$$
for all $x_k \in K$ and every N-tuple of controls $(u_k, \dots, u_{k+N-1}) \in U$ (where $K$ and $U$ are two compact subsets of $\mathbb{R}^n$ and $(\mathbb{R}^{n_u})^N$, respectively),
2. $F_k$ and $H_k$ are uniformly bounded matrices, and $F_k^{-1}$ exists.

Let us define:
1. the time-varying matrices $\alpha$ and $\beta$ by:
$$ (x_{k+1} - z_{k+1/k}) = \beta_k F_k (x_k - z_k), \qquad \alpha_{k+1} e_{k+1} = H_{k+1} (x_{k+1} - z_{k+1/k}), $$
2. the weighting matrices $R_k$ and $Q_k$ such that there exists a parameter $0 < \zeta < 1$ with
$$ \sup_{i = 1, \dots, n_y} \left| (\alpha_{k+1})_i - 1 \right| \le \left( \frac{\underline{\sigma}(R_{k+1})}{\overline{\sigma}\big(H_{k+1} P_{k+1/k} H_{k+1}' + R_{k+1}\big)} \right)^{\frac{1}{2}}, $$
and that²⁴
$$ \sup_{j = 1, \dots, n} \left| (\beta_k)_j \right| \le \left( \frac{(1 - \zeta)\, \underline{\sigma}\big(F_k P_k F_k' + Q_k\big)}{\overline{\sigma}(F_k')\, \overline{\sigma}(P_k)\, \overline{\sigma}(F_k)} \right)^{\frac{1}{2}}. $$
Then the observer (2.18) ensures local asymptotic convergence:
$$ \lim_{k \to \infty} (x_k - z_k) = 0. $$
In [30] the authors propose to choose $Q_k = \gamma\, e_k' e_k\, I_n + \delta I_n$, where $e_k = h(z_{k/k-1}) - y_k$ is the innovation. $\gamma$ is taken sufficiently large, and $\delta$ sufficiently small, such that the inequalities of Assumption (4) are met for all values of $e_k$. Special attention must be paid to the fact that $Q_k$ should not be set to a high value when the innovation is small²⁵.

The adaptation strategy we propose in Chapter 3 is based on the same kind of strategy, but uses the innovation over a sliding horizon as the quality measurement for the estimation. By doing so, we can link the adaptive scheme to the proof of convergence of the observer, which is not done for the observer of this section.
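The choice of Q_k translates into one line of code (γ and δ below are hypothetical tuning values, not those of [30]):

```python
import numpy as np

def boutayeb_Q(e_k, n, gamma=1e2, delta=1e-6):
    """Q_k = gamma * e_k' e_k * I_n + delta * I_n, with e_k the innovation.
    gamma large / delta small are hypothetical tuning values."""
    e = np.atleast_1d(e_k)
    return (gamma * float(e @ e) + delta) * np.eye(n)
```

Q_k inflates quadratically with the innovation and falls back to δ I_n when the estimate matches the output, which is exactly the caveat above: Q_k stays small when the innovation is small.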
²⁴ $\overline{\sigma}$ and $\underline{\sigma}$ denote the maximum and minimum singular values, respectively.
²⁵ The article of L. Z. Guo and Q. M. Zhu [61] proposes a hybrid strategy based on this subsection's observer. They use the structure proposed by M. Boutayeb et al. together with a neural network approach.
2.6.6 E. Busvelle and J.-P. Gauthier
The article [38] of E. Busvelle and J.-P. Gauthier proposes an observer that is high-gain at time 0, with a gain that then decreases toward 1. In a nutshell, the observer evolves from a pure high-gain mode, which ensures convergence, to an extended Kalman filter configuration, which efficiently smooths the noise. To achieve this, the high-gain parameter is allowed to decrease, and convergence is proven. This article is, in some sense, the starting point of the present Ph.D. work.
Definition 28
The high-gain and non high-gain extended Kalman filter, for a system as defined in Section 2.4, is given by the three equations:
$$
\begin{cases}
\dfrac{dz}{dt} = A(u) z + b(z, u) - S(t)^{-1} C' R^{-1} (C z - y(t)) \\[1mm]
\dfrac{dS}{dt} = -(A(u) + b^*(z, u))'\, S - S\, (A(u) + b^*(z, u)) + C' R^{-1} C - S Q_\theta S \\[1mm]
\dfrac{d\theta}{dt} = \lambda (1 - \theta)
\end{cases}
\tag{2.19}
$$
where $Q_\theta = \theta^2 \Delta^{-1} Q \Delta^{-1}$, $\Delta = \operatorname{diag}\big(1, \frac{1}{\theta}, \dots, (\frac{1}{\theta})^{n-1}\big)$, $Q$ and $R$ are as in (2.9), and $\lambda$ is a positive parameter.
If θ(0) = 1, then θ(t) ≡ 1 and (2.19) is nothing else than a classical extended Kalman filter applied in a canonical form of coordinates. Therefore it may not converge, depending on the initial conditions of the system.

If λ = 0 and θ(0) = θ₀ is large, then θ(t) ≡ θ₀ remains large and (2.19) is a high-gain extended Kalman filter as defined above in equation (2.9).

The idea in (2.19) is to set θ(0) to a sufficiently large value, and λ to a sufficiently small value, so that the observer converges exponentially fast at the beginning. The estimated state then reaches the vicinity of the real trajectory before θ becomes too small, and the local convergence of the extended Kalman filter guarantees that the estimate remains close to the state of the system.
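Since the θ-equation of (2.19) is decoupled from z and S, its solution is explicit, which makes the trade-off on λ concrete:

```latex
\dot{\theta} = \lambda(1 - \theta), \quad \theta(0) = \theta_0
\qquad \Longrightarrow \qquad
\theta(t) = 1 + (\theta_0 - 1)\, e^{-\lambda t}.
```

The high-gain parameter thus decays monotonically from θ₀ toward 1 at rate λ: a small λ keeps the observer in high-gain mode longer, while a large λ brings it back sooner to the extended Kalman filter configuration.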
Theorem 29
For any $\varepsilon^* > 0$, there exists $\lambda_0$ such that for all $0 \le \lambda \le \lambda_0$, for all $\theta_0$ large enough, for all $S_0 \ge c\, Id$, for any compact subset $\chi \subset \mathbb{R}^n$, and for $\varepsilon_0 = z_0 - x_0$ with $(z_0, x_0) \in \chi^2$, the following estimate holds for all $t \ge 0$:
$$ \|\varepsilon(t)\|^2 \le \|\varepsilon(0)\|^2\, \varepsilon^*\, e^{-a t}. $$
Moreover, the short-term estimate
$$ \|\varepsilon(t)\|^2 \le \|\varepsilon(0)\|^2\, \theta(t)^{2(n-1)}\, e^{-(a_1 \theta(T) - a_2)\, t} $$
holds for all $T > 0$ and all $0 \le t \le T$, for all $\theta_0$ sufficiently large. The scalars $a_1$ and $a_2$ are positive constants.
This theorem demonstrates that the observer converges for any initial error. Nevertheless, it is clearly not a persistent observer since, after some time, it is more or less equivalent to an extended Kalman filter because θ(t) is close to one. In order to make it persistent, the authors propose to use several such observers, each of them initialized at different times,
in such a way that, at any moment, at least one of the observers has θ large and at least one observer has θ close to 1. The state of the observer having the smallest output error²⁶ is then selected. As shown in [38], this observer performs well in our applications. It is successfully applied for identification purposes in [40]. Nevertheless, this procedure is not entirely satisfactory since:
− even if the overall construction gives a persistent observer, it is a time-dependent observer,
− it can be time consuming, since it requires at least five (empirical value) observers running in parallel,
− the parameter λ has to be chosen sufficiently small, which means that after a perturbation we cannot return as quickly as we would like to a classical extended Kalman filter, even if the observer performs well,
− the criterion used to select the best prediction among the observers is not theoretically justified.
This second chapter, which focused on the theoretical framework that encompasses our work, ends with the analysis of this last observer. In this chapter, we also provided insight into several adaptation strategies. In the next chapter we introduce the adaptive high-gain extended Kalman filter and develop the full proof of convergence.

²⁶ The output error $(Cz - y)$ is the equivalent of the innovation for continuous-time systems.
Chapter 3

Adaptive High-gain Observer: Definition and Convergence

Contents
3.1 Systems Under Consideration
3.2 Observer Definition
3.3 Innovation
3.4 Main Result
3.5 Preparation for the Proof
3.6 Boundedness of the Riccati Matrix
3.7 Technical Lemmas
3.8 Proof of the Theorem
3.9 Conclusion
Up to this point, we have already studied the system, i.e., the system is in the normal form, either naturally or after a change of variables. According to the theory, in order to reconstruct the state of this system we can use any of the exponentially converging observers of Chapter 2. However, we won't.
In this chapter, we solve the convergence part of the observability problem. Our goal is to define an observer that combines the antagonistic behaviors of the extended Kalman filter (EKF) and the high-gain extended Kalman filter (HG-EKF).

The EKF is extensively used, 1) because of its attractive filtering properties (as explained in articles such as [101]), and 2) because it actually performs well in practice. However, a proof of convergence for this algorithm is known only for small initial estimation errors (as can be seen in [17, 38], or within the proof of the main theorem below). Additionally, from a practical point of view, the EKF handles large perturbations with difficulty, as has been observed in simulations and experiments.

On the contrary, the HG-EKF possesses improved global properties [47]: it converges regardless of the initial guess and independently of large perturbations. On the other hand, it is rather sensitive to noise.

Recall that the high-gain structure uses a single parameter denoted θ (θ > 1), referred to as the high-gain parameter. The HG-EKF does its global job if and only if θ is sufficiently large. When θ is set to 1, it is formally equivalent to the standard EKF.

The idea here is to make the parameter θ adaptive. Thus,
− when the estimated state is far from the real state, θ is made sufficiently large so that the observer converges for any initial guess,
− when the estimate is sufficiently close to the real state, we allow θ to decrease. Once this condition is satisfied, the local convergence of the extended Kalman filter applies and the noise is more efficiently smoothed.
It is natural to perform the adaptation through a differential equation of the form
$$ \dot{\theta} = \mathcal{F}(\theta, \mathcal{I}), \tag{3.1} $$
where $\mathcal{I}$ is some quantity reflecting the amplitude of the estimation error: the smaller $\mathcal{I}$, the smaller the error.

We introduce a simple and natural concept of "innovation" for the quantity $\mathcal{I}$. This innovation concept is different from the one that is usually used¹; it allows us to reflect the estimation error more precisely.

The convergence of this observer is established in the continuous-time setting for multiple-input, single-output systems². This choice is made for the sake of simplicity of the exposition, since a few modifications have to be made in order to cope with multiple-output systems. Such modifications are explained in Chapter 5.

¹ Most of the time, innovation is defined as
− $\mathcal{I} = y - h(z, u)$ for the continuous case, with the notations of Definition 15,
− $\mathcal{I} = y - h(z_k^-, u_k)$ for the discrete case, with the notations of Definition 25.
² We have the generic case when $n_u > 1$ and the non-generic case when $n_u = 1$, cf. Chapter 2.
3.1 Systems Under Consideration

We consider multiple-input, single-output nonlinear systems as in Section 2.4, and properly define all the constants:
$$
\begin{cases}
\dfrac{dx}{dt} = A(u)\, x + b(x, u) \\[1mm]
y = C(u)\, x
\end{cases}
\tag{3.2}
$$
where $x(t) \in X \subset \mathbb{R}^n$, $X$ compact, $y(t) \in \mathbb{R}$, and $u(t) \in U_{adm} \subset \mathbb{R}^{n_u}$ is bounded for all times. The matrices $A(u)$ and $C(u)$ are:
$$
A(u) = \begin{pmatrix}
0 & a_2(u) & 0 & \cdots & 0 \\
\vdots & 0 & a_3(u) & \ddots & \vdots \\
 & & \ddots & \ddots & 0 \\
 & & & 0 & a_n(u) \\
0 & \cdots & & & 0
\end{pmatrix},
\qquad
C(u) = \begin{pmatrix} a_1(u) & 0 & \cdots & 0 \end{pmatrix},
$$
with³ 0
Definition 30
The adaptive high-gain extended Kalman filter is the system:
$$
\begin{cases}
\dfrac{dz}{dt} = A(u) z + b(z, u) - S^{-1} C' R_\theta^{-1} (C z - y(t)) \\[1mm]
\dfrac{dS}{dt} = -(A(u) + b^*(z, u))'\, S - S\, (A(u) + b^*(z, u)) + C' R_\theta^{-1} C - S Q_\theta S \\[1mm]
\dfrac{d\theta}{dt} = \mathcal{F}(\theta, \mathcal{I}_d(t)).
\end{cases}
\tag{3.3}
$$
The functions $\mathcal{F}$ and $\mathcal{I}_d$ will be defined later, in Section 3.3 and Lemma 42. The function $\mathcal{I}_d$ is called the innovation. The initial conditions are $z(0) \in \chi$, $S(0)$ symmetric positive definite, and $\theta(0) = 1$.

The function $\mathcal{F}$ only has to satisfy certain requirements, stated precisely in Lemma 42. Therefore, several different choices for an adaptation function are possible.
Roughly speaking, $\mathcal{F}(\theta, \mathcal{I}_d(t))$ should be such that if the estimate z(t) is far from x(t), then θ(t) increases (high-gain mode). Conversely, if z(t) is close to x(t), θ goes to 1 (Kalman filtering mode). As is clear from the proof of Theorem 36, this observer makes sense only when θ(t) ≥ 1 for all t ≥ 0; this is therefore another requirement that $\mathcal{F}(\theta, \mathcal{I}_d)$ has to meet. Achieving this behavior requires that we evaluate the quality of the estimation, which is the object of the next section.
Remark 31
1. Readers familiar with high-gain observers may notice that the matrices $R_\theta$ and $Q_\theta$ are not exactly the same as in earlier articles such as [38, 47, 57]. The definitions developed here can also be substituted into those previous works without consequence.
2. The hypothesis θ(0) = 1 may appear a bit atypical compared to the results in [38, 57], for instance.
− For technical reasons, the result of Lemma 39 depends on the initial value of θ, and θ(0) = 1 has no impact on α and β.
− Secondly, note that in the case of large perturbations, θ will increase. In these instances, the initial value of θ is of little importance, as we show in Lemma 42 that the adaptation function can be chosen in such a way that θ reaches any large value in an arbitrarily small time.
− Finally, in the ideal case of no initial error, θ doesn't increase, which saves us from useless noise sensitivity due to a large initial high-gain value.
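Although F is fully specified only in Lemma 42, the intended behavior can be sketched as follows (the functional form, thresholds and constants are hypothetical illustrations, not the adaptation function used in the convergence proof):

```python
import numpy as np

def adaptation_F(theta, I_d, gamma0=1e-2, theta_max=50.0, lam=10.0):
    """Hypothetical adaptation law: theta rises toward theta_max when the
    innovation I_d exceeds the threshold gamma0, and relaxes toward 1 otherwise."""
    if I_d > gamma0:
        return lam * (theta_max - theta)   # high-gain mode: increase theta
    return lam * (1.0 - theta)             # Kalman filtering mode: decay to 1

def integrate_theta(I_d_signal, dt=1e-3, theta0=1.0, **kw):
    """Forward-Euler integration of d(theta)/dt = F(theta, I_d(t)),
    clipped so that theta(t) >= 1 at all times."""
    theta, out = theta0, []
    for I_d in I_d_signal:
        theta = max(1.0, theta + dt * adaptation_F(theta, I_d, **kw))
        out.append(theta)
    return out
```

Note that the integration enforces the requirement θ(t) ≥ 1 discussed above.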
3.3 Innovation

The innovation $\mathcal{I}_d$ is a measurement of the quality of the estimation. It is different⁴ from the standard concept of innovation, which is based on a linearization around the estimated trajectory.

⁴ The same definition of innovation is used for moving-horizon observers, where the estimated state is the solution of a minimization problem. The cost function used there represents the proximity to the real state; it is the innovation we use ([12, 95]). In related publications, observability is defined by the inequality of Lemma 33; in our case, this inequality is a consequence of observability theory.
Definition 32
For a "forgetting horizon" $d > 0$, the innovation is:
$$ \mathcal{I}_d(t) = \int_{t-d}^{t} \big\| y(t-d, x(t-d), \tau) - y(t-d, z(t-d), \tau) \big\|^2\, d\tau \tag{3.4} $$
where $y(t_0, x_0, \tau)$ denotes the output of system (3.2) at time τ with $x(t_0) = x_0$.

Hence $y(t-d, x(t-d), \tau)$ denotes y(τ), the output of the process. Notice that $y(t-d, z(t-d), \tau)$ is not the output of the observer.
For a good implementation, it is important to understand the significance of this definition. Figure 3.1 illustrates the situation at time t: the innovation is obtained as the square of the L² distance between the black (plain) and the red (dot-and-dash) curves, which respectively represent the output of the system on the time interval [t−d, t] and the prediction $y(t-d, z(t-d), \cdot)$ performed with z(t−d) as the initial state.

Figure 3.1: The computation of innovation (measured output trajectory versus the estimated output $y(t-d, z(t-d), t)$ over the window [t−d, t]).
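In a discretized implementation, computing Id(t) amounts to re-simulating the model from z(t − d) over the window and accumulating the squared output mismatch. A minimal forward-Euler sketch (with caller-supplied dynamics; this is not the implementation discussed in Chapter 4):

```python
import numpy as np

def innovation(y_meas, z_past, u_window, dyn, out, dt):
    """I_d(t) ~ sum ||y(tau) - y(t-d, z(t-d), tau)||^2 * dt over [t-d, t].
    y_meas: measured outputs on the window; z_past: the estimate z(t-d);
    dyn(x, u) = dx/dt; out(x) = system output (sketch only)."""
    x = np.asarray(z_past, dtype=float)
    I_d = 0.0
    for y_k, u_k in zip(y_meas, u_window):
        I_d += float(np.sum((y_k - out(x)) ** 2)) * dt  # squared L2 mismatch
        x = x + dt * dyn(x, u_k)                        # open-loop prediction, NOT the observer
    return I_d
```

If the stored estimate z(t − d) matches the true state, the re-simulated output coincides with the measurements and the innovation vanishes.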
The importance of innovation in this construction is explained by the following lemma. As we will see in Section 3.8, it is the cornerstone of the proof.

Lemma 33
Let $x_1^0, x_2^0 \in \mathbb{R}^n$ and $u \in U_{adm}$. Let us consider the outputs $y(0, x_1^0, \cdot)$ and $y(0, x_2^0, \cdot)$ of system (3.2) with initial conditions $x_1^0$ and $x_2^0$, respectively. Then the following property (called persistent observability) holds: for all $d > 0$, there exists $\lambda_d^0 > 0$ such that for all $u \in L_b^1(U_{adm})$,
$$ \big\| x_1^0 - x_2^0 \big\|^2 \le \frac{1}{\lambda_d^0} \int_0^d \big\| y(0, x_1^0, \tau) - y(0, x_2^0, \tau) \big\|^2\, d\tau. \tag{3.5} $$
Setting $x_1^0 = z(t-d)$ and $x_2^0 = x(t-d)$, Lemma 33 gives:
$$ \| z(t-d) - x(t-d) \|^2 \le \frac{1}{\lambda_d^0} \int_{t-d}^{t} \big\| y(\tau) - y(t-d, z(t-d), \tau) \big\|^2\, d\tau, $$
or, equivalently,
$$ \| z(t-d) - x(t-d) \|^2 \le \frac{1}{\lambda_d^0}\, \mathcal{I}_d(t). $$
That is to say, up to a multiplicative constant, the innovation at time t upper bounds the estimation error at time t − d.
Remark 34
One could think that the adaptation scheme is likely to react with a delay of d when the estimation error becomes large. However, as explained in Chapter 4, Remark 45, this is not always the case in practice.
Proof.
Let $x_1(t) = x_{x_1^0, u}(t)$ and $x_2(t) = x_{x_2^0, u}(t)$ be the solutions of (3.2) with $x_i(0) = x_i^0$, $i = 1, 2$. For any $a \in [0, 1]$,
$$
b(a x_2 + (1-a) x_1, u) = b(x_1, u) + \int_0^a \frac{\partial}{\partial \alpha}\, b(\alpha x_2 + (1-\alpha) x_1, u)\, d\alpha
= b(x_1, u) + \left[ \int_0^a \frac{\partial b}{\partial x}(\alpha x_2 + (1-\alpha) x_1, u)\, d\alpha \right] (x_2 - x_1).
$$
Hence, for $a = 1$,
$$
b(x_2, u) - b(x_1, u) = \left[ \int_0^1 \frac{\partial b}{\partial x}(\alpha x_2 + (1-\alpha) x_1, u)\, d\alpha \right] (x_2 - x_1) = B(t)\, (x_2 - x_1),
$$
where $B(t) = (b_{i,j})_{(i,j) \in \{1,\dots,n\}}$ is a lower triangular matrix since
$$ b(x, u) = \big(b_1(x_1, u),\ b_2(x_1, x_2, u),\ \dots,\ b_n(x, u)\big)'. $$
Set $\varepsilon = x_1 - x_2$ and consider the system:
$$
\begin{cases}
\dot{\varepsilon} = A(u) x_1 + b(x_1, u) - A(u) x_2 - b(x_2, u) = [A(u) + B(t)]\, \varepsilon \\
y_\varepsilon = C(u)\, \varepsilon = a_1(u)\, \varepsilon_1.
\end{cases}
$$
It is uniformly observable⁵ as a result of the structure of $B(t)$. Let us consider $\Psi(t)$, the resolvent of the system, and the observability Gramian $G_d$:
$$ G_d = \int_0^d \Psi(v)'\, C'\, C\, \Psi(v)\, dv. $$

⁵ See [57] for instance, or compute the observability matrix $\phi_O = \big( C' \mid (CA)' \mid \dots \mid (CA^n)' \big)'$ and check the full-rank condition for all inputs.
tel-00559107, version 1 - 24 Jan 2011<br />
3.3 Innovation<br />
Since \(\|B(t)\| \le L_b\), each \(b_{i,j}(t)\) can be interpreted as a bounded element of \(L^\infty_{[0,d]}(\mathbb{R})\). We identify \(\left( L^\infty_{[0,d]}(\mathbb{R}) \right)^{\frac{n(n+1)}{2}}\) with \(L^\infty_{[0,d]}\left( \mathbb{R}^{\frac{n(n+1)}{2}} \right)\) and consider the function:
\[
\begin{array}{rcl}
\Lambda :\ L^\infty_{[0,d]}\left( \mathbb{R}^{\frac{n(n+1)}{2}} \right) \times L^\infty_{[0,d]}\left( \mathbb{R}^{n_u} \right) & \longrightarrow & \mathbb{R}^+ \\[1mm]
\left( (b_{i,j})_{j \le i},\ u \right) & \longmapsto & \lambda_{min}(G_d)
\end{array}
\]
where \(\lambda_{min}(G_d)\) is the smallest eigenvalue of \(G_d\). Let us endow \(L^\infty_{[0,d]}\left( \mathbb{R}^{\frac{n(n+1)}{2}} \right) \times L^\infty_{[0,d]}\left( \mathbb{R}^{n_u} \right)\) with the weak-* topology⁶ and \(\mathbb{R}\) with the topology induced by the uniform convergence. The weak-* topology on a bounded set implies uniform continuity of the resolvent, hence \(\Lambda\) is continuous⁷.
Since the control variables are supposed to be bounded,
\[
\Omega_1 = \left\{ B \in L^\infty_{[0,d]}\left( \mathbb{R}^{\frac{n(n+1)}{2}} \right) ;\ \|B\| \le L_b \right\}
\quad\text{and}\quad
\Omega_2 = \left\{ u \in L^\infty_{[0,d]}\left( \mathbb{R}^{n_u} \right) ;\ \|u\| \le M_u \right\}
\]
are compact subsets. Therefore \(\Lambda(\Omega_1 \times \Omega_2)\) is a compact subset of \(\mathbb{R}\) which does not contain 0, since the system is observable for any input. Thus \(G_d\) is never singular. Moreover, for \(M_u\) sufficiently large, \(\left\{ u \in L^\infty_{[0,d]}\left( \mathbb{R}^{n_u} \right) ;\ \|u\| \le M_u \right\}\) includes \(L^\infty_{[0,d]}(U_{adm})\).
Hence, there exists \(\lambda_d^0\) such that \(G_d \ge \lambda_d^0\, Id\) for any \(u\) and any matrix \(B(t)\) as above.
Since
\[
y\left(0, x_1^0, \tau\right) - y\left(0, x_2^0, \tau\right) = C\Psi(\tau)x_1^0 - C\Psi(\tau)x_2^0,
\]
then
\[
\left\| y\left(0, x_1^0, \tau\right) - y\left(0, x_2^0, \tau\right) \right\|^2 = \left\| C\Psi(\tau)x_1^0 - C\Psi(\tau)x_2^0 \right\|^2,
\]
and finally
\[
\int_0^d \left\| y\left(0, x_1^0, \tau\right) - y\left(0, x_2^0, \tau\right) \right\|^2 d\tau
= \left( x_1^0 - x_2^0 \right)' G_d \left( x_1^0 - x_2^0 \right)
\ge \lambda_d^0 \left\| x_1^0 - x_2^0 \right\|^2. \tag{3.6}
\]
□
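The key quantity in this argument, \(\lambda_{min}(G_d)\), is easy to probe numerically. A hedged sketch for a constant pair \((A, C)\), so that the resolvent reduces to a matrix exponential (the second-order exponential step and the left-endpoint quadrature are illustrative simplifications, and the example system is a two-dimensional chain of integrators):

```python
import numpy as np

def observability_gramian(A, C, d, steps=2000):
    """G_d = int_0^d Psi(v)' C' C Psi(v) dv, with Psi the resolvent
    (here a matrix exponential, since A is taken constant for the sketch)."""
    n = A.shape[0]
    G = np.zeros((n, n))
    dv = d / steps
    Psi = np.eye(n)
    step = np.eye(n) + A * dv + (A @ A) * dv**2 / 2  # 2nd-order expm approximation
    for _ in range(steps):
        G += Psi.T @ C.T @ C @ Psi * dv   # left-endpoint quadrature of the Gramian
        Psi = Psi @ step
    return G

# observable chain: x1' = x2, x2' = 0, y = x1
A = np.array([[0.0, 1.0], [0.0, 0.0]])
C = np.array([[1.0, 0.0]])
G = observability_gramian(A, C, d=1.0)
print(np.linalg.eigvalsh(G).min())  # about 0.066 here: G_d is nonsingular
```

A strictly positive smallest eigenvalue is exactly the constant \(\lambda_d^0\) of the lemma for this particular input; the proof's compactness argument is what guarantees a uniform such constant over all admissible inputs and perturbations \(B(t)\).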
Remark 35
As is clear from the proof, we could have used both linearizations along the trajectories \(x_1\) and \(x_2\) in order to define \(I\). However, that definition would lead to the same inequality; in addition, our definition is more practical to implement.
The solution of the Riccati equation cannot be used either to obtain information equivalent to the innovation. Here, to compute our innovation, we make an exact prediction (i.e., without the correction term, which could disturb the estimation).
6 The definition of the weak-* topology is given in Appendix A.<br />
7 This property is explained in Appendix A.<br />
3.4 Main Result<br />
The exponential convergence of the adaptive high-gain extended Kalman filter is expressed in the theorem below.
Theorem 36
For any time \(T^* > 0\) and any \(\varepsilon^* > 0\), there exist \(0 < d < T^*\) and a function \(F(\theta, I_d)\) such that, for all times \(t \ge T^*\) and any initial state couple \((x_0, z_0) \in \chi^2\):
\[
\left\| x(t) - z(t) \right\|^2 \le \varepsilon^* e^{-a(t - T^*)},
\]
where \(a > 0\) is a constant (independent of \(\varepsilon^*\)).
This theorem can be expressed in two different ways: with or without a term \(\|\varepsilon_0\|^2\) in the upper bound. The bound of Theorem 36 was presented in [23] and [22]⁸. The expression we use here should be interpreted to mean that the square of the error can be made arbitrarily small in an arbitrarily small time. For the sake of completeness, we develop the other inequality in Remark 44.
The proof is a Lyapunov stability analysis which requires several preliminary computations<br />
<strong>and</strong> additional results. In order to facilitate comprehension, we divide the proof into<br />
several parts:<br />
1. the computation of several preliminary inequalities, in particular the expression of the<br />
Lyapunov function we want to study,<br />
2. the derivation of the properties of the Riccati matrix S,<br />
3. the statement of several intermediary lemmas, among which is the lemma that states<br />
the existence of eligible adaptive functions, <strong>and</strong> finally<br />
4. the articulation of the proof.<br />
We begin with the computation of some preliminary inequalities in Section 3.5.<br />
3.5 Preparation for the Proof<br />
Remember⁹ that \(\theta \ge 1\) for all \(t \ge 0\). We denote by \(z\) the time-dependent state variable of the observer. The estimation error is \(\varepsilon = z - x\). We consider the change of variables:
− \(\tilde{x} = \Delta x\), \(\tilde{z} = \Delta z\), and \(\tilde{\varepsilon} = \Delta\varepsilon\),
− \(\tilde{S} = \Delta^{-1} S \Delta^{-1}\),
− \(\tilde{b}(\cdot, u) = \Delta\, b(\Delta^{-1}\cdot, u)\),
− \(\tilde{b}^*(\cdot, u) = \Delta\, b^*\left( \Delta^{-1}\cdot, u \right) \Delta^{-1}\).
8 The other bound is \(\|x(t) - z(t)\|^2 \le \|\varepsilon_0\|^2\, \varepsilon^* e^{-\alpha q_m (t - \tau)}\).
9 This is one of the requirements \(F(\theta, I_d)\) has to meet. The existence of such a function is shown in Lemma 42.
Since \(C = \begin{pmatrix} a_1(u) & 0 & \dots & 0 \end{pmatrix}\) and \(\Delta = \mathrm{diag}\left( 1, \theta^{-1}, \dots, \theta^{-(n-1)} \right)\), then \(C\Delta = C\). We have the following identity for \(A(u)\) and \(\Delta\):
\[
A(u)\Delta =
\begin{pmatrix}
0 & a_2(u) & & \\
& 0 & \ddots & \\
& & \ddots & a_n(u) \\
& & & 0
\end{pmatrix}
\begin{pmatrix}
1 & & & \\
& \frac{1}{\theta} & & \\
& & \ddots & \\
& & & \frac{1}{\theta^{n-1}}
\end{pmatrix}
=
\begin{pmatrix}
0 & \frac{a_2(u)}{\theta} & & \\
& 0 & \ddots & \\
& & \ddots & \frac{a_n(u)}{\theta^{n-1}} \\
& & & 0
\end{pmatrix}
= \frac{1}{\theta}\,\Delta A(u).
\]
This relation leads to the set of equalities:
\[
(a)\ \Delta A = \theta A \Delta, \qquad
(b)\ A'\Delta = \theta \Delta A', \qquad
(c)\ A\Delta^{-1} = \theta \Delta^{-1} A, \qquad
(d)\ \Delta^{-1}A' = \theta A' \Delta^{-1}.
\]
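These purely structural identities are easy to sanity-check numerically; the sketch below builds a random \(A(u)\) with the bidiagonal structure above and verifies (a) through (d) (the dimension and coefficient range are arbitrary):

```python
import numpy as np

theta, n = 3.0, 4
rng = np.random.default_rng(0)
A = np.diag(rng.uniform(1.0, 2.0, n - 1), k=1)   # superdiagonal a_2(u), ..., a_n(u)
Delta = np.diag(theta ** -np.arange(n))           # diag(1, 1/theta, ..., 1/theta^(n-1))
Delta_inv = np.diag(theta ** np.arange(n))

checks = [
    np.allclose(Delta @ A, theta * A @ Delta),             # (a)
    np.allclose(A.T @ Delta, theta * Delta @ A.T),         # (b)
    np.allclose(A @ Delta_inv, theta * Delta_inv @ A),     # (c)
    np.allclose(Delta_inv @ A.T, theta * A.T @ Delta_inv), # (d)
]
print(all(checks))
```

The check passes for any \(\theta\) and any superdiagonal coefficients, reflecting that the identities only use the relative \(\theta\)-scaling of adjacent coordinates.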
Because we want to express the time derivative of \(\tilde\varepsilon\), we need to know the time derivative of \(\Delta\), as \(\theta\) is time dependent. We simply write
\[
\frac{d\Delta}{dt}
= \mathrm{diag}\left( \frac{d(1)}{dt},\ \frac{d}{dt}\left(\frac{1}{\theta}\right),\ \dots,\ \frac{d}{dt}\left(\frac{1}{\theta^{n-1}}\right) \right)
= \mathrm{diag}\left( 0,\ -\frac{\dot\theta}{\theta^2},\ \dots,\ -\frac{(n-1)\dot\theta}{\theta^n} \right),
\]
which can be rewritten as a multiplication of matrices with the use of \(N = \mathrm{diag}\left( \{0, 1, 2, \dots, n-1\} \right)\). We obtain the two identities¹⁰:
\[
(a)\ \frac{d}{dt}\left( \Delta \right) = -\frac{F(\theta, I)}{\theta}\, N\Delta, \tag{3.7}
\]
\[
(b)\ \frac{d}{dt}\left( \Delta^{-1} \right) = \frac{F(\theta, I)}{\theta}\, N\Delta^{-1}. \tag{3.8}
\]
The dynamics of the error are given by:
\[
\dot\varepsilon = \dot{z} - \dot{x} = \left[ A(u) - S^{-1}C'R_\theta^{-1}C \right]\varepsilon + b(z, u) - b(x, u),
\]
and the error dynamics after the change of variables are:
\[
\begin{aligned}
\frac{d\tilde\varepsilon}{dt} &= \frac{d\Delta}{dt}\varepsilon + \Delta\dot\varepsilon \\
&= -\frac{\dot\theta}{\theta}\, N\Delta\varepsilon + \Delta\left[ A(u) - S^{-1}C'R_\theta^{-1}C \right]\varepsilon + \Delta\left( b(z, u) - b(x, u) \right) \\
&= -\frac{\dot\theta}{\theta}\, N\tilde\varepsilon + \theta A(u)\tilde\varepsilon - \theta\,\Delta S^{-1}\Delta\,\Delta^{-1}C'R^{-1}C\Delta^{-1}\,\Delta\varepsilon
+ \Delta b\left( \Delta^{-1}\tilde z, u \right) - \Delta b\left( \Delta^{-1}\tilde x, u \right) \\
&= \theta\left[ -\frac{F(\theta, I)}{\theta^2}\, N\tilde\varepsilon + A\tilde\varepsilon - \tilde S^{-1}C'R^{-1}C\tilde\varepsilon
+ \frac{1}{\theta}\left( \tilde b(\tilde z, u) - \tilde b(\tilde x, u) \right) \right].
\end{aligned} \tag{3.9}
\]
10 Remember that \(\dot\theta = F(\theta, I)\).
The Riccati equation turns into
\[
\begin{aligned}
\frac{d\tilde S}{dt} &= \frac{\dot\theta}{\theta}\, N\Delta^{-1}S\Delta^{-1} + \frac{\dot\theta}{\theta}\, \Delta^{-1}S\Delta^{-1}N
- \Delta^{-1}\left( A + b^*(z, u) \right)' S\Delta^{-1}
- \Delta^{-1}S\left( A + b^*(z, u) \right)\Delta^{-1} \\
&\quad + \Delta^{-1}C'R_\theta^{-1}C\Delta^{-1} - \Delta^{-1}SQ_\theta S\Delta^{-1} \\
&= \frac{\dot\theta}{\theta}\left( N\tilde S + \tilde S N \right)
- \left( A\Delta^{-1} + b^*(z, u)\Delta^{-1} \right)' \Delta\tilde S
- \tilde S\Delta\left( A\Delta^{-1} + b^*(z, u)\Delta^{-1} \right)
+ C'R_\theta^{-1}C - \tilde S\,\Delta Q_\theta\Delta\,\tilde S \\
&= \theta\left[ \frac{\dot\theta}{\theta^2}\left( N\tilde S + \tilde S N \right)
- \left( A'\tilde S + \tilde S A \right) + C'R^{-1}C - \tilde S Q \tilde S
- \frac{1}{\theta}\,\tilde S\,\tilde b^*(\tilde z, u) - \frac{1}{\theta}\,\tilde b^{*\prime}(\tilde z, u)\,\tilde S \right] \\
&= \theta\left[ \frac{F(\theta, I)}{\theta^2}\left( N\tilde S + \tilde S N \right)
- \left( A'\tilde S + \tilde S A \right) + C'R^{-1}C - \tilde S Q \tilde S
- \frac{1}{\theta}\left( \tilde S\,\tilde b^*(\tilde z, u) + \tilde b^{*\prime}(\tilde z, u)\,\tilde S \right) \right].
\end{aligned} \tag{3.10}
\]
The derivative of the Lyapunov function \(\tilde\varepsilon'\tilde S\tilde\varepsilon\) is
\[
\begin{aligned}
\frac{d\,\tilde\varepsilon'\tilde S\tilde\varepsilon}{dt}
&= \theta\left[ -\frac{F(\theta, I)}{\theta^2}\, N\tilde\varepsilon + A\tilde\varepsilon - \tilde S^{-1}C'R^{-1}C\tilde\varepsilon
+ \frac{1}{\theta}\left( \tilde b(\tilde z, u) - \tilde b(\tilde x, u) \right) \right]' \tilde S\tilde\varepsilon \\
&\quad + \theta\,\tilde\varepsilon'\left[ \frac{F(\theta, I)}{\theta^2}\left( N\tilde S + \tilde S N \right)
- \left( A'\tilde S + \tilde S A \right) + C'R^{-1}C - \tilde S Q \tilde S
- \frac{1}{\theta}\left( \tilde S\,\tilde b^*(\tilde z, u) + \tilde b^{*\prime}(\tilde z, u)\,\tilde S \right) \right]\tilde\varepsilon \\
&\quad + \theta\,\tilde\varepsilon'\tilde S\left[ -\frac{F(\theta, I)}{\theta^2}\, N\tilde\varepsilon + A\tilde\varepsilon - \tilde S^{-1}C'R^{-1}C\tilde\varepsilon
+ \frac{1}{\theta}\left( \tilde b(\tilde z, u) - \tilde b(\tilde x, u) \right) \right] \\
&= \theta\left[ -\tilde\varepsilon'C'R^{-1}C\tilde\varepsilon - \tilde\varepsilon'\tilde S Q\tilde S\tilde\varepsilon
+ \frac{2}{\theta}\,\tilde\varepsilon'\tilde S\left( \tilde b(\tilde z, u) - \tilde b(\tilde x, u) - \tilde b^*(\tilde z, u)\tilde\varepsilon \right) \right].
\end{aligned} \tag{3.11}
\]
We consider that \(Q \ge q_m\, Id\), and since \(\tilde\varepsilon'C'R^{-1}C\tilde\varepsilon \ge 0\),
\[
\frac{d}{dt}\left( \tilde\varepsilon'\tilde S\tilde\varepsilon \right)
\le -\theta q_m\,\tilde\varepsilon'\tilde S^2\tilde\varepsilon
+ 2\,\tilde\varepsilon'\tilde S\left( \tilde b(\tilde z, u) - \tilde b(\tilde x, u) - \tilde b^*(\tilde z, u)\tilde\varepsilon \right). \tag{3.12}
\]
The theorem is proven using inequality (3.12), which requires some knowledge of the properties of \(S\). Note that equality (3.10) has a strong dependency on \(\theta\), which we must remove in order to derive properties for \(S\): indeed, for the moment we do not know which values \(\theta\) will reach during runtime. To determine these values, we introduce the time reparametrization \(d\tau = \theta(t)dt\), or equivalently \(\tau = \int_0^t \theta(\nu)d\nu\). We denote:
− \(\bar{x}(\tau) = \tilde{x}(t)\), \(\bar{z}(\tau) = \tilde{z}(t)\), and \(\bar\varepsilon(\tau) = \tilde\varepsilon(t)\), and
− \(\bar\theta(\tau) = \theta(t)\), \(\bar{u}(\tau) = u(t)\), and \(\bar{S}(\tau) = \tilde{S}(t)\).
We obtain the time derivative with respect to the time scale \(\tau\) using the simple calculation:
\[
\frac{d\bar\varepsilon}{d\tau} = \frac{d\tilde\varepsilon(t)}{dt}\frac{dt}{d\tau} = \frac{1}{\theta(t)}\frac{d\tilde\varepsilon(t)}{dt}.
\]
Therefore
\[
\frac{d\bar\varepsilon}{d\tau}
= -\frac{F(\bar\theta, \bar I)}{\bar\theta^2}\, N\bar\varepsilon + A\bar\varepsilon - \bar S^{-1}C'R^{-1}C\bar\varepsilon
+ \frac{1}{\bar\theta}\left( \tilde b(\bar z, \bar u) - \tilde b(\bar x, \bar u) \right)
\]
such that
\[
\frac{d\bar S}{d\tau}
= \frac{F(\bar\theta, \bar I)}{\bar\theta^2}\left( N\bar S + \bar S N \right)
- \left( A'\bar S + \bar S A \right) + C'R^{-1}C - \bar S Q \bar S
- \frac{1}{\bar\theta}\left( \bar S\,\tilde b^*(\bar z, \bar u) + \tilde b^{*\prime}(\bar z, \bar u)\,\bar S \right). \tag{3.13}
\]
We complete the description of the observer in the \(\tau\) time scale with the equation
\[
\frac{d\bar\theta}{d\tau} = \frac{F(\bar\theta, \bar I)}{\bar\theta}.
\]
This last set of equations is established in order to investigate the properties of the Riccati matrix, particularly the fact that it is bounded (cf. Section 3.6).
Remark 37
The Lipschitz constant of the vector field \(b(\cdot, u)\) is the same in the \(x(t)\), \(\tilde{x}(t)\) and \(\bar{x}(\tau)\) coordinates. This occurs quite naturally in the single-output case but requires a special definition of the observer in the multiple-output case. Consequently, this fact is proven in Chapter 5, Lemma 51, for systems having multiple outputs.
3.6 Boundedness of the Riccati Matrix<br />
The Riccati matrix S(t) has some very important properties:<br />
− \(S(t)\) can be upper and lower bounded by matrices of the form \(c \cdot Id\), and
− if \(S(0)\) is symmetric positive definite, then \(S(t)\) is also symmetric positive definite for all times.
Those properties are established in the book [57], with the differences being that the term \(\left| \frac{F(\theta, I)}{\theta^2} \right|\) does not appear in the Riccati equation there (cf. Section 2.4 of the 6th chapter of the book), and that the result is only valid for times \(t \ge T^*\), for some \(T^* > 0\). As we will see in the present section, when \(\theta(0)\) is set to 1 we can say a little bit more.
Lemma 38 ([57])
Let us consider the Riccati equation (3.13). If the functions \(a_i(u(t))\), \(\left| b^*_{i,j}(z, u) \right|\) and \(\left| \frac{F(\theta, I)}{\theta^2} \right|\) are smaller than \(a_M > 0\), and if \(a_i(u(t)) > a_m > 0\), then for all \(\tau_0 > 0\) there exist two constants \(0 < \tilde\alpha < \tilde\beta\) (depending on \(\tau_0\), \(a_M\), \(a_m\)) such that, for all \(\tau \ge \tau_0\), the solution of the Riccati equation satisfies the inequality
\[
\tilde\alpha\, Id \le \bar S(\tau) \le \tilde\beta\, Id.
\]
This first relation is extended to all times \(t > 0\) via a second lemma.
Lemma 39
We still consider the Riccati equation (3.13), with \(\bar S(0) = S_0\) a symmetric positive definite matrix taken in a compact subset of the form \(a\, Id \le S_0 \le b\, Id\), \(0 < a < b\), and \(\theta(0) = 1\). Then there exist two constants \(0 < \alpha < \beta\) such that \(\alpha\, Id \le \bar S(\tau) \le \beta\, Id\) for all \(\tau \ge 0\) (and therefore \(\alpha\, Id \le \tilde S(t) \le \beta\, Id\) for all \(t \ge 0\)).
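The boundedness claimed by this lemma can be observed numerically. The sketch below integrates a Riccati equation of the same form for a two-dimensional chain of integrators with a bounded lower-triangular perturbation standing in for \(b^*\) and a frozen \(\theta\) (so the \(F\)-term vanishes); all numerical constants are illustrative:

```python
import numpy as np

# dS/dt = -(A+B)'S - S(A+B) + C'R^{-1}C - SQS   (F-term dropped: theta frozen)
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.1, 0.0], [0.2, 0.1]])   # bounded lower-triangular perturbation
C = np.array([[1.0, 0.0]])
Q = np.eye(2); Rinv = np.eye(1)

S = np.eye(2)            # S0 symmetric positive definite
dt, T = 1e-3, 10.0
eig_min, eig_max = np.inf, 0.0
for _ in range(int(T / dt)):
    dS = -(A + B).T @ S - S @ (A + B) + C.T @ Rinv @ C - S @ Q @ S
    S = S + dt * dS
    S = 0.5 * (S + S.T)  # keep symmetry against round-off
    w = np.linalg.eigvalsh(S)
    eig_min, eig_max = min(eig_min, w[0]), max(eig_max, w[1])

print("smallest / largest eigenvalue seen:", eig_min, eig_max)
```

Along the whole trajectory the eigenvalues of \(S\) remain inside a fixed interval \([\alpha, \beta]\) bounded away from 0 and from infinity, which is the content of Lemmas 38 and 39.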
tel-00559107, version 1 - 24 Jan 2011<br />
3.6 Boundedness of the Riccati Matrix<br />
Proof.
We denote by \(|\cdot|\) the Frobenius norm of matrices: \(|A| = \sqrt{\mathrm{Trace}(A'A)}\). Recall that for two symmetric semi-positive matrices \(A\), \(B\) such that \(0 \le A \le B\), we have¹¹ \(0 \le |A| \le |B|\).
Choose \(\tau_0 > 0\) and apply Lemma 38 to obtain \(\tilde\alpha\) and \(\tilde\beta\) such that \(\tilde\alpha\, Id \le \bar S(\tau) \le \tilde\beta\, Id\) for all \(\tau \ge \tau_0\). In order to extend the inequality to \(0 \le \tau \le \tau_0\), we start from:
\[
\begin{aligned}
\bar S(\tau) &= S_0 + \int_0^\tau \frac{d\bar S(v)}{d\tau}\, dv \\
&= S_0 + \int_0^\tau \left[ -\left( A(u) + \frac{\tilde b^*(\bar z, \bar u)}{\bar\theta} - \frac{F(\theta, I)}{\theta^2}N \right)' \bar S
- \bar S\left( A(u) + \frac{\tilde b^*(\bar z, \bar u)}{\bar\theta} - \frac{F(\theta, I)}{\theta^2}N \right)
+ C'R^{-1}C - \bar S Q \bar S \right] dv.
\end{aligned}
\]
As \(\theta(0) = 1\), we have \(\bar S(0) = \tilde S(0) = S_0\), which together with \(\bar S Q \bar S \ge 0\) (symmetric semi-positive) leads to
\[
\bar S(\tau) \le S_0 + \int_0^\tau \left[ -\left( A(u) + \frac{\tilde b^*(\bar z, \bar u)}{\bar\theta} - \frac{F(\theta, I)}{\theta^2}N \right)' \bar S
- \bar S\left( A(u) + \frac{\tilde b^*(\bar z, \bar u)}{\bar\theta} - \frac{F(\theta, I)}{\theta^2}N \right)
+ C'R^{-1}C \right] dv,
\]
and
\[
\left| \bar S(\tau) \right| \le |S_0| + \int_0^\tau \left[ 2\left( A_M + B + \left| \frac{F(\theta, I)}{\theta^2} \right| |N| \right)\left| \bar S \right|
+ \left| C'R^{-1}C \right| \right] dv,
\]
with \(A_M = \sup_{[0, \tau_0]}\left( |A(u(\tau))| \right)\) and \(\left| \tilde b^*(\bar z, \bar u) \right| \le B\). Then
\[
\left| \bar S \right| \le |S_0| + \left| C'R^{-1}C \right|\tau_0 + \int_0^\tau 2s\left| \bar S \right| dv,
\]
with \(s = a_M|N| + A_M + B\). Applying Gronwall's lemma gives, for all \(0 \le \tau \le \tau_0\),
\[
\left| \bar S \right|
\le \left( |S_0| + \left| C'R^{-1}C \right|\tau_0 \right) e^{2s\tau}
\le \left( |S_0| + \left| C'R^{-1}C \right|\tau_0 \right) e^{2s\tau_0}
\le \left( b\sqrt{n} + \left| C'R^{-1}C \right|\tau_0 \right) e^{2s\tau_0} = \beta_1. \tag{3.14}
\]
In the same manner, we denote \(\bar P = \bar S^{-1}\) and use the equation
\[
\frac{d\bar P}{d\tau}
= \bar P\left( A(u) + \frac{\tilde b^*(\bar z, \bar u)}{\bar\theta} - \frac{F(\theta, I)}{\theta^2}N \right)'
+ \left( A(u) + \frac{\tilde b^*(\bar z, \bar u)}{\bar\theta} - \frac{F(\theta, I)}{\theta^2}N \right)\bar P
- \bar P C'R^{-1}C\bar P + Q
\]
11 See Appendix B.1 for details of establishing this fact.
to obtain, for all \(0 \le \tau \le \tau_0\), and with \(\tilde{s} > 0\):
\[
\left| \bar P \right| \le \left( |P_0| + |Q|\tau_0 \right) e^{2\tilde s\tau_0}
\le \left( \frac{1}{a}\sqrt{n} + |Q|\tau_0 \right) e^{2\tilde s\tau_0} = \frac{1}{\alpha_1}. \tag{3.15}
\]
From the inequalities (3.14) and (3.15) we deduce¹² that for all \(0 \le \tau \le \tau_0\)
\[
\alpha_1\, Id \le \bar S(\tau) \le \beta_1\, Id.
\]
We now define
\[
\alpha = \min\left( a, \alpha_1, \tilde\alpha \right)
\quad \text{and} \quad
\beta = \max\left( b, \beta_1, \tilde\beta \right)
\]
such that for all \(\tau \ge 0\)
\[
\alpha\, Id \le \bar S(\tau) \le \beta\, Id.
\]
This relation is therefore also true in the \(t\) time scale. □
3.7 Technical Lemmas
Three lemmas are proposed in this section. The first two are purely technical and are used in the very last section of the present chapter; they come from [38], and their respective proofs are reproduced in Appendix B.2.
The third lemma concerns the adaptation function. It basically shows that the set of candidate adaptation functions for our adaptive high-gain observer is not empty. The proof is constructive: we exhibit such a function.
Lemma 40 ([38])
Let \(\{x(t) > 0,\ t \ge 0\} \subset \mathbb{R}\) be absolutely continuous and satisfy
\[
\frac{dx(t)}{dt} \le -k_1 x + k_2 x\sqrt{x}
\]
for almost all \(t > 0\), with \(k_1, k_2 > 0\). Then, if \(x(0) < \frac{k_1^2}{4k_2^2}\), we have
\[
x(t) \le 4\, x(0)\, e^{-k_1 t}.
\]
12 Trivially, we have \(\|S\| \le \beta_1 \Rightarrow S \le \beta_1\, Id\). However, \(\alpha_1 \le \|S\|\) does not imply \(\alpha_1\, Id \le S\). This is the reason why we need the relation \(\|P\| \le \frac{1}{\alpha_1}\), which gives \(P \le \frac{1}{\alpha_1}\, Id\); we then use the following matrix property (see Appendix B.1): \((P \ge Q > 0) \Rightarrow \left( Q^{-1} \ge P^{-1} > 0 \right)\), in order to end up with \(\alpha_1\, Id \le S\).
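A quick numerical check of this comparison lemma, integrating the worst case (the differential inequality taken as an equality) with explicit Euler; the values of \(k_1\), \(k_2\) and the step size are arbitrary:

```python
import numpy as np

k1, k2 = 2.0, 1.0
x0 = 0.9 * k1**2 / (4 * k2**2)   # just inside the basin x(0) < k1^2 / (4 k2^2)
dt, T = 1e-4, 5.0

x, t, ok = x0, 0.0, True
while t < T:
    x += dt * (-k1 * x + k2 * x**1.5)   # worst case: equality in the lemma
    t += dt
    ok &= x <= 4 * x0 * np.exp(-k1 * t)  # claimed envelope 4 x(0) e^{-k1 t}
print(ok)
```

Starting just below the threshold, the trajectory stays under the exponential envelope for all simulated times; starting above the threshold, the cubic term can dominate and the bound no longer applies, which is why the smallness condition on \(x(0)\) reappears in the proof of the main theorem as a condition on \(\gamma\).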
Lemma 41 ([38])
Consider \(\tilde b(\tilde z) - \tilde b(\tilde x) - \tilde b^*(\tilde z)\tilde\varepsilon\) as in inequality (3.12) (omitting to write \(u\) in \(\tilde b\)), and suppose \(\theta \ge 1\). Then
\[
\left\| \tilde b(\tilde z) - \tilde b(\tilde x) - \tilde b^*(\tilde z)\tilde\varepsilon \right\| \le K\theta^{n-1}\left\| \tilde\varepsilon \right\|^2,
\]
for some \(K > 0\).
Lemma 42 (adaptation function)
For any \(\Delta T > 0\), there exists a positive constant \(M(\Delta T)\) such that:
− for any \(\theta_1 > 1\), and
− any \(\gamma_1 > \gamma_0 > 0\),
there is a function \(F(\theta, I)\) such that the equation
\[
\dot\theta = F(\theta, I(t)), \tag{3.16}
\]
for any initial value \(1 \le \theta(0) < 2\theta_1\) and any measurable positive function \(I(t)\), has the properties:
1. there is a unique solution \(\theta(t)\) defined for all \(t \ge 0\), and this solution satisfies \(1 \le \theta(t) < 2\theta_1\),
2. \(\left| \frac{F(\theta, I)}{\theta^2} \right| \le M\),
3. if \(I(t) \ge \gamma_1\) for \(t \in [\tau, \tau + \Delta T]\), then \(\theta(\tau + \Delta T) \ge \theta_1\),
4. while \(I(t) \le \gamma_0\), \(\theta(t)\) decreases to 1.
Remark 43
The main property is that if \(I(t) \ge \gamma_1\), \(\theta(t)\) can reach any arbitrarily large \(\theta_1\) in an arbitrarily small time \(\Delta T\), and that this can be achieved by a function satisfying \(F(\theta, I) \le M\theta^2\), with \(M\) independent of \(\theta_1\) (but dependent on \(\Delta T\)).
Proof.
Let \(F_0(\theta)\) be defined as follows:
\[
F_0(\theta) = \begin{cases}
\dfrac{1}{\Delta T}\,\theta^2 & \text{if } \theta \le \theta_1 \\[2mm]
\dfrac{1}{\Delta T}\left( \theta - 2\theta_1 \right)^2 & \text{if } \theta > \theta_1
\end{cases}
\]
(the choice \(2\theta_1\) is more or less arbitrary), and let us consider the system
\[
\begin{cases} \dot\theta = F_0(\theta) \\ \theta(0) = 1. \end{cases}
\]
Simple computations give the solution:
\[
\theta(t) = \begin{cases}
\dfrac{\Delta T}{\Delta T - t} & \text{while } \theta \le \theta_1 \\[2mm]
2\theta_1 - \dfrac{\theta_1\Delta T}{\theta_1 t + (2 - \theta_1)\Delta T} & \text{when } \theta > \theta_1.
\end{cases}
\]
Therefore \(\theta(t)\) reaches \(\theta_1\) at time \(t = \Delta T\left( 1 - \frac{1}{\theta_1} \right) < \Delta T\), and \(\theta(t) < 2\theta_1\) for all \(t \ge 0\).¹³ We now set
\[
F(\theta, I) = \mu(I)\, F_0(\theta) + \lambda\left( 1 - \mu(I) \right)(1 - \theta),
\]
with \(\lambda > 0\). The function \(\mu\) is such that¹⁴:
\[
\mu(I) \begin{cases}
= 1 & \text{if } I \ge \gamma_1 \\
\in [0, 1] & \text{if } \gamma_0 \le I \le \gamma_1 \\
= 0 & \text{if } I \le \gamma_0.
\end{cases}
\]
We claim that all the properties are satisfied.
If \(I \ge \gamma_1\), \(F(\theta, I) = F_0(\theta)\), ensuring Property 3 (refer to the beginning of the proof). Conversely, if \(I \le \gamma_0\), \(F(\theta, I) = \lambda(1 - \theta)\), so Property 4 is fulfilled. Moreover, because \(F(\theta, I)\) is Lipschitz, Property 1 is verified. Let us check Property 2:
\[
\left| \frac{F(\theta, I)}{\theta^2} \right| \le \left| \frac{F_0(\theta)}{\theta^2} \right| + \left| \frac{\lambda(1 - \theta)}{\theta^2} \right|. \tag{3.17}
\]
The first term satisfies:
− if \(\theta \le \theta_1\), \(\left| \frac{F_0(\theta)}{\theta^2} \right| = \frac{1}{\Delta T}\), and
− \(\left| \frac{F_0(\theta)}{\theta^2} \right| = \frac{1}{\Delta T}\left( \frac{\theta - 2\theta_1}{\theta} \right)^2 \le \frac{1}{\Delta T}\) if \(\theta \ge \theta_1\) (and \(\theta < 2\theta_1\)).
The second term satisfies:
\[
\left| \frac{\lambda(1 - \theta)}{\theta^2} \right|
= \lambda\,\frac{\theta - 1}{\theta^2}
= \lambda\left( \frac{1}{4} - \frac{\frac{\theta^2}{4} - \theta + 1}{\theta^2} \right)
= \lambda\left( \frac{1}{4} - \left( \frac{\frac{\theta}{2} - 1}{\theta} \right)^2 \right)
\le \frac{\lambda}{4}.
\]
Property 2 is satisfied because of (3.17), with \(M = \frac{1}{\Delta T} + \frac{\lambda}{4}\). □
13 When \(1 < \theta_0 \le \theta_1\), the solution of \(\dot\theta = F_0(\theta)\), \(\theta(0) = \theta_0\), is:
\[
\theta(t) = \begin{cases}
\dfrac{\Delta T\theta_0}{\Delta T - \theta_0 t} & \text{when } \theta(t) \le \theta_1 \\[2mm]
2\theta_1 - \dfrac{\Delta T\theta_0\theta_1}{\theta_0\theta_1 t + (2\theta_0 - \theta_1)\Delta T} & \text{when } \theta(t) > \theta_1.
\end{cases}
\]
And for \(\theta_1 < \theta_0 < 2\theta_1\) the solution is:
\[
\theta(t) = 2\theta_1 - \frac{\Delta T(2\theta_1 - \theta_0)}{\Delta T + t(2\theta_1 - \theta_0)}.
\]
The conclusion remains the same: if \(\theta_0 < \theta_1\), \(\theta_1\) is reached in a time smaller than \(\Delta T\), and \(\theta\) remains below \(2\theta_1\) in all cases.
14 Such a function is explicitly defined in Section 4.2.1 of Chapter 4.
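The constructive proof above translates directly into code. A sketch with an illustrative piecewise-linear choice for \(\mu\) on \([\gamma_0, \gamma_1]\) (any continuous interpolation works) and the piecewise \(F_0\) of the proof:

```python
import numpy as np

def make_adaptation(theta1, gamma0, gamma1, dT, lam=1.0):
    """F(theta, I) = mu(I) * F0(theta) + lam * (1 - mu(I)) * (1 - theta).

    F0 drives theta up toward (but below) 2*theta1 within dT; the lam-term
    relaxes theta back to 1 when the innovation is small.
    """
    def F0(theta):
        if theta <= theta1:
            return theta**2 / dT
        return (theta - 2 * theta1)**2 / dT

    def mu(I):  # illustrative linear interpolation between gamma0 and gamma1
        return float(np.clip((I - gamma0) / (gamma1 - gamma0), 0.0, 1.0))

    def F(theta, I):
        m = mu(I)
        return m * F0(theta) + lam * (1 - m) * (1 - theta)
    return F

# sustained large innovation: theta climbs from 1 past theta1 within dT
F = make_adaptation(theta1=10.0, gamma0=0.1, gamma1=1.0, dT=0.5)
theta, dt = 1.0, 1e-4
for _ in range(int(0.5 / dt)):   # integrate theta' = F(theta, I) with I >= gamma1
    theta += dt * F(theta, 5.0)
print("theta after dT:", theta)
```

With a sustained innovation above \(\gamma_1\), \(\theta\) passes \(\theta_1\) before \(\Delta T\) elapses and saturates below \(2\theta_1\) (Properties 1 and 3), while \(F(\theta, I) = \lambda(1 - \theta) < 0\) whenever \(I \le \gamma_0\) and \(\theta > 1\) (Property 4).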
3.8 Proof of the Theorem<br />
First of all, let us choose a time horizon \(d\) (in \(I_d(t)\)) and a time \(T\) such that \(0 < d < T < T^*\). Set \(\Delta T = T - d\). Let \(\lambda\) be a strictly positive number and \(M = \frac{1}{\Delta T} + \frac{\lambda}{4}\) as in Lemma 42. Let \(\alpha\) and \(\beta\) be the bounds from Lemma 39.
From the preparation for the proof, inequality (3.12) can be written, using Lemma 39 (i.e. using \(\tilde S \ge \alpha\, Id\)), and omitting the control variable \(u\):
\[
\frac{d\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t)}{dt}
\le -\alpha q_m\theta\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t)
+ 2\,\tilde\varepsilon'\tilde S\left( \tilde b(\tilde z) - \tilde b(\tilde x) - \tilde b^*(\tilde z)\tilde\varepsilon \right). \tag{3.18}
\]
From (3.18) we can deduce two inequalities: the first one, local, will be used when \(\tilde\varepsilon'\tilde S\tilde\varepsilon(t)\) is small, whatever the value of \(\theta\); the second one, global, will be used mainly when \(\tilde\varepsilon'\tilde S\tilde\varepsilon(t)\) is not in a neighborhood of 0 and \(\theta\) is large.
Using
\[
\left\| \tilde b(\tilde z) - \tilde b(\tilde x) - \tilde b^*(\tilde z)\tilde\varepsilon \right\| \le 2L_b\left\| \tilde\varepsilon \right\|,
\]
together with \(\alpha\, Id \le \tilde S \le \beta\, Id\) (Lemma 39), (3.18) becomes the “global inequality”
\[
\frac{d\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t)}{dt}
\le \left( -\alpha q_m\theta + 4\frac{\beta}{\alpha}L_b \right)\tilde\varepsilon'\tilde S\tilde\varepsilon(t). \tag{3.19}
\]
Because of Lemma 41, we obtain the “local inequality” as follows. Since
\[
\left\| \tilde b(\tilde z) - \tilde b(\tilde x) - \tilde b^*(\tilde z)\tilde\varepsilon \right\| \le K\theta^{n-1}\left\| \tilde\varepsilon \right\|^2
\]
and \(1 \le \theta \le 2\theta_1\), inequality (3.18) implies
\[
\frac{d\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t)}{dt}
\le -\alpha q_m\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t) + 2K(2\theta_1)^{n-1}\left| \tilde S \right|\left\| \tilde\varepsilon \right\|^3.
\]
Since \(\left\| \tilde\varepsilon \right\|^3 = \left( \left\| \tilde\varepsilon \right\|^2 \right)^{\frac{3}{2}} \le \left( \frac{1}{\alpha}\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t) \right)^{\frac{3}{2}}\), the inequality becomes
\[
\frac{d\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t)}{dt}
\le -\alpha q_m\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t)
+ \frac{2K(2\theta_1)^{n-1}\beta}{\alpha^{\frac{3}{2}}}\left( \tilde\varepsilon'\tilde S\tilde\varepsilon(t) \right)^{\frac{3}{2}}. \tag{3.20}
\]
Let us apply¹⁵ Lemma 40, which states that if
\[
\tilde\varepsilon'\tilde S\tilde\varepsilon(\tau) \le \frac{\alpha^5 q_m^2}{16K^2\beta^2(2\theta_1)^{2n-2}},
\]
then, for any \(t \ge \tau\),
\[
\tilde\varepsilon'\tilde S\tilde\varepsilon(t) \le 4\,\tilde\varepsilon'\tilde S\tilde\varepsilon(\tau)\, e^{-\alpha q_m(t - \tau)}.
\]
15 This lemma cannot be applied if we use \(Q_\theta\) and \(R\) instead of \(Q_\theta\) and \(R_\theta\) in the definition of the observer, as is done in [38]. This is due to the presence of a \(\frac{F}{\theta}\) term that prevents the parameters \(k_1\) and \(k_2\) from being positive for all times.
Consequently, provided there is a real \(\gamma\) such that
\[
\gamma \le \frac{1}{(2\theta_1)^{2n-2}}\min\left( \frac{\alpha\varepsilon^*}{4},\ \frac{\alpha^5 q_m^2}{16K^2\beta^2} \right), \tag{3.21}
\]
then \(\tilde\varepsilon'\tilde S\tilde\varepsilon(\tau) \le \gamma\) implies, for any \(t \ge \tau\),
\[
\tilde\varepsilon'\tilde S\tilde\varepsilon(t) \le \frac{\alpha\varepsilon^*}{(2\theta_1)^{2n-2}}\, e^{-\alpha q_m(t - \tau)}. \tag{3.22}
\]
Note that, excluding the change of variables, we have arrived at the end result.
From (3.19):
\[
\tilde\varepsilon'\tilde S\tilde\varepsilon(T) \le \tilde\varepsilon'\tilde S\tilde\varepsilon(0)\,
e^{\left( -\alpha q_m + 4\frac{\beta}{\alpha}L_b \right)T},
\]
and if we suppose \(\theta \ge \theta_1\) for \(t \in [T, T^*]\), \(T^* > T\), using (3.19) again:
\[
\tilde\varepsilon'\tilde S\tilde\varepsilon(T^*)
\le \tilde\varepsilon'\tilde S\tilde\varepsilon(0)\,
e^{\left( -\alpha q_m + 4\frac{\beta}{\alpha}L_b \right)T}
e^{\left( -\alpha q_m\theta_1 + 4\frac{\beta}{\alpha}L_b \right)(T^* - T)}
\le M_0\, e^{-\alpha q_m T} e^{4\frac{\beta}{\alpha}L_b T^*} e^{-\alpha q_m\theta_1(T^* - T)},
\]
where
\[
M_0 = \sup_{x, z \in \chi}\varepsilon' S\varepsilon\,(0). \tag{3.23}
\]
Now, we choose \(\theta_1\) and \(\gamma\) for
\[
M_0\, e^{-\alpha q_m T} e^{4\frac{\beta}{\alpha}L_b T^*} e^{-\alpha q_m\theta_1(T^* - T)} \le \gamma \tag{3.24}
\]
and (3.21) to be satisfied simultaneously, which is possible since \(e^{-cst\cdot\theta_1} < \frac{cst}{\theta_1^{2n-2}}\) for \(\theta_1\) large enough. Let us choose a function \(F\) as in Lemma 42, with \(\Delta T = T - d\) and \(\gamma_1 = \frac{\lambda_d^0\gamma}{\beta}\).
We claim that there exists \(\tau \le T^*\) such that \(\tilde\varepsilon'\tilde S\tilde\varepsilon(\tau) \le \gamma\). Indeed, if \(\tilde\varepsilon'\tilde S\tilde\varepsilon(\tau) > \gamma\) for all \(\tau \le T^*\), then because of Lemma 33:
\[
\gamma < \tilde\varepsilon'\tilde S\tilde\varepsilon(\tau)
\le \beta\left\| \tilde\varepsilon(\tau) \right\|^2
\le \beta\left\| \varepsilon(\tau) \right\|^2
\le \frac{\beta}{\lambda_d^0}\, I_d(\tau + d).
\]
Therefore, \(I_d(\tau + d) \ge \gamma_1\) for \(\tau \in [0, T^*]\), and hence \(I_d(\tau) \ge \gamma_1\) for \(\tau \in [d, T^*]\). Thus, we have \(\theta(t) \ge \theta_1\) for \(t \in [T, T^*]\), which provides a contradiction (i.e. \(\tilde\varepsilon'\tilde S\tilde\varepsilon(T^*) \le \gamma\)) thanks to (3.19) and (3.24).
Finally, for \(t \ge \tau\), using (3.22):
\[
\left\| \varepsilon(t) \right\|^2
\le (2\theta_1)^{2n-2}\left\| \tilde\varepsilon(t) \right\|^2
\le \frac{(2\theta_1)^{2n-2}}{\alpha}\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t)
\le \varepsilon^* e^{-\alpha q_m(t - \tau)}, \tag{3.25}
\]
which proves the theorem. □
Remark 44 (alternative result)
We propose a modification of the end of the proof that allows \(\|\varepsilon(0)\|^2\) to appear in the final inequality.
Consider equation (3.21) and replace it with:
\[
\gamma \le \frac{1}{(2\theta_1)^{2n-2}}\min\left( \frac{\alpha\varepsilon^*}{4\beta},\ \frac{\alpha^5 q_m^2}{16K^2\beta^2} \right). \tag{3.26}
\]
Then equation (3.22) becomes:
\[
\tilde\varepsilon'\tilde S\tilde\varepsilon(t) \le \frac{\alpha\varepsilon^*}{\beta(2\theta_1)^{2n-2}}\, e^{-\alpha q_m(t - \tau)}. \tag{3.27}
\]
Now replace the definition of \(M_0\) given in equation (3.23) by \(M_0 = \max\left( \sup_{x, z \in \chi}\varepsilon' S\varepsilon\,(0),\ 1 \right)\).
Finally, consider the very last inequality (3.25). It can also be developed as follows (with \(M_0 \ge 1\)):
\[
\begin{aligned}
\left\| \varepsilon(t) \right\|^2
&\le (2\theta_1)^{2n-2}\left\| \tilde\varepsilon(t) \right\|^2
\le \frac{(2\theta_1)^{2n-2}}{\alpha}\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t) \\
&\le 4\,\frac{(2\theta_1)^{2n-2}}{\alpha}\,\tilde\varepsilon'\tilde S\tilde\varepsilon(\tau)\, e^{-\alpha q_m(t - \tau)} \\
&\le 4\,\frac{(2\theta_1)^{2n-2}}{\alpha}\left[ \tilde\varepsilon'\tilde S\tilde\varepsilon(0)\,
e^{\left( -\alpha q_m + 4\frac{\beta}{\alpha}L_b \right)T}
e^{\left( -\alpha q_m\theta_1 + 4\frac{\beta}{\alpha}L_b \right)(T^* - T)} \right] e^{-\alpha q_m(t - \tau)} \\
&\le 4\,\frac{(2\theta_1)^{2n-2}}{\alpha}\,\beta\left\| \tilde\varepsilon(0) \right\|^2\left[
e^{\left( -\alpha q_m + 4\frac{\beta}{\alpha}L_b \right)T}
e^{\left( -\alpha q_m\theta_1 + 4\frac{\beta}{\alpha}L_b \right)(T^* - T)} \right] e^{-\alpha q_m(t - \tau)} \\
&\le 4\,\frac{(2\theta_1)^{2n-2}}{\alpha}\,\beta\left\| \tilde\varepsilon(0) \right\|^2\left[ M_0\,
e^{\left( -\alpha q_m + 4\frac{\beta}{\alpha}L_b \right)T}
e^{\left( -\alpha q_m\theta_1 + 4\frac{\beta}{\alpha}L_b \right)(T^* - T)} \right] e^{-\alpha q_m(t - \tau)}.
\end{aligned}
\]
From (3.24) we know that the bracketed expression is smaller than \(\gamma\). Equations (3.26) and (3.27), together with \(\theta(0) = 1\) (so that \(\tilde\varepsilon(0) = \varepsilon(0)\)), lead to:
\[
\left\| \varepsilon(t) \right\|^2
\le 4\,\frac{(2\theta_1)^{2n-2}}{\alpha}\,\beta\left\| \tilde\varepsilon(0) \right\|^2
\frac{\alpha\varepsilon^*}{4\beta(2\theta_1)^{2n-2}}\, e^{-\alpha q_m(t - \tau)}
\le \left\| \varepsilon_0 \right\|^2\varepsilon^* e^{-\alpha q_m(t - \tau)}.
\]
Since \(\tau \le T^*\), \(e^{-\alpha q_m(\tau - T^*)} \ge 1\), and we can obtain the inequality
\[
\left\| \varepsilon(t) \right\|^2 \le \left\| \varepsilon_0 \right\|^2\varepsilon^* e^{-\alpha q_m(t - T^*)}.
\]
3.9 Conclusion
In this chapter, the adaptive high-gain extended Kalman filter was introduced for a multiple-input, single-output system, and the exponential convergence of the algorithm was proven. The proof was decomposed into a series of significant lemmas:
1. the innovation at time \(t\) places an upper bound on the estimation error at time \(t - d\),
2. the Riccati matrix \(S\) is bounded from above and below, for all times \(t\), independently of \(\theta\),
3. the set of candidate adaptation functions is non-empty.
In the next chapter, the description of the observer is completed with an adaptation function and the analysis of an example. This study is done both in simulation and on a real process. We also propose guidelines for the tuning of the several parameters of the observer.
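Taken together, these ingredients suggest the shape of the full algorithm. As a bridge to the implementation chapter, the sketch below runs a plain (non-adaptive) high-gain EKF with frozen \(\theta\) on a toy pendulum in observability normal form, to show how the estimate and Riccati updates interlock; the adaptive layer of Chapter 4, which moves \(\theta\) with the innovation, is deliberately omitted, and all tuning constants are illustrative:

```python
import numpy as np

# Plant in observability normal form (n = 2): x1' = x2, x2' = -sin(x1), y = x1.
f = lambda x: np.array([x[1], -np.sin(x[0])])
C = np.array([[1.0, 0.0]])

def hg_ekf_run(x0, z0, theta, T=10.0, dt=1e-3):
    """High-gain EKF with frozen gain theta (no innovation-driven adaptation)."""
    Q = np.diag([1.0, theta**2])   # theta-weighted Q (illustrative choice)
    x, z, S = np.array(x0), np.array(z0), np.eye(2)
    for _ in range(int(T / dt)):
        y = C @ x                                           # measurement
        A = np.array([[0.0, 1.0], [-np.cos(z[0]), 0.0]])    # Jacobian at the estimate
        K = np.linalg.solve(S, C.T)                         # gain S^{-1} C' R^{-1}, R = 1
        z = z + dt * (f(z) - (K @ (C @ z - y)).ravel())     # estimate update
        S = S + dt * (-(A.T @ S + S @ A) + C.T @ C - S @ Q @ S)  # Riccati update
        S = 0.5 * (S + S.T)                                 # keep symmetry
        x = x + dt * f(x)                                   # plant step
    return float(np.linalg.norm(z - x))

err = hg_ekf_run(x0=[1.0, 0.0], z0=[0.0, 1.0], theta=5.0)
print("final estimation error:", err)
```

In the adaptive version, the loop would additionally maintain the windowed innovation \(I_d\) and integrate \(\dot\theta = F(\theta, I_d)\), so that a large \(\theta\) (fast but noise-sensitive) is only used while the innovation flags a large estimation error.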
Chapter 4<br />
Illustrative Example <strong>and</strong> Hard<br />
real-time Implementation<br />
Contents<br />
4.1 Modeling of the Series-connected DC Machine <strong>and</strong> Observability<br />
Normal Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57<br />
4.1.1 Mathematical Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 57<br />
4.1.2 Observability Canonical Form . . . . . . . . . . . . . . . . . . . . . 59<br />
4.2 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60<br />
4.2.1 Full Observer Definition . . . . . . . . . . . . . . . . . . . . . . . . 60<br />
4.2.2 Implementation Considerations . . . . . . . . . . . . . . . . . . . . 61<br />
4.2.3 Simulation Parameters <strong>and</strong> Observer Tuning . . . . . . . . . . . . . 62<br />
4.2.4 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70<br />
4.3 real-time Implementation . . . . . . . . . . . . . . . . . . . . . . . 76<br />
4.3.1 Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76<br />
4.3.2 Hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79<br />
4.3.3 Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80<br />
4.3.4 Implementation Issues . . . . . . . . . . . . . . . . . . . . . . . . . . 84<br />
4.3.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . 87<br />
4.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89<br />
In this chapter, we focus on the implementation of the adaptive high-gain extended Kalman filter that was introduced in Chapter 3. We provide a full definition of the observer, in which an adaptation function is explicitly given, and a methodology is advanced for tuning the several parameters. Our goal is to demonstrate that the adaptive high-gain extended Kalman filter can be used in practice even in the case of a relatively fast process, e.g. 100 Hz.
The process we consider is a series-connected DC motor, modeled via a nonlinear SISO¹ system when current and voltage are the only observables. This process has been used in previous studies, allowing us to compare our results with those from earlier works (see [87, 93]). Moreover, the machine itself is readily available, so experiments can be considered quite realistic. Although the process is quite simple, the implementation of the observer in a real-time environment raises interesting questions that a simulation does not.
The modeling of the process and the observability study are investigated in Section 4.1. The implementation of the process in a simulation is the subject of Section 4.2, where the methodology for tuning the parameters is also developed. Finally, a set of real experiments performed on an actual machine under a hard real-time operating system is detailed in Section 4.3.
4.1 Modeling of the Series-connected DC Machine and Observability Normal Form
Basically, an electric motor converts electrical energy into mechanical energy. In a DC motor, the stator (also denoted the field) is composed of an electromagnet, or a permanent magnet, that immerses the rotor in a magnetic field. The rotor (also denoted the armature) is made of an electromagnet that, once supplied with current, creates a second magnetic field. The stator is kept fixed while the rotor is allowed to move, i.e., rotate. The attraction/repulsion behavior of magnets generates the rotating motion.
In order to make the rotating motion permanent, one of the two magnetic fields has to be switched at appropriate moments. The magnetic field created by the stator remains fixed. The rotor windings are connected to a commutator that switches the direction of the current flowing through the armature coils during the rotation, thereby reversing the polarity of the armature magnetic field. Successive commutations then maintain the rotating motion of the machine.
A DC motor whose field circuit and armature circuit are connected in series, and therefore fed by the same power supply, is referred to as a series-connected DC motor [77].
4.1.1 Mathematical Model
The model of the series-connected DC motor is obtained from the equivalent circuit representation shown in Figure 4.1. We denote by If the current flowing through the field part of the circuit (between points A and C), and by Ia the current flowing through the armature circuit (between points C and B). When the shaft of the motor is turned by an external force, the motor acts as a generator and produces an electromotive force. In the case of the DC motor, this force acts against the current applied to the circuit and is then denoted the back or counter electromotive force (BEMF or CEMF). The electrical balance leads to

    Lf dIf/dt + Rf If = VAC

for the field circuit, and to

    La dIa/dt + Ra Ia = VCB − E

for the armature circuit. The notations are:
− Lf and Rf for the inductance and the resistance of the field circuit,
− La and Ra for the inductance and the resistance of the armature circuit,
− E for the back EMF.
Kirchhoff's laws give us the relations

    I = Ia = If,    V = VAC + VCB.

The total electrical balance is

    L dI/dt + R I = V − E,

where L = Lf + La and R = Rf + Ra. The field flux is denoted by Φ. We have Φ = f(If) = f(I), and E = Km Φ ωr, where Km is a constant and ωr is the rotational speed of the shaft.
The second equation of the model is given by the mechanical balance of the shaft of the motor, using Newton's second law of motion. We consider that the only torques applied to the shaft are the electromechanical torque Te, the viscous friction torque and the load torque Tl, leading to

    J dωr/dt = Te − B ωr − Tl,

where J denotes the rotor inertia and B the viscous friction coefficient. The electromechanical torque is given by Te = Ke Φ I, with Ke denoting a constant parameter. We consider that the motor is operated below saturation [93]. In this case, the field flux can be expressed by the linear relation Φ = Laf I, where Laf denotes the mutual inductance between the field and the rotating armature coils. To conclude the modeling of the DC motor, we impose the ideal hypothesis of 100% efficiency of conservation of energy, which is expressed as K = Km = Ke. For simplicity of notation, we write Laf instead of K Laf.

Figure 4.1: Series-connected DC motor: equivalent circuit representation (field part Rf, Lf between points A and C; armature part Ra, La between points C and B).

The voltage is the input of the system, u(t), and the current, I(t), is the measured output. The resulting SISO model for the series-connected DC motor is:

    L dI/dt = u − R I − Laf ωr I
    J dωr/dt = Laf I² − B ωr − Tl          (4.1)
    y = I
This model is used to simulate the DC motor by means of a Matlab/Simulink S-function.<br />
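As an alternative to a Simulink S-function, model (4.1) can be integrated directly with a fixed-step Runge-Kutta scheme. The sketch below is illustrative only: the parameter values (L, R, Laf, J, B) are placeholders, not the values identified for the actual machine.

```python
import numpy as np

# Illustrative parameters (placeholders, not the thesis values)
L, R, Laf, J, B = 0.1, 1.0, 0.05, 0.05, 0.02

def f(state, u, Tl):
    """Right-hand side of model (4.1): state = (I, omega_r)."""
    I, wr = state
    dI = (u - R * I - Laf * wr * I) / L
    dwr = (Laf * I**2 - B * wr - Tl) / J
    return np.array([dI, dwr])

def rk4_step(state, u, Tl, dt):
    """One fixed-step Runge-Kutta 4 integration step."""
    k1 = f(state, u, Tl)
    k2 = f(state + 0.5 * dt * k1, u, Tl)
    k3 = f(state + 0.5 * dt * k2, u, Tl)
    k4 = f(state + dt * k3, u, Tl)
    return state + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

# Simulate with constant input and load torque; y = I is the measured output
state = np.array([1.0, 0.0])
dt, Tl, u = 1e-3, 0.55, 120.0
for _ in range(20000):          # 20 time units, enough to reach steady state
    state = rk4_step(state, u, Tl, dt)
y = state[0]                    # measured output: the current I
```

With these placeholder values the simulation settles on the equilibrium defined by u = R I + Laf ωr I and Laf I² = B ωr + Tl.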
4.1.2 Observability Canonical Form
Before implementing the observer to reconstruct the state vector of this system, we check the system's observability. We use the differentiation approach, i.e. we check differential observability (which implies observability):
− if I(t) is known over time, then dI/dt = (u − R I − Laf ωr I)/L is known, and as long as u(t), R, Laf and L are known, ωr can be computed,
− because ωr(t) is then known, dωr/dt = (Laf I² − B ωr − Tl)/J can also be computed. From the knowledge of I(t), Laf, B and J, Tl can then be estimated.
We conclude that a third variable can be added to the state vector in order to reconstruct the load torque applied to the shaft of the motor along with the state of the system. We assume that the load torque is constant over time; sudden changes of the load torque are then interpreted as non-modeled perturbations. The estimation of the load torque is made possible by including the constraint dTl/dt = 0 in equation (4.1). We now need to find the coordinate transformation that puts this system into the observability canonical form.
From the equation y = I, we choose x1 = I, and then

    dx1/dt = (1/L) (u(t) − R I − Laf I ωr),

which, by setting x2 = I ωr, becomes

    dx1/dt = −(Laf/L) x2 + (1/L)(u(t) − R x1) = α2(u) x2 + b1(x1, u).          (4.2)

We now compute the time derivative of x2:

    dx2/dt = (dI/dt) ωr + I (dωr/dt)
           = −(1/J) Tl I − (B/J) I ωr + (Laf/J) I³ − (Laf/L) ωr² I + (u(t)/L) ωr − (R/L) ωr I,

provided that I > 0 (i.e. x1 > 0). This constraint represents a reasonable assumption since, when I, the current of the circuit, equals zero, there is no power being supplied to the engine and therefore there is nothing to observe. We have ωr = x2/x1, and we set x3 = Tl I. The above equation then becomes

    dx2/dt = −(1/J) x3 − (B/J) x2 + (Laf/J) x1³ − (Laf/L) x2²/x1 + (u(t)/L) x2/x1 − (R/L) x2
           = α3(u) x3 + b2(x1, x2, u),          (4.3)
again provided that I > 0 (i.e. x1 > 0). This leads us to the expression Tl = x3/x1. Recalling that dTl/dt = 0, we obtain

    dx3/dt = −(Laf/L) (x2 x3)/x1 + (u(t)/L) x3/x1 − (R/L) x3 = b3(x1, x2, x3, u).          (4.4)

Thus the application

    ℝ*₊ × ℝ × ℝ → ℝ*₊ × ℝ × ℝ
    (I, ωr, Tl) ↦ (I, I ωr, I Tl)

is a change of coordinates that puts the system (4.1) into the observability canonical form² defined by equations (4.2), (4.3), (4.4). The inverse application is

    (x1, x2, x3) ↦ (x1, x2/x1, x3/x1).

Computations of the coefficients of the matrix b* that appears in the Riccati equation of the observer (cf. next section) are left to the reader.
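The change of coordinates and its inverse are simple enough to check numerically. A minimal sketch:

```python
import numpy as np

def phi(I, wr, Tl):
    """Change of coordinates (I, wr, Tl) -> (x1, x2, x3) = (I, I*wr, I*Tl),
    defined for I > 0."""
    return np.array([I, I * wr, I * Tl])

def phi_inv(x):
    """Inverse application (x1, x2, x3) -> (x1, x2/x1, x3/x1)."""
    x1, x2, x3 = x
    return np.array([x1, x2 / x1, x3 / x1])

# Round trip on an arbitrary point with I > 0
p = (2.0, 150.0, 0.55)
assert np.allclose(phi_inv(phi(*p)), p)
```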
4.2 Simulation

4.2.1 Full Observer Definition
We now recall the equations of the adaptive high-gain extended Kalman filter. As we want to minimize the computational time required to invert the matrix S, let us define P = S⁻¹. The identity dP/dt = d(S⁻¹)/dt = −S⁻¹ (dS/dt) S⁻¹ allows us to rewrite the observer as:

    dz/dt = A(u) z + b(z, u) − P C' Rθ⁻¹ (C z − y(t))
    dP/dt = P (A(u) + b*(z, u))' + (A(u) + b*(z, u)) P − P C' Rθ⁻¹ C P + Qθ          (4.5)
    dθ/dt = µ(Id) F0(θ) + (1 − µ(Id)) λ (1 − θ)

where
− Rθ = θ⁻¹ R,
− Qθ = θ Δθ⁻¹ Q Δθ⁻¹,
− Δθ⁻¹ = diag(1, θ, θ², ..., θⁿ⁻¹),
− F0(θ) = (1/ΔT) θ² if θ ≤ θ1, and F0(θ) = (1/ΔT) (θ − 2θ1)² if θ > θ1,
− µ(Id) = (1 + e^(−β(Id − m)))⁻¹ is a sigmoid function parameterized by β and m (cf. Figure 4.10),

² One could ask about the compact subset required by Theorem 36. In the present situation, a compact subset would be a collection of three closed and bounded intervals. The problem arises from the exclusion of 0 as a possible value for x1. This is solved by picking any small ε > 0 and considering that for I = x1 < ε the motor is running too slowly to be of any practical use. Those trajectories are now in a compact subset of the state space.
− the innovation, Id, is defined by the formula

    Id(t) = ∫ from t−d to t of ‖y(s) − ŷ(t−d)(s)‖² ds          (4.6)

where
  – y(s) is the output of the DC machine (the current),
  – ŷ(t−d)(s) = x1(t − d, z(t − d), s), where x(t − d, z(t − d), s) is the solution of the normal form equations (4.2), (4.3), (4.4) over the time window [t − d, t], with initial condition x(t − d) = z(t − d), the estimated state at time t − d.
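In discretized form, with the output sampled every δ time units over the window [t − d, t], the innovation (4.6) reduces to a sum of squared output mismatches. A minimal sketch (in practice the predicted output would come from integrating the normal form (4.2)-(4.4) starting from z(t − d)):

```python
import numpy as np

def innovation(y_meas, y_pred, delta):
    """Discretized version of (4.6): the integral of the squared output
    mismatch over [t - d, t], sampled every delta time units."""
    y_meas, y_pred = np.asarray(y_meas), np.asarray(y_pred)
    return float(np.sum((y_meas - y_pred) ** 2) * delta)

# If the prediction matches the measurement exactly, innovation is zero
t = np.arange(0.0, 1.0, 0.1)      # window [t - d, t] with d = 1, delta = 0.1
y = np.sin(t)
assert innovation(y, y, 0.1) == 0.0

# A constant mismatch e over a window of length d gives roughly e**2 * d
Id = innovation(y + 0.5, y, 0.1)  # about 0.25
```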
4.2.2 Implementation Considerations

Figure 4.2: Observer structure: the state, Riccati and θ equations (dz/dt, dS/dt, dθ/dt) are fed by u(t) and y(t); a delay of length d feeds the innovation computation.

The simulation of the DC motor is straightforward; thus, here we only comment on the implementation of the observer. The computation of a solution of the algorithm above is decomposed into two main parts (see Figure 4.2):
1. the computation of the innovation, Id(t),
2. the update of the estimated state, of the elements of the Riccati matrix, and of the variable θ.
The latter part is common to all continuous-time Kalman-style observers. It presents no particular difficulties³ and is illustrated in Figure 4.3.
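As an illustration of the θ-update, the adaptation equation of (4.5) can be integrated on its own. A minimal sketch, assuming values of the order of those retained later in Table 4.1 (θ1 = 4, ΔT = 0.1, λ = 5) and forcing µ to its two extremes: a large innovation (µ = 1) drives θ up towards 2θ1, while a small innovation (µ = 0) makes θ decay back to 1.

```python
theta1, dT, lam = 4.0, 0.1, 5.0   # illustrative values (cf. Table 4.1)

def F0(theta):
    """Rising part of the adaptation law in (4.5)."""
    if theta <= theta1:
        return theta**2 / dT
    return (theta - 2 * theta1) ** 2 / dT

def theta_dot(theta, mu):
    """Right-hand side of dtheta/dt = mu*F0(theta) + (1-mu)*lam*(1-theta)."""
    return mu * F0(theta) + (1 - mu) * lam * (1 - theta)

# Explicit Euler integration of the adaptation equation alone
theta, dt = 1.0, 1e-4
for _ in range(20000):            # 2 time units with mu = 1: theta rises
    theta += dt * theta_dot(theta, 1.0)
theta_high = theta                # close to 2*theta1 = 8
for _ in range(50000):            # 5 time units with mu = 0: theta decays
    theta += dt * theta_dot(theta, 0.0)
theta_low = theta                 # close to 1
```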
In order to compute the innovation (refer to Figure 4.4) we need to:<br />
³ From theory, we know that the Riccati matrix is symmetric. Therefore we can solve the equation for only the upper triangular part or the lower triangular part of the matrix. The introduction of this artifice has two advantages:
− it saves computational time (the solution of n(n+1)/2 equations is needed instead of n²),
− because of tiny machine inaccuracies, the situation may arise where coefficients that are supposed to be equal are computed as unequal. This violates the symmetry requirement on the matrix and the algorithm breaks down.
To solve the matrix equations, the implementation of the observer requires a small utility that transforms (n × n) square matrices into vectors, and vice-versa.
Another device, which can be used when considering discrete-time systems, is to work directly with the square root of the Riccati matrix. This method is detailed in [43], Chapter 7.2.
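The matrix-to-vector utility mentioned in the footnote can be written in a few lines. The sketch below packs the upper triangular part of a symmetric matrix into a vector of n(n+1)/2 coefficients and rebuilds the full matrix on demand:

```python
import numpy as np

def sym_to_vec(S):
    """Pack the upper triangular part of a symmetric (n x n) matrix
    into a vector of length n*(n+1)/2."""
    n = S.shape[0]
    return S[np.triu_indices(n)]

def vec_to_sym(v, n):
    """Rebuild the full symmetric matrix from its packed upper triangle."""
    S = np.zeros((n, n))
    S[np.triu_indices(n)] = v
    return S + S.T - np.diag(np.diag(S))

P = np.array([[2.0, 0.5, 0.1],
              [0.5, 3.0, 0.2],
              [0.1, 0.2, 1.0]])
v = sym_to_vec(P)                 # 6 coefficients instead of 9
assert v.shape == (6,)
assert np.allclose(vec_to_sym(v, 3), P)
```

Solving the Riccati equation on the packed vector enforces symmetry by construction, which avoids the breakdown described above.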
single objective, contrary to the multi-objective tuning required for non-adaptive observers. The parameters are split into two categories (see Figure 4.6):
− those defining the performance of the system with respect to noise and perturbations:
  – the matrices Q and R are meant to provide decent noise filtering⁴,
  – θ1 characterizes the performance of the high-gain mode,
− the parameters related to the adaptation procedure:
  – the computation of the innovation: d, the delay, and δ, the time discretization used to calculate Id,
  – the rising time of θ: ΔT (ΔT appears in the function F0(θ) of the observer (4.5)),
  – the sigmoid function parameters: β, m = m1 + m2,
  – the speed of the decay of θ(t) when innovation is small: λ.
Let us propose a methodology for the tuning of our set of parameters.
Figure 4.6: The parameters in bold are crucial. Step 1 (without the adaptation equation): R, Q, θ1. Steps 2 and 3 (parameters of the adaptation equation): β, λ, ΔT, d, m1, δ, m2, θ1.
1. Non-adaptive parameter tuning (Q, R and θ1).
At this stage, simulations/experiments are done using the observer with F(θ, Id) = 0 (i.e. it is an extended Kalman filter, high-gain if θ(0) > 1).
First, we tune the classical EKF by making some simulations with noise, or by choosing some experimental data sets without large perturbations. Our goal here is to achieve acceptable smoothing of the noise. The input signal of the DC machine is set to u(t) = 120 + 12 sin(t). This makes the system state oscillate and eases the graphical analysis⁵. The observer's initial state is taken equal to the system's initial state whenever the latter is known. For all these simulations, the load torque is kept constant, equal to 0.55. We refer to this situation as the first scenario.
The matrices Q and R are required to be symmetric positive definite. Usually, in the continuous case, they are considered diagonal with positive coefficients:

⁴ In the stochastic setting, Q and R are the covariance matrices of the state (resp. output) noise. In our deterministic point of view, they constitute additional tuning parameters.
⁵ This is not a requirement at all; it is only here to provide a non-stationary output signal. This step can also be performed with a stationary system.
− R is set in order to reflect the measurement noise covariance,<br />
− the diagonal coefficients of Q are made larger for the state variables that are unknown or for which the model is less accurate (in our case, the load torque).
We decided to set R = 1 and attempted several different configurations for Q. Figure 4.7 shows plots of our estimate of the second state variable (red line) compared to the real values (black line) for several simulations. Comparing all the results in the figure, we consider that the tuning in the third panel (from the top) provides the observer with the best noise smoothing properties. The values of Q used in the subsequent experiments (fourth and fifth panels from the top) do not improve the performance, while making the observer really slow.
Contrary to what is normally done (see for example the application part of [38]), we choose values of the Q matrix that are much larger for the measured variable than for the unknown ones. With such a choice, the observer is slowed down with respect to the unmeasured variables, which means that those estimates converge quite slowly after a bad initialization and/or a sudden change in the load torque. The high-gain mode is meant to cope with those situations.
In order to determine an initial value for θ1, we simulate large disturbances in a second scenario. The input variable is kept constant⁶ (V = 120), and the load torque is increased from 0.55 to 2.55 at time t = 30. The initial state is still considered equal to the actual system state. Q and R are set to the values chosen previously. The high-gain parameter, θ(0) = θ0, is then chosen in order to achieve the best observer time response during this disturbance; the performance with respect to noise is ignored. The estimates of the load torque (third state variable) are plotted in Figure 4.9. We expect the noise to amplify the overshoot problem, and we therefore try to select θ0 such that overshoots are avoided as much as possible.
In the definition of the function F0 of the observer (4.5), θ1 has to be set such that 2θ1 = θ0, where θ0 denotes the value we just found.
2. Sigmoid function, innovation and adaptation procedure.
We now consider the fully implemented observer with dθ/dt = F(θ, Id) and θ(0) = 1. Several parameters can be set in this case regardless of the application:
− β and m: recall that according to Lemma 42 of Chapter 3 the function µ(Id) should possess the following features:

    µ(Id) = 0 if Id < γ0,    µ(Id) ∈ [0, 1] if γ0 ≤ Id < γ1,    µ(Id) = 1 if γ1 ≤ Id.
⁶ This is not a requirement at all. The input variable may still be varying when the changes in the load torque are made.
Figure 4.7: Tuning of the Q and R matrices. From top to bottom: Q = diag(1, 1, 5), Q = diag(1, 1, 1), Q = diag(1, 10⁻¹, 10⁻²), Q = diag(1, 10⁻², 10⁻⁴), Q = diag(1, 10⁻³, 10⁻⁶); real values in black, estimated state in red.

Figure 4.8: Sample of a simulation (scenario 1): rotation speed (ωr), load torque (Tl) and current (I).
Figure 4.9: Choice of a value for 2θ1: load torque estimates for θ0 = 5, 8, 10 and 15, compared to the real values; overshoots are visible.
We chose a sigmoid function whose equation appears in the definition of the observer (4.5). A graphical representation is displayed in Figure 4.10. The parameters β and m play the same role as the bounding parameters, γ0 and γ1, in the properties of the innovation shown above. The first parameter, β, controls the duration of the transition part of the sigmoid: the higher β, the shorter the transition. In practice, the best results are obtained for a short transition time (i.e. a large value of β)⁷.
When m = 0, µ(0) = 0.5. The role of the parameter m is to pull the sigmoid to the right. m is actually divided into two components, i.e. m = m1 + m2. We want m1 to be such that µ(Id) ≈ 0 when Id is around 0. Because of the presence of noise in the measured signal, we need to add m2 to this value; the procedure is explained below. The choice of m1 can be made either via a graphical method or by solving a nonlinear equation⁸.
− λ: contrary to the observer of [38], "λ small enough" is no longer required, since θ increases when the estimation error becomes too large. However, the situation may arise where the innovation oscillates up and down after some large disturbance.

⁷ Note that the need for a short transition time is consistent with the requirements expressed in Theorem 36 of Chapter 3: when θ is not high-gain and the estimation error is not sufficiently small, the state of the system is unclear, and we want to reach one of those two domains.
⁸ Let us choose 0 < ε1 < 1, small, and l, a transition length. We keep m = 0 and solve the equation µ(l/2) − µ(−l/2) = (1 − ε1) − ε1 in order to find the corresponding β. Now choose ε2 > 0, sufficiently small, to serve as the zero value (the equation µ(x) = 0 has no solution). Set β to the value found previously and solve µ(0) = ε2 for m.
With a high value of λ, θ oscillates as well. In order to give some resilience to θ in this case, λ should not be set too high; a value between 1 and 10 seems to be sufficient.
− ΔT (in the adaptation function of the observer (4.5)): the smaller ΔT, the shorter the rising time of θ. We take ΔT = 0.1, which is sufficiently small that the equation F0(θ) remains compatible with the ODE solver⁹.
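For this particular sigmoid, the two equations of footnote 8 can actually be solved in closed form, since with m = 0 it satisfies µ(−x) = 1 − µ(x). A minimal sketch (the values l = 0.01 and ε1 = ε2 = 0.01 are illustrative assumptions):

```python
import math

def mu(x, beta, m):
    """Sigmoid from observer (4.5)."""
    return 1.0 / (1.0 + math.exp(-beta * (x - m)))

def tune_sigmoid(l, eps1, eps2):
    """Footnote-8 procedure in closed form: by symmetry of the sigmoid,
    mu(l/2) - mu(-l/2) = 1 - 2*eps1 (with m = 0) gives mu(l/2) = 1 - eps1,
    hence beta; then solving mu(0) = eps2 gives m1."""
    beta = (2.0 / l) * math.log((1.0 - eps1) / eps1)
    m1 = math.log((1.0 - eps2) / eps2) / beta
    return beta, m1

beta, m1 = tune_sigmoid(l=0.01, eps1=0.01, eps2=0.01)
# Check the two defining equations
assert abs(mu(0.005, beta, 0.0) - mu(-0.005, beta, 0.0) - 0.98) < 1e-9
assert abs(mu(0.0, beta, m1) - 0.01) < 1e-9
```

Note that with ε1 = ε2 the procedure yields m1 = l/2 = 0.005, which matches the order of magnitude of the value of m1 retained in Table 4.1.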
As explained at the end of the previous subsection (i.e. Subsection 4.2.2), the innovation is considered as a discrete function of time. We therefore need to set the sampling time of this process, denoted δ. It depends on the measurement hardware and is not a critical parameter. Nevertheless, it seems intuitive that d/δ must be sufficiently high to at least reflect the observability rank of the system, i.e. d/δ ≥ n − 1. Indeed, the innovation is used as a direct measurement of distinguishability. A theoretical justification of this remark is provided in Section 5.2 of Chapter 5, where an adaptive continuous-discrete version of our observer is given.
The parameters d and m2 are closely related to the application, but in a very clear manner. When d is too small, the innovation is not sufficiently large to distinguish between an increase of the estimation error and the influence of noise. On the other hand, a value of d that is too high increases the computation time, as the prediction is made on a larger time interval. The value of d has to be chosen using our knowledge of the time constants of the system (from simulations or data samples); some fraction (e.g. 1/3 to 1/5) of the smallest time constant appears to be a reasonable choice.
Remark 45<br />
When the system encounters a <strong>high</strong> perturbation at time t, one would expect a delay<br />
of length d in the adaptation of θ. However, the inequality of Lemma 33 is valid for<br />
any delay 0
Figure 4.10: Effect of the parameters β and m on the shape of the sigmoid (curves for several values of β and m).
output signal without noise. In the case where the output signal is corrupted by a noise v(t), we have

    ymes(τ) = y(t − d, x(t − d), τ) + v(τ).

Therefore, with x(t − d) = z(t − d),

    Id(t) = ∫ from t−d to t of ‖y(t − d, x(t − d), τ) + v(τ) − y(t − d, z(t − d), τ)‖² dτ
          = ∫ from t−d to t of ‖v(τ)‖² dτ ≠ 0.          (4.7)

We use σ to denote the standard deviation of v(t). We first estimate m2 by the three-sigma rule, which seems reasonable and empirically sound. From equation (4.7), we then obtain the relation Id(t) ≤ 9σ²d, so m2 ≈ 9σ²d appears to be a reasonable choice. However, practice demonstrates that m2 computed in this way is overestimated. Although less common in engineering practice, we advise using a one-sigma rule¹⁰: m2 ≈ σ²d.
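The one-sigma rule can be checked numerically: for a pure noise signal, the discretized innovation (4.7) has expected value σ²d, well below the three-sigma bound 9σ²d. A quick Monte Carlo sketch with σ = 2 and d = 1 (the values of footnote 10):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, d, delta = 2.0, 1.0, 0.01
n = int(d / delta)

# Innovation computed on pure output noise: Id is the discretized integral
# of v(tau)^2 over a window of length d; its expected value is sigma^2 * d.
trials = np.array([
    np.sum(rng.normal(0.0, sigma, n) ** 2) * delta
    for _ in range(2000)
])
mean_Id = trials.mean()           # close to sigma**2 * d = 4
```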
3. Final tuning.
All the parameters being set as above, we run a series of simulations with the output signal corrupted by noise. The load torque is changed suddenly from 0.55 to 2.55 (scenario 2). We modify θ1 in order to shorten the convergence time of the observer when the system faces perturbations, while keeping overshoots as low as possible.

Remark 46
This methodology can also be applied to a hardware implementation. When a complete simulator of the process is not available, the observer can be tuned in open loop:
− when the plant is operating, more or less, in steady state, in order to tune the parameters related to noise filtering,
− when a perturbation occurs, in order to set the parameters related to adaptation.

¹⁰ In the present example: σ = 2 and d = 1, thus m2 = 4.
Parameter   Value                  Role
Q           diag(1, 10⁻¹, 10⁻²)    Filtering
R           1                      Filtering
θ1          4                      High-gain
β           1664π/e                Adaptation*
m1          0.005                  Adaptation*
m2          4                      Adaptation
λ           5                      Adaptation*
ΔT          0.1                    Adaptation*
d           1                      Innovation
δ           0.1                    Innovation*

Table 4.1: Final choice of parameters (*: application-free parameters).
4.2.4 Simulation Results
The performance of the observer is assessed via two scenarios:
− scenario 2: a single change in the load torque is performed at time 30,
− scenario 3: a series of changes is applied every 20 units of time; the sequence of values taken by Tl is [0.55, 2.5, 1.2, 1.5, 3, 0.8, 0.55].
The output of the system for each scenario is displayed in Figure 4.12. The estimation results are displayed in
− Figures 4.13 and 4.14 for scenario 2,
− Figures 4.15 and 4.16 for scenario 3.
In all the figures, the thick black line corresponds to the real values of the state variables. In each case the behaviors of several observers are shown:
− dark blue plot: estimation rendered by an extended Kalman filter, with the Q and R matrices given in Table 4.1,

Scenario 1: u(t) = 120 + 12 sin(t), Tl(t) = 0.55 for all t ≥ 0.
Scenario 2: u(t) = 120, Tl(t) = 0.55 for t ∈ [0; 30[, then Tl(t) = 2.55 for t ∈ [30; 100[.
Scenario 3: u(t) = 120, Tl(0) = 0.55; Tl(20k), k > 0, changes according to the sequence [0.55, 2.5, 1.2, 1.5, 3, 0.8, 0.55].

Figure 4.11: The simulation scenarios.
− light blue curve: estimation rendered by a high-gain extended Kalman filter with θ = 7,
− red line: estimation by the adaptive high-gain extended Kalman filter.
Figure 4.12: Output signal for the two scenarios (top: scenario 2, time range [0; 100]; bottom: scenario 3, time range [0; 200]).
In scenario 2, when the perturbation occurs (at time 30), we see that, as expected, the perturbation is detected by the innovation, which crosses the y = m2 red line of Figure 4.17. The high-gain parameter then increases and convergence is made more effective. We see that when no perturbation occurs, the innovation remains below m2 and the behavior of the observer is the same as that of the extended Kalman filter¹¹. The speed of convergence after the perturbation is comparable to that of the high-gain extended Kalman filter, modulo a small delay that corresponds to the time needed:
1. for the perturbation to have an effect on the output,
2. for the perturbation to be detected,
3. and for the high-gain to rise.
Notice that this delay also depends on the parameter δ, the sampling time set for the computation of the innovation: since θ reacts on behalf of the innovation, its behavior can change only every δ units of time.

¹¹ Actually, the performance of the AEKF versus that of the EKF also depends on the level of the noise and on the system under consideration. For example, if the noise level is really high, one would probably set the matrix Q to a very low value, thus rendering the EKF even slower. This very low value of Q has only little effect on the AEKF behavior in high-gain mode; the AEKF would then be as quick to respond as in the present situation while the EKF would be slower. In [21] and [22], the AEKF is used in some other examples, providing additional insight into the differences between the EKF and the AEKF.
Figure 4.13: Scenario 2: estimation of the rotation speed (real signal; estimation via EKF, with HG-EKF and with AEKF).

Figure 4.14: Scenario 2: estimation of the load torque (real signal; estimation via EKF, with HG-EKF and with AEKF).
In scenario 3, we notice that at some unexpected moments the adaptive high-gain observer behaves like the non high-gain observer (e.g. Figure 4.15). Looking at Figure 4.18, we see that θ did not actually increase for t ∈ [40; 80] and t ∈ [100; 160]. The explanation lies in Figure 4.12: the corresponding sudden changes in the load torque were not large enough to have a significant effect on the output signal.
As in the previous scenario, the adaptive observer presents two advantages with respect to the two other filters, namely improved noise rejection and increased speed of convergence in the event of perturbations.
[Plot: rotation speed vs. time, showing the real signal and the estimate with a HG-EKF.]<br />
Figure 4.15: Scenario 3: Estimation of the rotation speed.<br />
We conclude this section with Figure 4.19, which shows the estimation obtained from an<br />
adaptive <strong>high</strong>-<strong>gain</strong> observer with a poorly chosen value for m2. Because it is too small, θ<br />
increases when it is not needed. The corresponding innovation plot is provided in Figure<br />
4.20.<br />
[Plot: two panels vs. time, showing the evolution of θ and the innovation.]<br />
Figure 4.18: Scenario 3: Innovation.<br />
Figure 4.19: Estimation with an adaptive <strong>extended</strong> <strong>Kalman</strong> <strong>filter</strong> with a too low m2<br />
parameter.<br />
[Plot: the high-gain parameter θ (m2 = 1.8) and the innovation vs. time.]<br />
Figure 4.20: Innovation for a low m2 parameter.<br />
4.3 real-time Implementation<br />
In order to investigate in depth the way the adaptive <strong>high</strong>-<strong>gain</strong> <strong>extended</strong> <strong>Kalman</strong> <strong>filter</strong><br />
works, we implemented the <strong>filter</strong> on a DC machine in our laboratory at the University of<br />
Luxembourg. The issues that concerned us when performing those experiments were:<br />
− the study of the feasibility of a real-time implementation, since:<br />
1. solving the Riccati equation requires the integration of n(n+1)/2 differential<br />
equations, where n denotes the dimension of the state space,<br />
2. the computation of the innovation appears to be time intensive 12 ,<br />
− the study of the influence of unknown <strong>and</strong> non-estimated modeling errors on the adaptation<br />
scheme <strong>and</strong> finding a way to handle them.<br />
We considered the real-time aspect to be the major issue of this set of<br />
experiments. Therefore we decided to use a hard real-time operating system.<br />
4.3.1 Software<br />
In computer science, real-time computing is the study of hardware <strong>and</strong> software systems<br />
that are subject to operational deadlines between an event <strong>and</strong> the system response. The real-time<br />
framework can be divided into two parts: soft <strong>and</strong> hard real-time. A requirement is considered<br />
to be hard real-time whenever the completion of an operation after its deadline is considered<br />
useless. When limited delays in the response time can be accepted, or in other words when<br />
we can afford to wait for the end of computations, the system is said to be soft real-time.<br />
12 This question was also raised by M. Farza <strong>and</strong> G. Besançon at an early stage of this work, on the occasion<br />
of the conference [25].<br />
The most common approach consists of defining a real-time task <strong>and</strong> using a clock that<br />
sends signals to the system at a frequency set by the user. Each time the clock sends a<br />
signal, the real-time task is launched, regardless of whether the previous instance has completed.<br />
In other words, a hard real-time system does not make a task conditional with respect to the real-time<br />
constraints: it forces the designer of the task to respect those constraints.<br />
We chose a Linux based real-time engine provided with a full development suite: RTAI-Lab<br />
[34].<br />
RTAI-Lab is composed of several software components including:<br />
− RTAI [5]: RTAI is a user-friendly real-time operating system.<br />
The Linux O.S. suffers from a lack of real-time support: to obtain real-time behavior, it is<br />
necessary to change the kernel source code. RTAI is an add-on to the Linux kernel core<br />
that provides it with the features of an industrial real-time operating system within a<br />
full non-real-time operating system (access to TCP/IP, graphical display <strong>and</strong> windowing<br />
systems, etc.).<br />
Basically, RTAI is a non-intrusive interrupt dispatcher: it traps the peripheral interrupts<br />
<strong>and</strong>, when necessary, re-routes them to Linux. It uses a concept called Hardware<br />
Abstraction Layer to get information into <strong>and</strong> out of the kernel with only a few dependencies.<br />
RTAI treats Linux as a background task running when no real-time activity<br />
occurs.<br />
− Comedi [4]: Comedi is a collection of drivers for a variety of common data acquisition<br />
plug-in boards. The drivers are implemented as a core Linux kernel module.<br />
− Scilab/Scicos [6–8, 41]: Scilab is a free scientific software package for numerical computation<br />
similar to Matlab. The software was initially developed at INRIA <strong>and</strong> is now<br />
under the guidance of the Scilab consortium (see the history section of [8]).<br />
Scicos is a graphical dynamical system modeler <strong>and</strong> simulator (or Computer Aided<br />
Control System Design software) developed by the METALAU group at INRIA. It<br />
provides a block-oriented development environment that can be found either embedded<br />
into Scilab or in the ScicosLab distribution 13 .<br />
− RTAI-Lib [5, 34]: RTAI-Lib is a Scicos palette, i.e. a collection of blocks to use with<br />
Scicos. This palette is specific to the real-time issues RTAI is dealing with. These<br />
blocks can be used to generate a real-time task (which is not the case of the regular<br />
Scicos blocks).<br />
− Xrtailab [34]: Xrtailab is an oscilloscope-like application that takes care of communications<br />
between the non-real-time part of the platform (i.e. the Linux O.S., the graphic displays)<br />
<strong>and</strong> the real-time executable when it is active. With Xrtailab, it is possible to plot <strong>and</strong><br />
record signals <strong>and</strong> to change the parameter values of the simulation blocks online (e.g. PI<br />
<strong>and</strong> PID coefficients, braking activation).<br />
13 Notice that recent versions of Scilab (2010) do not seem to include Scicos anymore, but a similar<br />
utility called Xcos.<br />
[Screenshot: Scilab launched from the Konsole terminal and Scicos called from the Scilab console; the diagram shows the real-time clock and the superblock containing the real-time task.]<br />
Figure 4.21: Graphical Implementation of a real-time task.<br />
For portability <strong>and</strong> flexibility reasons, we used a Linux Live CD 14 comprising the RTAI-<br />
Lab suite: RTAI-Knoppix 15 [3, 94]. When we operate in real time, that is to say when an<br />
observer is running, the real-time tasks do not require any hard-drive access. Consequently, there is<br />
no difference between the Linux Live CD <strong>and</strong> a regular Linux installation.<br />
The development of a real-time executable is done from Scicos, launched from Scilab<br />
as shown in Figure 4.21. A Scicos diagram that is meant to be compiled as a real-time<br />
application is composed of two blocks:<br />
− an external clock (in red),<br />
− a Scicos superblock that contains the whole real-time task (in black).<br />
The only input to this block is the external clock signal. Communication between the system<br />
<strong>and</strong> the real-time task is done using specific blocks (signal generation, Scopes, Analog/Digital<br />
<strong>and</strong> Digital/Analog blocks [34]). They can be found in the RTAI-Lib palette 16 .<br />
The graphical program obtained is compiled into a real-time executable with the help of<br />
an automatic code generator. Figure 4.22 shows the three steps of the compilation:<br />
14 A Live CD is an O.S. that deploys directly from the CD. No specific installation is needed on the host<br />
machine. The programs <strong>and</strong> real-time tasks can be provided via an external storage source such as a USB key.<br />
In short, as far as software is concerned, the only hardware devices required are the CD with<br />
RTAI-Knoppix <strong>and</strong> a USB key.<br />
15 The version we used was built on Linux kernel 2.6.17 (an SMP-enabled kernel is available) <strong>and</strong> embedded<br />
with<br />
− RTAI version 3.4<br />
− Scilab-4.0/Scicos CACSD platform.<br />
16 In Scicos language, a palette is a collection of predefined blocks.<br />
− the selection of the compilation options,<br />
− the setting of the real-time parameters (sample time of the real-time executable, etc.),<br />
− the display of a successful compilation notification.<br />
The program compiled above is called Phd_DEMO. This real-time executable is<br />
started from the system console with one of the two following command lines:<br />
− ./Phd_DEMO -f xx<br />
− ./Phd_DEMO<br />
In the first case, the -f option specifies that a duration of execution is provided. The xx<br />
symbols have to be replaced by the desired length of time, in seconds. In the second case, no<br />
duration is given <strong>and</strong> the program runs until it is stopped by the user. In RTAI-Lab, there<br />
is only one way to properly stop a real-time executable while it is running, namely to use the<br />
Xrtailab application.<br />
As with Scilab, Xrtailab can be launched from a system console. The use of Xrtailab is<br />
illustrated in Figure 4.23. The first two images give an account of the connection procedure.<br />
The last image demonstrates a few possibilities of Xrtailab, such as the display of signals<br />
entering a Scope block or the ability to change block parameters on the fly.<br />
A more detailed explanation of how the whole RTAI-Lab suite works may be found in<br />
[34].<br />
4.3.2 Hardware<br />
The testbed is composed of<br />
− a DC motor from Lucas Nuelle (ref. SE2665-5C) that can be connected in series. This<br />
machine is coupled<br />
– on one end with a tachometer 17 (ref. 2662-5U),<br />
– on the other end with a propeller 18 (47.0 x 30.5 cm, with a 5 ◦ pitch).<br />
The propeller is attached to a 52 mm center hub. Those parts were manufactured<br />
by Aero-naut (ref. 7234/97),<br />
− a programmable DC source from Delta Electronika (SM-300-10D),<br />
− an I/O card from National Instruments (6024E-DAQ) that allows for communications<br />
between the physical system <strong>and</strong> the control system.<br />
A communication diagram showing the relationship between the different elements of the<br />
testbed is provided in Figure 4.24. The figure shows:<br />
17<br />
The tachometer is only used in order to compare the estimated speed to the real one, <strong>and</strong> to calibrate the<br />
mathematical model (see Subsection 4.3.3).<br />
18<br />
The brake we had at our disposal was not working properly. We therefore decided to pursue the experiments<br />
without it. As can be seen from the second equation of system (4.1), if the resistive torque is not<br />
sufficient, the machine is likely to run away. The role of the propeller is to provide the motor<br />
with a sufficiently <strong>high</strong> <strong>and</strong> stable resistive torque. We chose our propeller on the basis of the study [44].<br />
Figure 4.22: Compilation of a real-time task.<br />
− that measurements of the current, voltage <strong>and</strong> speed are fed to the hard real-time O.S.,<br />
− <strong>and</strong> that set values for the voltage are delivered to the power supply.<br />
The hard real-time O.S. therefore has 3 input signals <strong>and</strong> 1 output signal.<br />
Figure 4.25 displays a picture of the testbed. Also shown in the photograph is the<br />
friction tool n ◦ 1. The undetermined perturbations were produced by means of hand-applied<br />
braking friction on the motor’s (back) shaft.<br />
The computer was a Dell PC equipped with a Pentium IV 3 GHz processor <strong>and</strong> 512 MB of DDR2<br />
SDRAM memory.<br />
4.3.3 Modeling<br />
A model for the series-connected DC machine has been proposed in Section 4.1.1 above.<br />
We now need to adapt this model to the testbed as the presence of the propeller needs to be<br />
taken into account. In short form, this model can be written<br />
İ = (V − R.I − Laf Iωr)/L<br />
ω̇r = (Laf I² − Tres)/J<br />
where Tres is the overall resistive torque. The model from Section 4.1.1 is changed in the two<br />
following ways:<br />
Figure 4.23: Xrtailab.<br />
1. we can no longer keep the assumption of ideal machine efficiency. Therefore,<br />
instead of considering a unique mutual inductance Laf we separate it into two distinct<br />
parameters: Laf1 <strong>and</strong> Laf2 ,<br />
2. the resistive torque Tres is modeled as the sum of<br />
− a viscous friction torque generated by the contacts inside the motor: Bωr,<br />
− a torque due to the presence of the propeller, which is modeled according to<br />
the technical report [44] as pωr^2.08,<br />
− an unknown perturbation torque (i.e. braking), just as before: Tl.<br />
The final model thus obtained is<br />
İ = (V − R.I − Laf1 Iωr)/L<br />
ω̇r = (Laf2 I² − Bωr − pωr^2.08 − Tl)/J.<br />
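As an illustration, the final model can be integrated numerically. The sketch below uses placeholder parameter values (not the identified testbed values of Subsection 4.3.3) and a simple forward-Euler scheme:

```python
import numpy as np

# Placeholder parameter values for illustration only; the actual values of
# R, L, B, J, Laf1, Laf2 and p are identified in Subsection 4.3.3.
R, L, B, J = 1.0, 0.1, 0.01, 0.05
Laf1, Laf2, p = 0.05, 0.05, 1e-5

def f(x, V, Tl=0.0):
    """Right-hand side of the final model, state x = (I, omega_r)."""
    I, w = x
    dI = (V - R * I - Laf1 * I * w) / L
    dw = (Laf2 * I**2 - B * w - p * w**2.08 - Tl) / J
    return np.array([dI, dw])

def simulate(x0, V, T=5.0, dt=1e-4):
    """Forward-Euler integration under a constant input voltage V."""
    x = np.array(x0, dtype=float)
    for _ in range(int(T / dt)):
        x = x + dt * f(x, V)
    return x

I_end, w_end = simulate([0.0, 0.0], V=54.0)
```

With a non-zero load torque Tl the steady-state speed drops, which is precisely the effect the observer has to reconstruct.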
Neither the observability analysis nor the change of variables that brought the model<br />
into its normal form is affected by these modifications. Therefore, in the same manner as<br />
before, we can estimate the load torque Tl by adding a third equation. The corresponding<br />
Figure 4.24: Connections Diagram.<br />
Figure 4.25: A view of the testbed (<strong>and</strong>, on the left, the friction tool).<br />
observability normal form is then<br />
ẋ1 = −(Laf1/L) x2 + (1/L)(u(t) − Rx1)<br />
ẋ2 = −(1/J) x3 + (1/L)(u(t)x2/x1 − Rx2 − Laf1 x2²/x1) + (1/J)(Laf2 x1³ − Bx2 − p x2^2.08/x1^1.08)<br />
ẋ3 = (1/L)(u(t)x3/x1 − Laf1 x2x3/x1 − Rx3)<br />
with x1 = I, x2 = Iωr <strong>and</strong> x3 = ITl.<br />
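Under the reading x1 = I, x2 = Iωr, x3 = ITl (an assumption of this sketch), the normal form can be checked numerically against the chain-rule derivatives of (x1, x2, x3) along the original model, with arbitrary parameter values:

```python
# Arbitrary parameter values; the check is structural, not quantitative.
R, L, B, J = 1.0, 0.1, 0.01, 0.05
Laf1, Laf2, p = 0.05, 0.05, 1e-5

def model(I, w, Tl, u):
    """Original model: dI/dt and dw/dt (Tl is treated as constant)."""
    dI = (u - R * I - Laf1 * I * w) / L
    dw = (Laf2 * I**2 - B * w - p * w**2.08 - Tl) / J
    return dI, dw

def normal_form(x1, x2, x3, u):
    """Normal-form vector field in the coordinates (x1, x2, x3)."""
    dx1 = -(Laf1 / L) * x2 + (u - R * x1) / L
    dx2 = (-(1.0 / J) * x3
           + (u * x2 / x1 - R * x2 - Laf1 * x2**2 / x1) / L
           + (Laf2 * x1**3 - B * x2 - p * x2**2.08 / x1**1.08) / J)
    dx3 = (u * x3 / x1 - Laf1 * x2 * x3 / x1 - R * x3) / L
    return dx1, dx2, dx3

I, w, Tl, u = 3.0, 120.0, 0.2, 54.0
dI, dw = model(I, w, Tl, u)
lhs = normal_form(I, I * w, I * Tl, u)
rhs = (dI, dI * w + I * dw, dI * Tl)   # chain rule; dTl/dt = 0
```

Both vector fields agree at any test point, which confirms that the change of variables is consistent with the model.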
This model needs to be calibrated, i.e., we need to find appropriate values for the parameters<br />
R, L, B, J, Laf1, Laf2 <strong>and</strong> p. The values were found following a standard approach: first,<br />
we applied a linear method in order to identify some of the parameters; second, we used<br />
a nonlinear optimization technique to obtain more accurate values for the complete set of<br />
parameters.<br />
1. From data samples (voltage, current <strong>and</strong> speed) an initial estimation is obtained via<br />
the least mean squares technique.<br />
Let us consider the initial model (not in normal form) at steady state. Suppose that<br />
L = J = 1 <strong>and</strong> that the load torque (Tl) is null. With the parameters R, Laf, B, p as<br />
unknowns (i.e. Laf1 = Laf2 ), the model becomes 19 :<br />
(V, 0)ᵀ = [ I, Iωr, 0, 0 ; 0, I², −ωr, −ωr^2.08 ] (R, Laf, B, p)ᵀ.<br />
From a series of experiments we collected enough data to constitute three sets of values:<br />
(V1, ..., VN), (I1, ..., IN) <strong>and</strong> ((ωr)1, ..., (ωr)N) <strong>and</strong> found a mean least squares solution<br />
to the equation above.<br />
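Step 1 can be sketched as one stacked least-squares problem. The data below are synthetic, generated from hypothetical "true" parameters so that the recovery can be verified; the signs in the second block row are chosen to match the mechanical steady-state equation:

```python
import numpy as np

# Hypothetical "true" values used only to fabricate consistent steady-state
# data; on the testbed, (V, I, w) triples are measured.
R_t, Laf_t, B_t, p_t = 1.0, 0.05, 0.01, 1e-5

w = np.linspace(50.0, 250.0, 30)                  # rotation speeds
I = np.sqrt((B_t * w + p_t * w**2.08) / Laf_t)    # mechanical steady state
V = R_t * I + Laf_t * I * w                       # electrical steady state

# Two equations per sample, unknown vector theta = (R, Laf, B, p):
#   V = R*I + Laf*I*w    and    0 = Laf*I^2 - B*w - p*w^2.08
A = np.zeros((2 * len(w), 4))
b = np.zeros(2 * len(w))
A[0::2, 0], A[0::2, 1], b[0::2] = I, I * w, V
A[1::2, 1], A[1::2, 2], A[1::2, 3] = I**2, -w, -w**2.08
theta, *_ = np.linalg.lstsq(A, b, rcond=None)
R_e, Laf_e, B_e, p_e = theta
```

With noisy measurements the recovery is only approximate, which is why the nonlinear refinement of step 2 is needed.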
2. At this stage a nonlinear optimization routine was used 20 . The solutions found above,<br />
together with L = J = 1 <strong>and</strong> Laf = Laf1 = Laf2 , were taken as the initial values of the<br />
optimization routine. The cost function to minimize is taken as the distance between<br />
the measured data <strong>and</strong> the predicted trajectory:<br />
K(L, R, Laf1 , J, Laf2 , B, p) ↦→ α1 ∫₀^T∗ ‖I(v) − Ĩ(v)‖² dv + α2 ∫₀^T∗ ‖ωr(v) − ω̃r(v)‖² dv,<br />
where<br />
− I <strong>and</strong> ωr are measured data that excite the dynamical modes of the process<br />
(in a pseudo-random binary sequence manner [85]),<br />
− Ĩ <strong>and</strong> ω̃r are the predicted trajectories obtained from the model above using the<br />
measured input variables <strong>and</strong> Ĩ(0) = I(0) <strong>and</strong> ω̃r(0) = ωr(0),<br />
− <strong>and</strong> α1, α2 are weighting factors that may be used to compensate for the difference<br />
of scale between the current <strong>and</strong> speed amplitudes.<br />
3. The situation may arise where the solution found by the optimization routine is a local<br />
minimum. In this case, the resulting set of parameters doesn’t match the experimental<br />
data although the search algorithm stops. A solution would be to slightly modify the<br />
output of the algorithm <strong>and</strong> relaunch the search.<br />
Because of the great number of parameters <strong>and</strong> the difficulty in analyzing the cost<br />
function, this re-initialization is rather hazardous. It is difficult to determine which<br />
parameters shall be modified, <strong>and</strong> how. We propose to compare the measured data (I<br />
<strong>and</strong> ωr) to the predictions (Ĩ <strong>and</strong> ω̃r), using the initial set of parameters, <strong>and</strong> to find<br />
p1 (resp. p2) such that p1I ≈ Ĩ (resp. p2ωr ≈ ω̃r). Then we reuse the initial model in<br />
order to determine how to modify the parameters, e.g.<br />
L (dĨ/dt) = V − RĨ − Laf1 Ĩ ω̃r<br />
L (p1 dI/dt) = V − R(p1I) − Laf1 (p1I)(p2ωr)<br />
(p1L)(dI/dt) = V − (Rp1)I − (Laf1 p1p2)Iωr.<br />
Although non-standard, this method gives us new initial values for a new optimization<br />
search. In practice, it produced excellent results.<br />
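The comparison trick of step 3 amounts to two scalar least-squares fits followed by a parameter re-scaling; a sketch with placeholder data:

```python
import numpy as np

# Placeholder series: I_meas/w_meas stand for measured data, I_pred/w_pred
# for the model predictions obtained with the current parameter set.
I_meas = np.array([2.0, 3.0, 4.0, 5.0])
w_meas = np.array([100.0, 150.0, 200.0, 250.0])
I_pred, w_pred = 1.5 * I_meas, 0.8 * w_meas       # pretend predictions

p1 = float(I_meas @ I_pred) / float(I_meas @ I_meas)   # p1*I ~ I_pred
p2 = float(w_meas @ w_pred) / float(w_meas @ w_meas)   # p2*w ~ w_pred

# Re-scale (L, R, Laf1) accordingly and relaunch the optimization from there:
L0, R0, Laf0 = 0.1, 1.0, 0.05
L_new, R_new, Laf_new = p1 * L0, p1 * R0, p1 * p2 * Laf0
```

The rescaled set satisfies the model equations with the measured signals at least as well as the old one, which makes it a sensible restart point.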
19 If we keep Laf1 ≠ Laf2 , the least squares solution of the second line is trivially (0, 0, 0).<br />
20 We kept it simple: the simplex search method (see, for example, [88, 104]).<br />
4.3.4 Implementation Issues<br />
Once the Live CD is loaded <strong>and</strong> the computer is running under the Linux distribution,<br />
the real-time environment has to be activated in three stages:<br />
1. several RTAI <strong>and</strong> COMEDI modules are loaded with the following shell comm<strong>and</strong>s (Cf.<br />
[34], page 11) 21 :<br />
insmod /usr/realtime/modules/rtai_hal.ko
insmod /usr/realtime/modules/rtai_up.ko    # or rtai_lxrt.ko
insmod /usr/realtime/modules/rtai_fifos.ko
insmod /usr/realtime/modules/rtai_sem.ko
insmod /usr/realtime/modules/rtai_mbx.ko
insmod /usr/realtime/modules/rtai_msg.ko
insmod /usr/realtime/modules/rtai_netrpc.ko ThisNode="127.0.0.1"
insmod /usr/realtime/modules/rtai_shm.ko
insmod /usr/realtime/modules/rtai_leds.ko
insmod /usr/realtime/modules/rtai_signal.ko
insmod /usr/realtime/modules/rtai_tasklets.ko
modprobe ni_pcimio
modprobe comedi
modprobe kcomedilib
modprobe comedi_fc
insmod /usr/realtime/modules/rtai_comedi.ko
2. the calibration of input/output signals is taken over by a routine provided with the<br />
Comedi drivers. The corresponding shell comm<strong>and</strong>s are:<br />
comedi_config /dev/comedi0 ni_pcimio
comedi_calibrate --no-calibrate -S ni6024e.calibration
chmod 666 /dev/comedi*
During this calibration, large amplitude step impulses are sent to the motor. In this<br />
part of the procedure, it is essential for safety that the power supply BE OFF.<br />
At the end of the calibration, when the routine stops, the voltage step IS UP (i.e. a <strong>high</strong><br />
amount of voltage is delivered as soon as one turns the power supply on). This problem<br />
is addressed by implementing a (dummy) program 22 that generates any input signal<br />
<strong>and</strong> sends it to the power supply. The corresponding executable is then run for a few<br />
seconds, since at the end of every RTAI-Lab generated program, the output signals generated by<br />
the program are reset to zero. Suppose that this program is named end_of_load.cos. The<br />
corresponding real-time executable, after compilation, is end_of_load.exe. The command<br />
lines for a full <strong>and</strong> safe calibration are then:<br />
comedi_config /dev/comedi0 ni_pcimio
comedi_calibrate --no-calibrate -S ni6024e.calibration
chmod 666 /dev/comedi*
./end_of_load -f 5
21 Without Super User rights, those commands are ignored. Employing the sudo command is sufficient.<br />
22 i.e. a clock, any signal source (sine wave or step) <strong>and</strong> a Digital/Analog block.<br />
3. the RTAI version (i.e. 3.4) that is embedded in RTAI-Knoppix has an error in<br />
the application file that links the Scicos Analog/Digital block (Analog to digital) of the<br />
RTAI-lib palette to the final real-time executable. The consequence of this error is<br />
that it is impossible to measure more than one signal. After a few discussions<br />
with the RTAI community, a solution was provided to us by R. Bucher in the form of a<br />
corrected version of the faulty file: rtai4_comedi_datain.sci. The corrected code is given in<br />
Appendix C.1. The files to be replaced are located on the virtual file system UNIONFS<br />
deployed by the Live CD, in the following directories:<br />
/UNIONFS/usr/src/rtai-3.4/rtai-lab/scilab/macros/RTAI/
/UNIONFS/usr/local/scilab-4.0/macros/RTAI/
It is also necessary to recompile the file in the second directory using the shell command<br />
scilab-comp rtai4_comedi_datain.sci
To do this, one need only copy/paste the code of Appendix C.1 into any text editor, save<br />
this file with the name rtai4_comedi_datain.sci <strong>and</strong> perform the following commands<br />
(MY_USB_KEY has to be replaced with the correct path name):<br />
sudo rm /UNIONFS/usr/local/scilab-4.0/macros/RTAI/rtai4_comedi_datain.sci
sudo rm /UNIONFS/usr/local/scilab-4.0/macros/RTAI/rtai4_comedi_datain.bin
sudo rm /UNIONFS/usr/src/rtai-3.4/rtai-lab/scilab/macros/RTAI/rtai4_comedi_datain.sci
sudo cp MY_USB_KEY/rtai4_comedi_datain.sci /UNIONFS/usr/local/scilab-4.0/macros/RTAI/rtai4_comedi_datain.sci
sudo cp MY_USB_KEY/rtai4_comedi_datain.sci /UNIONFS/usr/src/rtai-3.4/rtai-lab/scilab/macros/RTAI/rtai4_comedi_datain.sci
cd /UNIONFS/usr/local/scilab-4.0/macros/RTAI/
sudo scilab-comp rtai4_comedi_datain.sci
The real-time platform is now fully functional.<br />
The I/O signal is sampled sufficiently fast as compared to the motor time constant. We<br />
implement the continuous version of the adaptive <strong>high</strong>-<strong>gain</strong> <strong>extended</strong> <strong>Kalman</strong> <strong>filter</strong>. The<br />
general structure of the observer is the same as before. It is displayed in Figure 4.26 together<br />
with the corresponding Scicos diagram:<br />
− as before, the observer consists of two user-defined functions that work in much the same<br />
way as Matlab S-functions,<br />
− the update of the observer equations is performed continuously <strong>and</strong> the computation<br />
of the innovation is done in the discrete time framework,<br />
− a delay block of length d guarantees that we have access at time t to the estimated<br />
state computed at time t − d (with the default value being equal to the initial guess of<br />
the observer),<br />
− two Analog/Digital blocks get the measured voltage (control variable of the machine)<br />
<strong>and</strong> current (output variable of the machine),<br />
− the last two blocks are <strong>gain</strong> factors that adjust the measured signals to the appropriate<br />
scale 23 .<br />
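The innovation over the window [t − d, t] is, roughly speaking, the integral of the squared distance between the measured output and the output predicted by re-simulating the model from the delayed estimate. A minimal sketch, with a toy two-state system and an Euler prediction step standing in for the DC-machine model:

```python
import numpy as np

# Toy dynamics standing in for the machine model; the output is the first state.
def f(x, u):
    return np.array([-x[0] + u, x[0] - x[1]])

def innovation(x_delayed, u_win, y_win, dt):
    """Integral of |y - y_hat|^2 over the window of length len(u_win)*dt."""
    x = np.array(x_delayed, dtype=float)
    acc = 0.0
    for u, y in zip(u_win, y_win):
        acc += (y - x[0]) ** 2 * dt
        x = x + dt * f(x, u)          # one Euler prediction step
    return acc

dt, n = 0.01, 100                      # window length d = n*dt = 1 s
inn = innovation([0.0, 0.0], np.ones(n), np.zeros(n), dt)
```

A large innovation signals a mismatch between the model and the measurements, which is what drives θ upward in the adaptation scheme.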
The implementation of the two main blocks is done with the RTAICblock block that<br />
appears in the RTAI-lib palette. It is an adaptation of the Cblock2 block of the palette<br />
Others 24 .<br />
Generally speaking, Scicos blocks are composed of two files: the interfacing function <strong>and</strong><br />
the computational function. The role of the interfacing function is to link the computational<br />
function to Scicos. It defines how the computational function has to be interpreted <strong>and</strong> what<br />
the appearance of the Scicos block actually is (number of entry points, size <strong>and</strong> name of the<br />
block, etc.). The computational function is the core of the block. It defines what the block<br />
does. Most of the time a flag parameter is used to specify which part of the computational<br />
function has to be considered. As may be guessed from its name, the computational function<br />
of this block has to be written in C code. The structure is similar to that of a Matlab<br />
S-function (see [41], Chpt. 9, in particular Sec. 9.5.2).<br />
The calculation of a solution for the Riccati equation requires several matrix multiplications.<br />
We examined the two following approaches:<br />
1. Scicos is embedded into Scilab, which deals pretty smoothly with the multiplication of<br />
matrices. Scilab is built from several FORTRAN, C <strong>and</strong> C++ routines <strong>and</strong> is open<br />
source. This means that the original files are serviceable from Scilab source code. In<br />
order to use those routines in C, the header<br />
#include <machine.h><br />
is required. The two routines we need are<br />
(a) extern int C2F(dmmul)(); that takes the matrices A, B, C as input parameters<br />
<strong>and</strong> outputs the matrix C = A × B,<br />
(b) extern int C2F(dmmul1)(); that takes the matrices A, B, C as input parameters<br />
<strong>and</strong> outputs the matrix C = C + A × B.<br />
Combinations of those two functions enable us to perform all the matrix multiplications<br />
required. In addition, recall that the Riccati matrix of <strong>Kalman</strong>-like <strong>filter</strong>s is square<br />
symmetric, i.e., for an (n × n) matrix, only n(n + 1)/2 integrations (or updates)<br />
are required. A small program that transforms the square matrix into a corresponding<br />
column vector, <strong>and</strong> vice versa, must be developed.<br />
2. the matrices used have particular shapes:<br />
− A(u) is an upper diagonal matrix,<br />
− Q <strong>and</strong> R are taken diagonal,<br />
− b ∗ (z, u) is lower triangular.<br />
23 The I/O card delivers a digital input positive signal on the range (0 − 5), while the maximum current<br />
supported by the DC supply is 10 A. The scaling factor is therefore 2. The scaling factor for the voltage is 60.<br />
24 The palette Others is a regular Scicos palette. Since those two functions need to interact with the real-time<br />
routine of the operating system, they have to be taken from the RTAI-lib palette.<br />
[Diagram: observer block with inputs u(t) and y(t), internal equations dz/dt, dS/dt and dθ/dt, a delay of length d, and the innovation computation producing the estimate z(t).]<br />
Figure 4.26: Decomposition of the adaptive-<strong>gain</strong> <strong>extended</strong> <strong>Kalman</strong> <strong>filter</strong>.<br />
(development diagram <strong>and</strong> Scicos block of the real-time task)<br />
Consequently, developing the equations on paper gives us simplified expressions. In<br />
addition, useless computations are avoided 25 .<br />
In the case of the DC machine, with a model made up of 3 equations, the second solution is<br />
definitely the one to use.<br />
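The pack/unpack utility mentioned in the first approach can be sketched as follows (in Python for readability, whereas the actual computational function is written in C):

```python
import numpy as np

# The Riccati matrix S is symmetric: only n*(n+1)/2 entries need to be
# stored and integrated. Pack the upper triangle into a vector, and back.
def pack(S):
    n = S.shape[0]
    return S[np.triu_indices(n)]           # length n*(n+1)/2

def unpack(v, n):
    S = np.zeros((n, n))
    S[np.triu_indices(n)] = v
    return S + np.triu(S, 1).T             # mirror the strict upper part

S = np.array([[2.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 4.0]])
v = pack(S)                                # 6 entries for n = 3
```

For n = 3 this reduces the Riccati update from 9 to 6 scalar integrations.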
The computation of the innovation requires a solution of the model given in equation<br />
(4.3.3) over a time window of length d. This may also be done using an external routine,<br />
lsode, which already exists in Scilab source code. It is the simulator used both by Scilab<br />
<strong>and</strong> Scicos to cope with ordinary differential equations. Here again the machine.h header<br />
<strong>and</strong> a function call, identical to the ones described above, need to be included (for more<br />
information on how to use this routine see [73]). Unfortunately, unlike dmmul <strong>and</strong> dmmul1,<br />
the use of lsode is not supported by the real-time compiler. We instead implement a fourth<br />
order Runge-Kutta algorithm (see [104] for example).<br />
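A classical fixed-step fourth-order Runge-Kutta scheme is simple enough to implement by hand. The sketch below (in Python, whereas the real-time block implements it in C) checks one hundred steps on dx/dt = x, whose exact solution over one unit of time is e:

```python
import math

# One step of the classical fourth-order Runge-Kutta method.
def rk4_step(f, x, t, h):
    k1 = f(t, x)
    k2 = f(t + h / 2, x + h / 2 * k1)
    k3 = f(t + h / 2, x + h / 2 * k2)
    k4 = f(t + h, x + h * k3)
    return x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

x, t, h = 1.0, 0.0, 0.01
for _ in range(100):
    x = rk4_step(lambda t, x: x, x, t, h)
    t += h
```

The fixed step makes the cost per sampling period predictable, which matters for the hard real-time deadline.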
The C code of those two computational functions is given in Appendices C.3 <strong>and</strong> C.4.<br />
4.3.5 Experimental Results<br />
We implemented the adaptive <strong>high</strong>-<strong>gain</strong> <strong>extended</strong> <strong>Kalman</strong> <strong>filter</strong> along with<br />
− a Luenberger <strong>high</strong>-<strong>gain</strong> observer,<br />
− an <strong>extended</strong> <strong>Kalman</strong> <strong>filter</strong>,<br />
− <strong>and</strong> a <strong>high</strong>-<strong>gain</strong> <strong>extended</strong> <strong>Kalman</strong> <strong>filter</strong>.<br />
Several simulations were performed in order to find correct values for the high-gain parameters. The parameters were estimated at a low speed (ωr = 180 rad.s⁻¹) and the observers were tested according to the following scenario:
25 In the present case n = 3, which is rather low and allows us to reasonably develop the equations by hand. For systems of much higher dimension, or for multiple output systems, the use of computer algebra software such as Mathematica, Maple or Maxima [2] should be considered.
tel-00559107, version 1 - 24 Jan 2011
− the machine is fed 54 V for 30 seconds,
− at t = 30 s the input voltage is switched to 42 V for another 30 seconds,
− the voltage is then raised to 66 V and finally shut down after 30 seconds.
At each step, for a period of about 10 seconds, an undetermined frictional force is applied to the shaft of the motor (those perturbations are applied by hand, which explains why we operate at such low speeds).
Figure 4.27 shows the estimates provided by the Luenberger observer in open loop 26. The high-gain parameter was set to θ = 2.5 and the application was run with a sampling time of 0.001 seconds.
Because of the additional computations needed to dynamically solve the Riccati equation, the high-gain Kalman filter is often seen as a very slow observer. Compared to a Luenberger filter this is indeed the case; we thus ran the real-time task with a sampling time of 0.01 seconds. On the positive side, this observer is more efficient than a Luenberger observer when dealing with systems whose matrix A depends on the input variable u(t).
The results presented in Figure 4.28 show the estimation of the rotational speed computed by both an extended Kalman filter (θ = 1) and its high-gain counterpart (θ = 2.5). Since our perturbations are applied by hand (and are thus not reproducible), the only way we can display such information is to run the two observers in parallel. We nevertheless believe it is possible to tune the code of the extended Kalman filter to make it run efficiently with a sampling time of 0.001 seconds. As expected, the non high-gain filter reacts more slowly to the applied perturbations. As the measurement noise is not large, it is difficult to distinguish the (bad) influence of a large value of θ on the estimation; this influence can still be noticed in the time windows [5; 10] and [25; 30].
The adaptive-gain extended Kalman filter is more demanding in terms of computation time, even though this observer also runs with a sampling time of 0.01 seconds. Table 4.2 shows the values selected for this experiment.
The results of this last experiment are displayed in Figure 4.29. Because this run was done with no other observer in parallel, a comparison is not easy to make. Still, we can see that, when dealing with perturbations, the observer has a speed of convergence comparable to that of the high-gain extended Kalman filter and responds as smoothly as the extended Kalman filter.
If we take a look at Figure 4.30, we see that the parameter θ reacts 9 times during the experiment: once for every modification, plus one extra time corresponding to a bad initialisation. Those reactions correspond to:
− measured changes of the input voltage,
− non-measured changes in the load torque (i.e. perturbations).
It is natural for the gain to react in the second situation, but it should not in the first. That it does is due to modeling errors, which imply that the innovation may not vanish. This problem is solved by filtering the innovation 27 in the following way:
26 During perturbations the estimation is less precise but stays within a range of 10 to 15% of the real value. Remember that we had no sensor to measure the torque precisely when the model was calibrated.
27 The convergence result can be proven with such a filtering process. Once the time d* of Theorem 36 has been set, we have to design the filtering procedure in such a way that I does not vanish in a time less than d*.
Parameter   Value                   Role
Q           diag(1, 10⁻¹, 10⁻²)     Filtering
R           1                       Filtering
θ1          1.25                    High-gain
β           1664 π e                Adaptation*
m1          0.005                   Adaptation*
m2          0.004                   Adaptation
λ           100                     Adaptation*
∆T          0.01                    Adaptation*
d           0.1                     Innovation
δ           0.1                     Innovation*

Table 4.2: Final choice of parameters (*: application-free parameters).
$$\begin{cases} \dot{I}_f = \alpha \left( I - I_f \right) \\ I_{used} = I - I_f, \end{cases}$$

where α fixes the maximum time that θ will remain fixed at its maximum value.
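In discrete time, with sampling period dt, this filtering can be sketched with a forward-Euler update (the struct and function names are ours; this is an illustration of the principle, not the thesis code):

```c
/* First-order filtering of the innovation, discretized with forward Euler:
   dIf/dt = alpha * (I - If),   I_used = I - If.
   A constant offset in I (e.g. caused by modeling errors) is absorbed
   into If, so I_used reacts to changes of I but returns to zero. */
typedef struct {
    double If;     /* filtered innovation state */
    double alpha;  /* fixes how long theta may stay at its maximum */
} innov_filter_t;

double innov_filter_step(innov_filter_t *f, double I, double dt) {
    f->If += f->alpha * (I - f->If) * dt;  /* forward-Euler update */
    return I - f->If;                      /* innovation actually used */
}
```

A constant innovation offset is progressively absorbed into I_f, so the quantity actually fed to the adaptation reacts to changes but vanishes in steady state.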
4.4 Conclusions
In this chapter, the adaptive high-gain extended Kalman filter was completely defined. A sigmoid function allows us to take care of the influence of measurement noise on the final value of the innovation. We proposed a clear methodology to efficiently tune the parameters of the observer. Its performance was compared in simulation to those of a pure high-gain and a non high-gain observer.
In the second part of this chapter, the implementation of the observer on a simple process, in a hard real-time environment, was investigated in detail. The observer's behavior is as efficient as it is in simulations. The compliance with hard real-time constraints showed that the effective use of this algorithm is not a mathematician's midsummer night's dream.
Figure 4.27: Speed estimation using a high-gain extended Luenberger observer.
(+p): beginning of perturbation, (-p): end of perturbation, (Ic): change of the input.
Figure 4.28: Speed estimation using a high-gain extended Kalman filter.
(+p): beginning of perturbation, (-p): end of perturbation, (Ic): change of the input.
Figure 4.29: Speed estimation using the adaptive-gain extended Kalman filter.
(+p): beginning of perturbation, (-p): end of perturbation, (Ic): change of the input.
Figure 4.30: Evolution of θ.
Chapter 5
Complements
Contents
5.1 Multiple Inputs, Multiple Outputs Case . . . 93
    5.1.1 System Under Consideration . . . 93
    5.1.2 Definition of the Observer . . . 94
    5.1.3 Convergence and Proof . . . 95
        5.1.3.1 Lemma on Innovation . . . 96
        5.1.3.2 Preparation for the Proof . . . 98
        5.1.3.3 Intermediary Lemmas . . . 100
        5.1.3.4 Proof of the Theorem . . . 101
5.2 Continuous-discrete Framework . . . 103
    5.2.1 System Definition . . . 104
    5.2.2 Observer Definition . . . 105
    5.2.3 Convergence Result . . . 106
    5.2.4 Innovation . . . 107
    5.2.5 Preparation for the Proof . . . 108
    5.2.6 Proof of the Theorem . . . 110
This chapter contains several additional and complementary considerations related to adaptive high-gain observers. The first section provides some insight into the multiple outputs case; a generalization of the observer of Chapter 3 is also presented. In the second section, we develop an observer for continuous-discrete systems. For these two cases, we define the requirements necessary to achieve exponential convergence.
5.1 Multiple Inputs, Multiple Outputs Case
From a theoretical point of view, the multiple outputs case is harder to handle than the single output case. Indeed, there is no unique observability form (see [19, 20, 27, 51, 57, 63, 84] and the references therein). The multiple outputs case is also more complex in practice, since the various normal forms lead to different definitions of the observer. In the section below, we propose a generalization of the normal form (3.2) together with the definition of the corresponding observer. The proof of the convergence of this observer is given in Subsection 5.1.3.
Although in a more compact form than in Chapter 3, we keep the proof self-contained, which implies that we will repeat ourselves to some extent. The modifications of the proof which are specific to the multiple input/multiple output (MIMO) case are denoted by a thin vertical line in the left margin. The single output case, which was presented before, is included in this generalized version.
5.1.1 System Under Consideration
We focus on a blockwise generalization of the multiple input, single output form (3.2) of Chapter 3. A similar form has been used in the Ph.D. thesis of F. Viel [113] for a high-gain extended Kalman filter. Another choice has been made in [57] for an even-dimensional state vector.
− The state variable x(t) resides, as before, within a compact subset χ ⊂ R^n,
− The input variable u(t) resides within a subset U_adm ⊂ R^{n_u},
− The output vector y(t) is within R^{n_y}, where n_y ≥ 1.
The system is of the form:

$$\begin{cases} \dfrac{dx}{dt} = A(u)\, x + b(x, u) \\ y = C(u)\, x, \end{cases} \tag{5.1}$$

and the state variable is decomposed as

$$x(t) = \left( x_1(t), \ldots, x_{n_y}(t) \right)',$$

where for any i ∈ {1, ..., n_y}: x_i ∈ χ_i ⊂ R^{n_i}, with χ_i compact. Therefore $n = \sum_{i=1}^{n_y} n_i$, and each element x_i(t) is such that:

$$x_i(t) = \left( x_i^1(t), x_i^2(t), \ldots, x_i^{n_i}(t) \right)'.$$
The matrices A(u) and C(u) are given by:

$$A = \begin{pmatrix} A_1(u) & 0 & \cdots & 0 \\ 0 & A_2(u) & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & A_{n_y}(u) \end{pmatrix}, \qquad A_i(u) = \begin{pmatrix} 0 & \alpha_i^2 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & \alpha_i^{n_i} \\ 0 & \cdots & 0 & 0 \end{pmatrix},$$

and the matrix C(u) is the generalization:

$$C = \begin{pmatrix} C_1(u) & 0 & \cdots & 0 \\ 0 & C_2(u) & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & C_{n_y}(u) \end{pmatrix}, \qquad C_i = \begin{pmatrix} \alpha_i^1(u) & 0 & \cdots & 0 \end{pmatrix}.$$

Finally, the vector field b(x, u) is defined as:

$$b(x, u) = \begin{pmatrix} b_1(x, u) \\ b_2(x, u) \\ \vdots \\ b_{n_y}(x, u) \end{pmatrix}, \qquad b_i(x, u) = \begin{pmatrix} b_i^1(x_i^1, u) \\ b_i^2(x_i^1, x_i^2, u) \\ \vdots \\ b_i^{n_i}(x, u) \end{pmatrix}.$$
Remark 47
The very last component of each element b_i(·, ·) of the vector field b is allowed to depend on the full state. As one can see, the linear part is a Brunovsky observability form: this is clearly a generalization of system (3.2). Nevertheless, not every observable system can be transformed into this form.
5.1.2 Definition of the Observer
In order to preserve the convergence result, a few modifications are applied to the observer. There are mainly three points to consider:
− the definition of the innovation is adapted, rather trivially, to a multidimensional output space;
− the matrix ∆, generated with the parameter θ, is not the generalization one would initially imagine;
− and the definition of the matrix R_θ must remain compatible with the matrix ∆.
Let us first recall the equations of the observer:

$$\begin{cases} \dfrac{dz}{dt} = A(u)\, z + b(z, u) - S^{-1} C' R_\theta^{-1} \left( C z - y(t) \right) \\[4pt] \dfrac{dS}{dt} = -\left( A(u) + b^*(z, u) \right)' S - S \left( A(u) + b^*(z, u) \right) + C' R_\theta^{-1} C - S Q_\theta S \\[4pt] \dfrac{d\theta}{dt} = \mathcal{F}\left( \theta, I_d(t) \right), \end{cases}$$

where Q and R are symmetric positive definite matrices of dimensions (n × n) and (n_y × n_y) respectively. The innovation at time t is:

$$I_d(t) = \int_{t-d}^{t} \left\| y(s) - y\left( t - d, z(t - d), s \right) \right\|_{\mathbb{R}^{n_y}}^2 \, ds,$$

where:
− y(t − d, z(t − d), s) denotes the output of the system of Subsection 5.1.1 computed over the interval s ∈ [t − d; t] with z(t − d) as the initial state,
− y is the measured output, of dimension n_y.
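Numerically, the innovation can be approximated from the two sampled output trajectories by a simple quadrature over the window. A sketch for R taken as the identity (the function name and array layout are our own conventions, not thesis code):

```c
/* Trapezoidal approximation of the innovation
     Id(t) = integral over [t-d, t] of || y(s) - y(t-d, z(t-d), s) ||^2 ds,
   here with R = identity. Samples y[s*ny + j] and yhat[s*ny + j] are
   assumed uniformly spaced by dt over the window; illustrative only. */
double innovation_window(const double *y, const double *yhat,
                         int nsamples, int ny, double dt) {
    double acc = 0.0;
    for (int s = 0; s < nsamples; s++) {
        double e2 = 0.0;
        for (int j = 0; j < ny; j++) {
            double e = y[s * ny + j] - yhat[s * ny + j];
            e2 += e * e;             /* squared output mismatch at sample s */
        }
        double w = (s == 0 || s == nsamples - 1) ? 0.5 : 1.0;
        acc += w * e2 * dt;          /* trapezoidal weights */
    }
    return acc;
}
```

For a constant output mismatch the quadrature returns (window length) × (squared mismatch), which matches the definition above.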
Let us now denote by n* = max{n_1, n_2, ..., n_{n_y}} the size of the largest block, and define the matrix:

$$\Delta_i = \begin{pmatrix} 1/\theta^{n^* - n_i} & 0 & \cdots & 0 \\ 0 & 1/\theta^{n^* - n_i + 1} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & 1/\theta^{n^* - 1} \end{pmatrix}, \tag{5.2}$$

with ∆ given by diag(∆_1, ..., ∆_{n_y}). The definition of Q_θ is the same as before:

$$Q_\theta = \theta\, \Delta^{-1} Q\, \Delta^{-1}.$$
We need to provide a new definition for the matrix R_θ. To do so, we begin with the matrix:

$$\delta_\theta = \begin{pmatrix} \theta^{n^* - n_1} & 0 & \cdots & 0 \\ 0 & \theta^{n^* - n_2} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \theta^{n^* - n_{n_y}} \end{pmatrix},$$

and set:

$$R_\theta = \frac{1}{\theta}\, \delta_\theta\, R\, \delta_\theta.$$

The initial state of the observer is:
− z(0) ∈ χ ⊂ R^n,
− S(0) ∈ S_n(+), the set of symmetric positive definite matrices,
− θ(0) = 1.
Remark 48
This definition can also be used for single output systems, since (n_y = 1) ⇒ (n = n_1 and n* = n) ⇒ (n* − n_1 = 0). Therefore:
− δ_θ = 1,
− ∆ is the same as in Chapter 3, equation (3.3).
5.1.3 Convergence and Proof
The convergence theorem remains as presented in Chapter 3:

Theorem 49
For any time T* > 0 and any ε* > 0, there exist 0 < d < T* and a function F(θ, I_d) such that for any time t ≥ T*:

$$\left\| x(t) - z(t) \right\|^2 \le \varepsilon^* e^{-a (t - T^*)},$$

where a > 0 is a constant (independent of ε*).
The proof of convergence is quite long. For the sake of simplicity, we divide the proof into several parts:
1. the innovation lemma,
2. a study of the properties of the Riccati matrix S,
3. the calculation of some preliminary inequalities (i.e. the preparation for the proof),
4. the three technical lemmas,
5. and the main body of the proof.
Points 2, 4 and 5 are essentially the same as in the single output case. Only points 1 and 3 require modifications for the multiple output case. Those modifications are explained here.
5.1.3.1 Lemma on Innovation

Lemma 50 (Lemma on innovation, multiple output case)
Let x_1^0, x_2^0 ∈ R^n and u ∈ U_adm. Let us consider the outputs y(0, x_1^0, ·) and y(0, x_2^0, ·) of system (5.1) with initial conditions x_1^0 and x_2^0 respectively. Then the following property (called persistent observability) holds:

∀d > 0, ∃λ_d^0 > 0 such that ∀u ∈ L_b^1(U_adm),

$$\left\| x_1^0 - x_2^0 \right\|^2 \le \frac{1}{\lambda_d^0} \int_0^d \left\| y\left(0, x_1^0, \tau\right) - y\left(0, x_2^0, \tau\right) \right\|_{\mathbb{R}^{n_y}}^2 \, d\tau. \tag{5.3}$$

Proof.
Let x_1(t) = x(x_1^0, u, t) and x_2(t) = x(x_2^0, u, t) be the solutions of (5.1) with x_i(0) = x_i^0, i = 1, 2. A few computations lead to

$$b(x_2, u) - b(x_1, u) = B(t)\, (x_2 - x_1),$$

where

$$B(t) = \int_0^1 \frac{\partial b}{\partial x}\left( \alpha x_2 + (1 - \alpha) x_1, u \right) d\alpha.$$

Set ε = x_1 − x_2 and consider the system:

$$\begin{cases} \dot{\varepsilon} = \left[ A(u) + B(t) \right] \varepsilon \\ y_\varepsilon = C(u)\, \varepsilon. \end{cases}$$

It is uniformly observable 1 due to the structure of B(t). We define the observability Gramian G_d using Ψ(t), the resolvent 2 of this system:

$$G_d = \int_0^d \Psi(v)'\, C' C\, \Psi(v)\, dv.$$

1 See [57], for example. Alternatively, one could compute the observability matrix
$$\phi_O = \left( C' \mid (CA)' \mid \ldots \mid (CA^n)' \right)'$$
and check the full rank condition for any input.
2 Cf. Appendix A.1.
Since ∥B(t)∥ ≤ L_b, each b_{i,j}(t) can be considered as a bounded element of L^∞_{[0,d]}(R). We identify (L^∞_{[0,d]}(R))^p with L^∞_{[0,d]}(R^p), where 3

$$p = \sum_{i=1}^{n_y} \frac{n_i (n_i - 1)}{2} + n_y n.$$

We consider the function:

$$\begin{array}{rcl} \Lambda : L^\infty_{[0,d]}(\mathbb{R}^p) \times L^\infty_{[0,d]}(\mathbb{R}^{n_u}) & \longrightarrow & \mathbb{R}^+ \\ \left( (b_{i,j})_{j \le i},\, u \right) & \longmapsto & \lambda_{min}(G_d), \end{array}$$

where λ_min(G_d) is the smallest eigenvalue of G_d. Let us endow L^∞_{[0,d]}(R^p) × L^∞_{[0,d]}(R^{n_u}) with the weak-* topology 4, and R with the topology induced by uniform convergence. The weak-* topology on a bounded set implies uniform continuity of the resolvent, hence Λ is continuous 5.

Since the control variables are supposed to be bounded,

$$\Omega_1 = \left\{ B \in L^\infty_{[0,d]}\left( \mathbb{R}^{\frac{n(n+1)}{2}} \right);\ \|B\| \le L_b \right\} \quad \text{and} \quad \Omega_2 = \left\{ u \in L^\infty_{[0,d]}(\mathbb{R}^{n_u});\ \|u\| \le M_u \right\}$$

are compact subsets. Therefore Λ(Ω_1 × Ω_2) is a compact subset of R which does not contain 0, since the system is observable for any input. Thus G_d is never singular. Moreover, for M_u sufficiently large, {u ∈ L^∞_{[0,d]}(R^{n_u}); ∥u∥ ≤ M_u} includes L^∞_{[0,d]}(U_adm).

Hence, there exists λ_d^0 such that G_d ≥ λ_d^0 Id for any u and any matrix B(t) as above. We conclude that

$$\int_0^d \left\| y\left(0, x_1^0, \tau\right) - y\left(0, x_2^0, \tau\right) \right\|^2 d\tau \ \ge\ \lambda_d^0 \left\| x_1^0 - x_2^0 \right\|^2. \tag{5.4}$$

3 The matrix B(t) can be divided into n_y parts of dimensions (n_i × n), i ∈ {1, ..., n_y}. For each of those parts, the last line may be full (i.e. derivation of the vector field elements of the form b_i^{n_i}(x, u) with respect to the state x). This gives a maximum of n_y n elements. For each of the n_y parts, the lower triangular part of a square block of dimension (n_i − 1 × n_i − 1) may contain nonzero elements, that is n_i(n_i − 1)/2 elements. Therefore the maximum number of nonzero elements of the matrix B(t) is Σ_{i=1}^{n_y} n_i(n_i − 1)/2 + n_y n.
4 The definition of the weak-* topology is given in Appendix A.
5 This property is explained in Appendix A.
5.1.3.2 Preparation for the Proof
First, we remind the reader of the change of variables that we performed in Subsection 3.5:
− ε = x − z is the estimation error,
− x̃ = ∆x,
− b̃(·, u) = ∆ b(∆⁻¹ ·, u),
− b̃*(·, u) = ∆ b*(∆⁻¹ ·, u) ∆⁻¹.
The definition of the matrix ∆ gives us the following property.

Lemma 51
1. The vector field b̃(x̃, u) has the same Lipschitz constant as b(x, u).
2. The matrix b̃*(x̃, u) has the same bound as the Jacobian of b(x, u).

Remark 52
This lemma is valid both for the definition of Chapter 3 and for the definition of Section 5.1.2 above. The proof is given in [57], p. 215. We reproduce the proof for the multiple output case since it allows us to justify the definition of ∆.
Proof.
Recall that θ(t) ≥ 1.
1. Consider a component of b̃(·, u) of the form b̃_i^k(·, u) with i ∈ {1, ..., n_y} and k ∈ {1, ..., n_i − 1}. From the change of variables b̃(·, u) = ∆ b(∆⁻¹ ·, u) we have:

$$\tilde{b}_i^k(x, u) = \frac{1}{\theta^{n^* - n_i + k - 1}}\; b_i^k\!\left( \theta^{n^* - n_i} x_i^1,\ \theta^{n^* - n_i + 1} x_i^2,\ \ldots,\ \theta^{n^* - n_i + k - 1} x_i^k,\ u \right).$$

Denoting by L_b the Lipschitz constant of b(·, u) with respect to the variable x:

$$\begin{aligned} \left\| \tilde{b}_i^k(x, u) - \tilde{b}_i^k(z, u) \right\| &= \frac{1}{\theta^{n^* - n_i + k - 1}} \left\| b\!\left( \theta^{n^* - n_i} x_i^1, \ldots, \theta^{n^* - n_i + k - 1} x_i^k, u \right) - b\!\left( \theta^{n^* - n_i} z_i^1, \ldots, \theta^{n^* - n_i + k - 1} z_i^k, u \right) \right\| \\ &\le \frac{L_b}{\theta^{n^* - n_i + k - 1}} \left\| \left( \theta^{n^* - n_i} x_i^1, \ldots, \theta^{n^* - n_i + k - 1} x_i^k \right) - \left( \theta^{n^* - n_i} z_i^1, \ldots, \theta^{n^* - n_i + k - 1} z_i^k \right) \right\| \\ &\le \frac{L_b\, \theta^{n^* - n_i + k - 1}}{\theta^{n^* - n_i + k - 1}} \left\| \left( x_i^1, \ldots, x_i^k \right) - \left( z_i^1, \ldots, z_i^k \right) \right\| = L_b \left\| \left( x_i^1, \ldots, x_i^k \right) - \left( z_i^1, \ldots, z_i^k \right) \right\|. \end{aligned} \tag{5.5}$$

Therefore the Lipschitz constant of b(·, ·) is the same in the two coordinate systems.
Consider now an element of the form b̃_i^{n_i}(·, u). Such an element can be a function of the full state. First of all we note that, since n* = max_{i ∈ {1,...,n_y}} n_i, we have:

$$\left\| \Delta^{-1} x - \Delta^{-1} z \right\| \le \theta^{n^* - 1} \left\| x - z \right\|.$$
The matrix ∆ has been defined such that, for all i ∈ {1, ..., n_y}:

$$\tilde{b}_i^{n_i}(\cdot, u) = \frac{1}{\theta^{n^* - 1}}\, b_i^{n_i}(\cdot, u).$$

This implies that, for all i ∈ {1, ..., n_y}:

$$\left\| \tilde{b}^{n_i}(\tilde{x}, u) - \tilde{b}^{n_i}(\tilde{z}, u) \right\| \le L_b \left\| x - z \right\|,$$

proving the first part of the lemma.
2. The situation is simpler for the Jacobian matrix b̃*(·, u). Consider any element denoted b̃*_{(i,j)}(z̃). From the definition of the change of variables, there exist ĩ ∈ N and j̃ ∈ N (i.e. they can be equal to one) such that:

$$\tilde{b}^*_{(i,j)}(\tilde{z}) = \frac{1}{\theta^{\tilde{\imath}}}\; b^*_{(i,j)}(\tilde{z})\; \theta^{\tilde{\jmath}},$$

with ĩ ≥ j̃ (otherwise the element b*_{(i,j)} = 0 because of the structure of b(x, u)). Then

$$\left\| \tilde{b}^*_{(i,j)}(\tilde{z}) \right\| \le \theta^{\tilde{\jmath} - \tilde{\imath}} \left\| b^*_{(i,j)}(\tilde{z}) \right\| \le \left\| b^*_{(i,j)}(\tilde{z}) \right\| \le L_b,$$

proving the lemma.
Remark 53
When ∆ is defined in a different way, the Lipschitz constant of b̃ isn't L_b; the constant actually depends on the value of θ. This leads to an inconsistent proof, since L_b is used in the definition of θ_1 in the proof of Theorem 49.
When an observability form distinct from (5.1) is used, ∆ has to be redefined in such a way that condition (5.5) is satisfied.
We have the following set of identities:

$$\begin{array}{ll} (a)\ \Delta A = \theta A \Delta, & (b)\ A' \Delta = \theta \Delta A', \\ (c)\ A \Delta^{-1} = \theta \Delta^{-1} A, & (d)\ \Delta^{-1} A' = \theta A' \Delta^{-1}, \\ (e)\ \dfrac{d}{dt}(\Delta) = -\dfrac{\mathcal{F}(\theta, I)}{\theta}\, N \Delta, & (f)\ \dfrac{d}{dt}\left( \Delta^{-1} \right) = \dfrac{\mathcal{F}(\theta, I)}{\theta}\, N \Delta^{-1}, \\ \multicolumn{2}{l}{(g)\ \Delta^{-1} C' R_\theta^{-1} C \Delta^{-1} = \Delta^{-1} C' \left( \tfrac{1}{\theta}\, \delta_\theta R\, \delta_\theta \right)^{-1} C \Delta^{-1} = \theta\, C' R^{-1} C.} \end{array} \tag{5.6}$$

The matrix N is defined by

$$N = \begin{pmatrix} N_1 & 0 & \cdots & 0 \\ 0 & N_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & N_{n_y} \end{pmatrix}, \qquad N_i = \begin{pmatrix} n^* - n_i & 0 & \cdots & 0 \\ 0 & n^* - n_i + 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & n^* - 1 \end{pmatrix}. \tag{5.7}$$
The dynamics of the error after the change of coordinates are given by:

$$\frac{d\tilde{\varepsilon}}{dt} = \theta \left( -\frac{\mathcal{F}(\theta, I)}{\theta^2}\, N \tilde{\varepsilon} + A \tilde{\varepsilon} - \tilde{S}^{-1} C' R^{-1} C \tilde{\varepsilon} + \frac{1}{\theta} \left( \tilde{b}(\tilde{z}, u) - \tilde{b}(\tilde{x}, u) \right) \right), \tag{5.8}$$

and the Riccati equation becomes

$$\frac{d\tilde{S}}{dt} = \theta \left( \frac{\mathcal{F}(\theta, I)}{\theta^2} \left( N \tilde{S} + \tilde{S} N \right) - \left( A' \tilde{S} + \tilde{S} A \right) + C' R^{-1} C - \tilde{S} Q \tilde{S} - \frac{1}{\theta}\, \tilde{S}\, \tilde{b}^*(\tilde{z}, u) - \frac{1}{\theta}\, \tilde{b}^{*\prime}(\tilde{z}, u)\, \tilde{S} \right). \tag{5.9}$$

These two equations are used to compute the derivative of ε̃'S̃ε̃:

$$\frac{d}{dt}\left( \tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon} \right) \le -\theta q_m\, \tilde{\varepsilon}' \tilde{S}^2 \tilde{\varepsilon} + 2\, \tilde{\varepsilon}' \tilde{S} \left( \tilde{b}(\tilde{z}, u) - \tilde{b}(\tilde{x}, u) - \tilde{b}^*(\tilde{z}, u)\, \tilde{\varepsilon} \right). \tag{5.10}$$

The proof of the theorem comes from the stability analysis of this last inequality. This analysis requires the use of four lemmas, which we state below.
5.1.3.3 Intermediary Lemmas
The proofs of Lemmas 54 and 57 can be found in Sections 3.6 and 3.7 respectively. Lemmas 55 and 56 are proven in Appendix B.2.
Lemma 54 (Bounds for the Riccati equation)
Let us consider the Riccati equation (5.9). We suppose that:
− the functions a_i(u(t)), |b̃*_{i,j}(z, u)| and |F(θ, I)/θ²| are smaller than a_M > 0, and a_i(u(t)) > a_m > 0,
− S(0) = S_0 is symmetric positive definite, taken in a compact set of the form a Id ≤ S_0 ≤ b Id, and
− θ(0) = 1.
Then there exist two constants 0 < α < β such that α Id ≤ S̃(t) ≤ β Id for all t ≥ 0.

Lemma 55 (Technical lemma one)
Let {x(t) > 0, t ≥ 0} ⊂ R be absolutely continuous and satisfy:

$$\frac{dx(t)}{dt} \le -k_1 x + k_2\, x \sqrt{x},$$

for almost all t > 0, with k_1, k_2 > 0. Then, if $x(0) < \frac{k_1^2}{4 k_2^2}$, we have $x(t) \le 4\, x(0)\, e^{-k_1 t}$.

Lemma 56 (Technical lemma two)
Consider b̃(z̃) − b̃(x̃) − b̃*(z̃)ε̃ as in inequality (3.12) (omitting to write u in b̃) and suppose θ ≥ 1. Then

$$\left\| \tilde{b}(\tilde{z}) - \tilde{b}(\tilde{x}) - \tilde{b}^*(\tilde{z})\, \tilde{\varepsilon} \right\| \le K \theta^{n-1} \left\| \tilde{\varepsilon} \right\|^2,$$

for some K > 0.
Lemma 57 (Adaptation function)
For any ∆T > 0, there exists a positive constant M(∆T) such that:
− for any θ_1 > 1, and
− any γ_1 > γ_0 > 0,
there is a function F(θ, I) such that the equation

$$\dot{\theta} = \mathcal{F}\left( \theta, I(t) \right), \tag{5.11}$$

for any initial value 1 ≤ θ(0) < 2θ_1 and any measurable positive function I(t), has the properties:
1. (5.11) has a unique solution θ(t) defined for all t ≥ 0, and this solution satisfies 1 ≤ θ(t) < 2θ_1,
2. |F(θ, I)/θ²| ≤ M,
3. if I(t) ≥ γ_1 for t ∈ [τ, τ + ∆T], then θ(τ + ∆T) ≥ θ_1,
4. while I(t) ≤ γ_0, θ(t) decreases to 1.
5.1.3.4 Proof of the Theorem
First of all, let us choose a time horizon d (in I_d(t)) and a time T such that 0 < d < T < T*. Let F be a function as in Lemma 57 with ∆T = T − d, and M such that fact 2 of Lemma 57 is true. Let α and β be the bounds from Lemma 54.
From the preparation of the proof, inequality (5.10) can be written, using Lemma 54 (i.e. using S̃ ≥ α Id):

$$\frac{d\, \tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}}{dt}(t) \le -\alpha q_m \theta\, \tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}(t) + 2\, \tilde{\varepsilon}' \tilde{S} \left( \tilde{b}(\tilde{z}) - \tilde{b}(\tilde{x}) - \tilde{b}^*(\tilde{z})\, \tilde{\varepsilon} \right). \tag{5.12}$$

From (5.12) we can deduce two inequalities. The first one, global, will be used mainly when ε̃'S̃ε̃(t) is not in a neighborhood of 0 and θ is large. The second one, local, will be used when ε̃'S̃ε̃(t) is small, whatever the value of θ.
Using

$$\left\| \tilde{b}(\tilde{z}) - \tilde{b}(\tilde{x}) - \tilde{b}^*(\tilde{z})\, \tilde{\varepsilon} \right\| \le 2 L_b \left\| \tilde{\varepsilon} \right\|,$$

together with α Id ≤ S̃ ≤ β Id (Lemma 54), (5.12) becomes the "global inequality"

$$\frac{d\, \tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}}{dt}(t) \le \left( -\alpha q_m \theta + 4 \frac{\beta}{\alpha} L_b \right) \tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}(t). \tag{5.13}$$

Thanks to Lemma 56 we get the "local inequality" as follows:

$$\left\| \tilde{b}(\tilde{z}) - \tilde{b}(\tilde{x}) - \tilde{b}^*(\tilde{z})\, \tilde{\varepsilon} \right\| \le K \theta^{n-1} \left\| \tilde{\varepsilon} \right\|^2.$$
Since 1 ≤ θ ≤ 2θ_1, inequality (5.12) implies

$$\frac{d\, \tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}}{dt}(t) \le -\alpha q_m\, \tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}(t) + 2 K (2\theta_1)^{n-1} \left\| \tilde{S} \right\| \left\| \tilde{\varepsilon} \right\|^3.$$

Since $\|\tilde{\varepsilon}\|^3 = \left( \|\tilde{\varepsilon}\|^2 \right)^{3/2} \le \left( \frac{1}{\alpha}\, \tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}(t) \right)^{3/2}$, it becomes

$$\frac{d\, \tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}}{dt}(t) \le -\alpha q_m\, \tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}(t) + \frac{2 K (2\theta_1)^{n-1} \beta}{\alpha^{3/2}} \left( \tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}(t) \right)^{3/2}. \tag{5.14}$$

Let us apply 6 Lemma 55, which states that if

$$\tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}(\tau) \le \frac{\alpha^5 q_m^2}{16\, K^2\, (2\theta_1)^{2n-2}\, \beta^2},$$

then, for any t ≥ τ,

$$\tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}(t) \le 4\, \tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}(\tau)\, e^{-\alpha q_m (t - \tau)}.$$

Provided there exists a real γ such that

$$\gamma \le \frac{1}{(2\theta_1)^{2n-2}} \min\left( \frac{\alpha \varepsilon^*}{4},\ \frac{\alpha^5 q_m^2}{16 K^2 \beta^2} \right), \tag{5.15}$$

then ε̃'S̃ε̃(τ) ≤ γ implies, for any t ≥ τ,

$$\tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}(t) \le \frac{\alpha \varepsilon^*}{(2\theta_1)^{2n-2}}\, e^{-\alpha q_m (t - \tau)}. \tag{5.16}$$

From (5.13),

$$\tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}(T) \le \tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}(0)\, e^{\left( -\alpha q_m + 4 \frac{\beta}{\alpha} L_b \right) T},$$

and if we suppose θ ≥ θ_1 for t ∈ [T, T*], T* > T, using (5.13) again:

$$\tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}(T^*) \le \tilde{\varepsilon}' \tilde{S} \tilde{\varepsilon}(0)\, e^{\left( -\alpha q_m + 4 \frac{\beta}{\alpha} L_b \right) T}\, e^{\left( -\alpha q_m \theta_1 + 4 \frac{\beta}{\alpha} L_b \right)(T^* - T)} \le M_0\, e^{-\alpha q_m T}\, e^{4 \frac{\beta}{\alpha} L_b T^*}\, e^{-\alpha q_m \theta_1 (T^* - T)}, \tag{5.17}$$

where

$$M_0 = \sup_{x, z \in \chi} \varepsilon' S \varepsilon\, (0). \tag{5.18}$$

Now, we choose θ_1 and γ such that

$$M_0\, e^{-\alpha q_m T}\, e^{4 \frac{\beta}{\alpha} L_b T^*}\, e^{-\alpha q_m \theta_1 (T^* - T)} \le \gamma \tag{5.19}$$

and (5.15) are satisfied simultaneously, which is possible since $e^{-cte \times \theta_1} < \frac{cte}{\theta_1^{2n-2}}$ for θ_1 sufficiently large. Let us choose a function F as in Lemma 57 with ∆T = T − d and $\gamma_1 = \frac{\lambda_d^0\, \gamma}{\beta}$.

6 This lemma cannot be applied if we use Q_θ and R instead of Q_θ and R_θ in the definition of the observer, as in [38]. This is due to the presence of a F/θ term that prevents the parameters k_1 and k_2 from being positive for all times.
tel-00559107, version 1 - 24 Jan 2011<br />
We claim that there exists $\tau \le T^*$ such that $\tilde\varepsilon'\tilde S\tilde\varepsilon(\tau) \le \gamma$. Indeed, if $\tilde\varepsilon'\tilde S\tilde\varepsilon(\tau) > \gamma$ for all $\tau \le T^*$, then because of Lemma 50:

$$\gamma < \tilde\varepsilon'\tilde S\tilde\varepsilon(\tau) \le \beta\left\|\tilde\varepsilon(\tau)\right\|^2 \le \beta\left\|\varepsilon(\tau)\right\|^2 \le \frac{\beta}{\lambda_d^0}\, I_d(\tau + d).$$

Therefore $I_d(\tau + d) \ge \gamma_1$ for $\tau \in [0, T^*]$, and hence $I_d(\tau) \ge \gamma_1$ for $\tau \in [d, T^*]$, so we have $\theta(t) \ge \theta_1$ for $t \in [T, T^*]$, which results in a contradiction ($\tilde\varepsilon'\tilde S\tilde\varepsilon(T^*) \le \gamma$) because of (5.17) and (5.19).

Finally, for $t \ge \tau$, using (5.16),

$$\left\|\varepsilon(t)\right\|^2 \le (2\theta_1)^{2n-2}\left\|\tilde\varepsilon(t)\right\|^2 \le \frac{(2\theta_1)^{2n-2}}{\alpha}\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t) \le \varepsilon^* e^{-\alpha q_m(t-\tau)}, \tag{5.20}$$

and the theorem is proven.
Remark 58<br />
Just as in the single output case, an alternative result can be derived. Refer to Remark<br />
44, Chapter 3.<br />
5.2 Continuous-discrete Framework<br />
In the present section we develop an adaptation of the observer to continuous-discrete systems. In such systems, the evolution of the state variables is described by a continuous process, and the measurement part of the system by a discrete function. This version is the one to use when the quasi-continuity assumption on the measurements, which we relied on before, does not hold.
In engineering sciences, when purely discrete filters are used, the discrete model needed is sometimes obtained from a continuous formulation. The model

$$\dot x = f(x(t), u(t), t)$$

is transformed into

$$x(t^* + \delta t) = x(t^*) + f\left(x(t^*), u(t^*), t^*\right)\delta t,$$

where $\delta t$ represents the sampling time of the process. This equation is nothing more than Euler's numerical integration method. In other words, this modeling technique is a special case of the continuous-discrete framework. The mechanization equation of inertial navigation systems⁷ is an example of such modeling, details of which can be found in [10, 114] for example.
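This Euler step can be sketched in a few lines of code; the dynamics, the input and the sampling time below are illustrative placeholders, not a model from this work.

```python
import numpy as np

# Explicit Euler step: the discrete model obtained from a continuous formulation,
# x(t* + dt) = x(t*) + f(x(t*), u(t*), t*) * dt.
def euler_step(f, x, u, t, dt):
    return x + f(x, u, t) * dt

# Toy dynamics (illustrative choice): dx/dt = -x + u, with constant input u = 1.
f = lambda x, u, t: -x + u

x, t, dt = 0.0, 0.0, 1e-3
for _ in range(1000):            # integrate over [0, 1] with sampling time dt
    x = euler_step(f, x, 1.0, t, dt)
    t += dt

exact = 1.0 - np.exp(-1.0)       # closed-form solution 1 - e^{-t} at t = 1
print(abs(x - exact))            # discretization error, O(dt)
```

The discrepancy with the closed-form solution shrinks linearly with $\delta t$, as expected of a first-order scheme.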
Although we directly consider a single output continuous-discrete normal form, we want<br />
to draw the reader’s attention to the problem of the preservation of observability under sampling.<br />
In [15], the authors prove that for a continuously observable 8 system, observability is<br />
7 The mechanization process merges the data coming from 3-axis accelerometers with those coming from a gyroscope.
8<br />
Uniformly observable for a class of input functions, <strong>and</strong> uniformly infinitesimally observable in the sense<br />
of the definitions given in Chapter 2.<br />
preserved provided that the sampling time $\delta t$ is small enough (both examples and counter-examples can be found in this article; see also [14] on the same topic).
Let us state the main theorem of [15].<br />
Theorem 59<br />
Assume that a nonlinear system is observable for every input $u(\cdot)$ and uniformly infinitesimally observable⁹. Then for all $M > 0$ there exists $\delta_0 > 0$ such that the associated $\delta$-sampled system is observable for all $\delta \le \delta_0$ and all $M$, $D$-bounded inputs $u^\delta$.
5.2.1 System Definition<br />
Let us consider the continuous-discrete version of the multiple input, single output system of<br />
Chapter 3 (Equation 3.2):<br />
$$\begin{cases}
\dfrac{dx}{dt} = A(u(t))\,x + b(x(t), u(t))\\[4pt]
y_k = C x_k
\end{cases} \tag{5.21}$$

where

- $\delta t$ is the constant sampling time of the measurement procedure,
- $x(t) \in \chi \subset \mathbb R^n$, $\chi$ compact, and $x_k = x(k\delta t)$, $k \in \mathbb N$,
- $u(t) \in U_{adm} \subset \mathbb R^{n_u}$ bounded, and $u_k = u(k\delta t)$, $k \in \mathbb N$,
- $y(t) \in \mathbb R$, and $y_k = y(k\delta t)$, $k \in \mathbb N$.
The matrices $A(u)$ and $C$ are defined by:

$$A(u) = \begin{pmatrix}
0 & a_2(u) & 0 & \cdots & 0\\
\vdots & 0 & a_3(u) & \ddots & \vdots\\
 & & \ddots & \ddots & 0\\
 & & & 0 & a_n(u)\\
0 & \cdots & & \cdots & 0
\end{pmatrix},
\qquad
C = \begin{pmatrix}1 & 0 & \cdots & 0\end{pmatrix},$$

with $0 < a_m \le a_i(u) \le a_M$.
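For concreteness, this structure can be instantiated for $n = 3$; the functions $a_2$, $a_3$ below are illustrative placeholders satisfying the bounds $0 < a_m \le a_i(u) \le a_M$.

```python
import numpy as np

# Structure of A(u) and C for n = 3; a_2, a_3 are illustrative placeholders
# with a_m = 0.5 and a_M = 2.
n = 3
a = {2: lambda u: 1.0 + 0.5 * np.sin(u),   # a_2(u) ranges over [0.5, 1.5]
     3: lambda u: 2.0}                     # a_3(u) constant

def A(u):
    M = np.zeros((n, n))
    for i in range(2, n + 1):
        M[i - 2, i - 1] = a[i](u)          # a_i(u) sits on the superdiagonal
    return M

C = np.array([[1.0, 0.0, 0.0]])
print(A(0.0))
```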
Remark 60<br />
Notice that the matrix $C$ is defined differently from the way it was in the previous systems; details are given in Appendix B, Remark 94 in particular.
5.2.2 Observer Definition<br />
In the continuous-discrete setting the observer is defined by:<br />
1. a set of prediction equations for t ∈ [(k − 1)δt,kδt[,<br />
2. a set of correction equations at times t = kδt.<br />
In the following we use the notations:<br />
− z(t) is the estimated state for all t ∈](k − 1)δt,kδt[,<br />
− zk(−) is the estimated state at the end of a prediction period,<br />
− zk(+) is the estimated state after a correction step (i.e. at the beginning of a new<br />
prediction period).<br />
The prediction equations for $t \in [k\delta t, (k+1)\delta t[$, with initial values $z_{k-1}(+)$ and $S_{k-1}(+)$, are

$$\begin{cases}
\dot z = A(u)z + b(z, u)\\
\dot S = -\left(A(u) + b^*(z, u)\right)' S - S\left(A(u) + b^*(z, u)\right) - S Q_\theta S
\end{cases} \tag{5.22}$$

where $S_0$ is a symmetric positive definite matrix taken inside a compact subset of the form $a\,Id \le S_0 \le b\,Id$.
The correction equations are:

$$\begin{cases}
z_k(+) = z_k(-) - S_k(+)^{-1} C'\, r_\theta^{-1}\,\delta t\left(C z_k(-) - y_k\right)\\
S_k(+) = S_k(-) + C'\, r_\theta^{-1}\, C\,\delta t\\
I_{k,d} = \displaystyle\sum_{i=0}^{i=d}\left\|y_{k-i} - \hat y_{k-i}\right\|^2\\
\theta_k = \mathcal F\left(\theta_{k-1}, I_{k,d}\right)
\end{cases} \tag{5.23}$$

where

- $x_0$ and $z_0$ belong to $\chi$, a compact subset of $\mathbb R^n$,
- $\theta(0) = \theta_0 = 1$,
- $r$ and $Q$ are symmetric positive definite matrices¹⁰:
  - $Q_\theta = \theta\,\Delta^{-1} Q\,\Delta^{-1}$,
  - $r_\theta = \frac{1}{\theta}\, r$,

10 $r$ is not written in capital letters, to emphasize the fact that the system is single output.
- with $\Delta = \mathrm{diag}\left\{1, \frac{1}{\theta}, \ldots, \frac{1}{\theta^{n-1}}\right\}$,
- the innovation $I_{k,d}$ ($d \in \mathbb N^*$) is computed over the time window $[(k-d)\delta t;\ k\delta t]$, where:
  - $y_{k-i}$ denotes the measurement at epoch $k-i$, and
  - $\hat y_{k-i}$ denotes the output of system (5.21) at epoch $k-i$, with $z((k-d)\delta t)$ as initial value at epoch $k-d$.
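The prediction/correction cycle (5.22)-(5.23) can be sketched numerically. The sketch below uses a linear two-dimensional toy model ($b \equiv 0$), a crude Euler integration of the Riccati equation, an instantaneous surrogate for the windowed innovation $I_{k,d}$, and placeholder values for $Q$, $r$, $\delta t$ and the adaptation function $\mathcal F$; none of these choices come from the present chapter.

```python
import numpy as np

# Sketch of the continuous-discrete cycle (5.22)-(5.23) on a linear toy model
# (b = 0, n = 2). Q, r, dt and the adaptation rule F are illustrative
# placeholders; innov below is an instantaneous surrogate for I_{k,d}.
n, dt, r = 2, 0.01, 0.1
A = np.array([[0.0, 1.0], [0.0, 0.0]])
C = np.array([[1.0, 0.0]])
Q = np.eye(n)

def predict(z, S, theta, substeps=10):
    Dinv = np.diag([1.0, theta])              # Delta^{-1} for n = 2
    Qth = theta * Dinv @ Q @ Dinv             # Q_theta = theta Delta^-1 Q Delta^-1
    h = dt / substeps
    for _ in range(substeps):                 # Euler integration of (5.22)
        z = z + h * (A @ z)
        S = S + h * (-A.T @ S - S @ A - S @ Qth @ S)
    return z, S

def correct(z, S, y, theta):
    S = S + (theta / r) * dt * (C.T @ C)      # S_k(+) as in (5.23)
    resid = float(C @ z) - y
    z = z - (theta / r) * dt * resid * np.linalg.solve(S, C.T).ravel()
    return z, S

def F(theta, innov, up=1.5, down=0.9, theta_max=4.0, gamma=1e-4):
    # Placeholder adaptation: raise theta on large innovation, relax toward 1.
    return min(theta_max, up * theta) if innov > gamma else 1.0 + down * (theta - 1.0)

x = np.array([1.0, -0.5])                     # true state
z, S, theta = np.zeros(n), np.eye(n), 1.0
for k in range(400):
    x = x + dt * (A @ x)                      # true dynamics, same Euler step
    z, S = predict(z, S, theta)
    y = float(C @ x)                          # noise-free sample at t = k dt
    innov = (float(C @ z) - y) ** 2
    z, S = correct(z, S, y, theta)
    theta = F(theta, innov)

print(np.linalg.norm(z - x))                  # estimation error after 4 s
```

With these placeholder settings the estimate approaches the true trajectory while $\theta$ first rises on the large initial innovation and then relaxes toward 1.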
Remark 61<br />
In [24] we presented a different version of this observer. In this paper, the adaptation of<br />
the <strong>high</strong>-<strong>gain</strong> parameter was determined during the prediction steps via a differential equation.<br />
We changed our strategy with respect to the peculiar manner in which the continuous discrete<br />
version of the observer works.<br />
Recall that the estimation is a sequence of continuous prediction periods followed by discrete<br />
correction steps when a new measurement/observation is available. Consequently, since<br />
innovation is based on the measurements available, it is computed at the correction steps.<br />
Suppose now that θ is adapted via a differential equation. It starts to change during the<br />
prediction step following the computation of a large innovation value, <strong>and</strong> reaches θ1 after<br />
some time. If alternatively θ is adapted at the end of the correction step, directly after the<br />
computation of innovation, then it reaches θ1 much faster.<br />
We opt for the strategy in which θ is adapted at the end of the correction step 11 .<br />
An advantage brought about by this approach is that we may now remove one of the assumptions on the adaptation function: "there exists $M$ such that $\left|\frac{\mathcal F}{\theta^2}\right| \le M$".

Theorem 62
For any $T^* > 0$ and any $\varepsilon^* > 0$, there exist

- two real constants $\mu$ and $\theta_1$,
- $d \in \mathbb N^*$ with $d \ge n - 1$, and
- an adaptation function $\mathcal F(\theta, I)$,

such that, for all $\delta t$ sufficiently small (i.e. $2\theta_1\delta t <$ [...]
- let $\theta$ increase when innovation is large,
- have $\theta$ decrease toward 1 when innovation is small.

In the continuous-discrete setting, an extra hypothesis, related to the sampling time, appears (see Lemma 63 below). As a consequence, the beginning of the proof of Subsection 5.2.6 is carried out independently of $\delta t$.
5.2.4 Innovation<br />
The lemma for innovation in the continuous-discrete case is:<br />
Lemma 63
Let $x^0, \xi^0 \in \mathbb R^n$ and $u \in U_{adm}$. Let us consider the outputs $y_j\left(0, x^0\right)$ and $y_j\left(0, \xi^0\right)$ of system (5.21) with initial conditions $x^0$ and $\xi^0$ respectively. Then the following condition (called persistent observability) holds: for all $d \in \mathbb N^*$ large enough (i.e. $d \ge n-1$), there exists $\lambda_d^0 > 0$ such that for all $u \in L_b^1(U_{adm})$

$$\left\|x^0 - \xi^0\right\|^2 \le \frac{1}{\lambda_d^0}\sum_{i=0}^{i=d}\left\|y_i\left(0, x^0\right) - y_i\left(0, \xi^0\right)\right\|^2.$$
Proof.
Let $x(t)$ and $\xi(t)$ be the solutions of the first equation of system (5.21) with $x^0$ and $\xi^0$ as initial values. We denote the controls by $u(t)$. As in the proof of Lemma 33 we have:

$$b(\xi, u) - b(x, u) = B(t)(\xi - x) \tag{5.24}$$

where $B(t) = (b_{i,j})_{(i,j)\in\{1,\ldots,n\}}$ is a lower triangular matrix, since $b(x, u)$ is a lower triangular vector field. Set $\varepsilon = x - \xi$. Then

$$\dot\varepsilon = \left[A(u) + B(t)\right]\varepsilon. \tag{5.25}$$

We consider the system formed by equation (5.25) with $C(u_k)\varepsilon_k = a_1(u_k)\varepsilon_{1,k}$ as output. Let us consider $\Psi(t)$, the resolvent of (5.25), and the Gramm observability matrix

$$G_d = \sum_{i=0}^{i=d} \Psi(i\delta t)'\, C_i'\, C_i\, \Psi(i\delta t).$$

From the lower triangular structure of $B(t)$, the upper triangular structure of $A(u)$ and the form of the matrix $C_i = C(u(i\delta t))$, we can deduce that $G_d$ is invertible¹² when $d \ge n-1$. It
12 $\Psi_0(\cdot)$ is the resolvent of system (5.25) (refer to Appendix A.1). The Gramm observability matrix can be written:

$$G_d = \begin{pmatrix} C_0\\ C_1\Psi_0(\delta t)\\ \vdots\\ C_d\Psi_0(d\delta t)\end{pmatrix}'\begin{pmatrix} C_0\\ C_1\Psi_0(\delta t)\\ \vdots\\ C_d\Psi_0(d\delta t)\end{pmatrix} = \phi'\phi.$$

Therefore $G_d$ is invertible provided it is of rank $n$, which can be achieved for $d \ge n-1$.
is, therefore, also symmetric positive definite. We consider the same function as before (cf. Lemma 33):

$$\Lambda : L^\infty_{[0,d]}\left(\mathbb R^{\frac{n(n+1)}{2}}\right)\times L^\infty_{[0,d]}\left(\mathbb R^{n_u}\right) \longrightarrow \mathbb R^+.$$

With the same reasoning as in Lemma 33, we deduce the existence of a scalar $\lambda_d^0 > 0$ such that $G_d \ge \lambda_d^0\, Id$ for any $u$ and any matrix $B(t)$ having the structure specified above. Therefore:

$$\sum_{i=0}^{i=d}\left\|y_i\left(0, x^0\right) - y_i\left(0, \xi^0\right)\right\|^2 = \left(x^0 - \xi^0\right)' G_d\left(x^0 - \xi^0\right) \ge \lambda_d^0\left\|x^0 - \xi^0\right\|^2.$$

5.2.5 Preparation for the Proof
In order to establish the preliminary inequalities needed in the proof, we first recall the<br />
matrix inversion lemma:<br />
Lemma 64 (matrix inversion lemma [67], Section 0.7.4)
If $M$ is symmetric positive definite and $\lambda > 0$, then

$$\left(M + \lambda M C' C M\right)^{-1} = M^{-1} - C'\left(\lambda^{-1} + C M C'\right)^{-1} C.$$
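Lemma 64 can be checked numerically on a random symmetric positive definite $M$; the dimension and $\lambda$ below are arbitrary illustrative values.

```python
import numpy as np

# Numerical check of the matrix inversion lemma (Lemma 64).
rng = np.random.default_rng(0)
n, lam = 4, 0.37
X = rng.standard_normal((n, n))
M = X @ X.T + n * np.eye(n)          # symmetric positive definite
C = rng.standard_normal((1, n))      # single-output row vector, as in the text

lhs = np.linalg.inv(M + lam * M @ C.T @ C @ M)
rhs = np.linalg.inv(M) - C.T @ np.linalg.inv(1.0 / lam + C @ M @ C.T) @ C

print(np.max(np.abs(lhs - rhs)))     # agreement up to floating-point error
```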
The estimation error is denoted $\varepsilon(t) = z(t) - x(t)$, and we consider the change of variables:

- $\tilde x = \Delta x$, $\tilde z = \Delta z$ and $\tilde\varepsilon = \Delta\varepsilon$,
- $\tilde b(\cdot, u) = \Delta b(\Delta^{-1}\cdot, u)$, and $\tilde S = \Delta^{-1} S \Delta^{-1}$,
- $\tilde b^*(\cdot, u) = \Delta b^*(\Delta^{-1}\cdot, u)\Delta^{-1}$.

As before, the Lipschitz constant of the vector field remains the same in the new system of coordinates (cf. Lemma 51).

The error dynamics are given by

$$\dot\varepsilon = A(u)\varepsilon + \left(b(z, u) - b(x, u)\right) \tag{5.26}$$

in the continuous case, and

$$\varepsilon_k(+) = \varepsilon_k(-) - \delta t\, S^{-1}(+)\, C'\, r_\theta^{-1}\, C\,\varepsilon_k(-) \tag{5.27}$$

in the discrete case.

We highlight the relations below, as they are useful for the following computations:

$$\Delta A(u) = \theta A(u)\Delta, \qquad \Delta^{-1} A'(u) = \theta A'(u)\Delta^{-1}, \tag{5.28}$$

where $N = \mathrm{diag}\left(\{0, 1, \ldots, n-1\}\right)$.

Remark 65
Notice that on intervals of the form $[k\delta t, (k+1)\delta t[$, the derivative of $\theta$ is equal to zero.
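The relations (5.28) can be verified directly, e.g. for $n = 3$ with illustrative values for $\theta$ and for the superdiagonal entries of $A(u)$:

```python
import numpy as np

# Check of (5.28) for n = 3: Delta A(u) = theta A(u) Delta, and the
# transposed relation; theta and the entries of A are illustrative.
theta, n = 3.0, 3
Delta = np.diag([theta ** (-i) for i in range(n)])    # diag(1, 1/theta, 1/theta^2)
A = np.zeros((n, n))
A[0, 1], A[1, 2] = 2.0, 0.5                           # placeholder a_2(u), a_3(u)

assert np.allclose(Delta @ A, theta * (A @ Delta))
assert np.allclose(np.linalg.inv(Delta) @ A.T, theta * (A.T @ np.linalg.inv(Delta)))
print("relations (5.28) verified")
```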
For $t \in [k\delta t;\ (k+1)\delta t[$,

$$\frac{d\tilde\varepsilon}{dt} = \theta\left(A(u)\tilde\varepsilon + \frac{1}{\theta}\left(\tilde b(\tilde z, u) - \tilde b(\tilde x, u)\right)\right), \tag{5.29}$$

and

$$\frac{d\tilde S}{dt} = \theta\left(-\left(A(u) + \frac{1}{\theta}\tilde b^*(z, u)\right)'\tilde S - \tilde S\left(A(u) + \frac{1}{\theta}\tilde b^*(z, u)\right) - \tilde S Q\tilde S\right). \tag{5.30}$$

We now consider the Lyapunov function $\tilde\varepsilon'\tilde S\tilde\varepsilon$ and use identities (5.29), (5.30) to obtain the equality below:

$$\frac{d\left(\tilde\varepsilon'\tilde S\tilde\varepsilon\right)}{dt} = \theta\left(\frac{2}{\theta}\,\tilde\varepsilon'\tilde S\left(\tilde b(\tilde z, u) - \tilde b(\tilde x, u) - \tilde b^*(\tilde z, u)\tilde\varepsilon\right) - \tilde\varepsilon'\tilde S Q\tilde S\tilde\varepsilon\right). \tag{5.31}$$

Similarly, at time $k\delta t$,

$$\tilde\varepsilon_k(+) = \left(Id - \theta\delta t\,\tilde S_k^{-1}(+)\, C' r^{-1} C\right)\tilde\varepsilon_k(-), \tag{5.32}$$

and

$$\tilde S_k(+) = \tilde S_k(-) + \theta\delta t\, C' r^{-1} C. \tag{5.33}$$
As we did for the differential equations, we use (5.32) and (5.33) to compute the Lyapunov function at time $k\delta t$:

$$\begin{aligned}
\left(\tilde\varepsilon'\tilde S\tilde\varepsilon\right)_k(+) &= \tilde\varepsilon_k'(-)\left(Id - \theta\delta t\,\tilde S_k^{-1}(+)C'r^{-1}C\right)'\tilde S_k(+)\left(Id - \theta\delta t\,\tilde S_k^{-1}(+)C'r^{-1}C\right)\tilde\varepsilon_k(-)\\
&= \tilde\varepsilon_k'(-)\left(\tilde S_k(+) - 2\theta\delta t\,C'r^{-1}C + (\theta\delta t)^2\,C'r^{-1}C\,\tilde S_k(+)^{-1}C'r^{-1}C\right)\tilde\varepsilon_k(-).
\end{aligned} \tag{5.34}$$

From (5.33), we replace $\theta\delta t\,C'r^{-1}C$ with $\left(\tilde S_k(+) - \tilde S_k(-)\right)$:

$$\left(\tilde\varepsilon'\tilde S\tilde\varepsilon\right)_k(+) = \tilde\varepsilon_k'(-)\left(\tilde S_k(-)\,\tilde S_k(+)^{-1}\,\tilde S_k(-)\right)\tilde\varepsilon_k(-) = \tilde\varepsilon_k'(-)\left[\tilde S_k(-)^{-1}\,\tilde S_k(+)\,\tilde S_k(-)^{-1}\right]^{-1}\tilde\varepsilon_k(-). \tag{5.35}$$

From equation (5.33) we write:

$$\tilde S_k^{-1}(-)\,\tilde S_k(+)\,\tilde S_k^{-1}(-) = \tilde S_k^{-1}(-) + \frac{\theta\delta t}{r}\,\tilde S_k^{-1}(-)\,C'\,C\,\tilde S_k^{-1}(-), \tag{5.36}$$

and compute $\left[\tilde S_k^{-1}(-)\,\tilde S_k(+)\,\tilde S_k^{-1}(-)\right]^{-1}$ by using Lemma 64 with $\lambda = \frac{\theta\delta t}{r}$ and $M = \tilde S_k^{-1}(-)$. This results in:

$$\begin{aligned}
\left(\tilde\varepsilon'\tilde S\tilde\varepsilon\right)_k(+) &= \tilde\varepsilon_k'(-)\left(\tilde S_k(-) - C'\left(\tfrac{r}{\theta\delta t} + C\,\tilde S_k^{-1}(-)\,C'\right)^{-1}C\right)\tilde\varepsilon_k(-)\\
&= \left(\tilde\varepsilon'\tilde S\tilde\varepsilon\right)_k(-) - \tilde\varepsilon_k'(-)\,C'\left(\tfrac{r}{\theta\delta t} + C\,\tilde S_k^{-1}(-)\,C'\right)^{-1}C\,\tilde\varepsilon_k(-).
\end{aligned} \tag{5.37}$$
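The chain of identities (5.34)-(5.37), and the fact that the correction step never increases the Lyapunov function (inequality (5.44) below), can be checked numerically; all values below are illustrative.

```python
import numpy as np

# Numerical check of the correction-step algebra: the direct computation of
# the Lyapunov function after (5.32)-(5.33) matches formula (5.37), and the
# correction does not increase it. Dimensions and values are illustrative.
rng = np.random.default_rng(1)
n, theta, dt, r = 3, 2.0, 0.05, 0.5
X = rng.standard_normal((n, n))
Sm = X @ X.T + n * np.eye(n)                  # S_k(-), symmetric positive definite
C = np.zeros((1, n)); C[0, 0] = 1.0
eps = rng.standard_normal(n)                  # error before correction

Sp = Sm + theta * dt / r * (C.T @ C)          # (5.33)
eps_p = eps - theta * dt / r * np.linalg.solve(Sp, C.T @ (C @ eps))   # (5.32)

direct = eps_p @ Sp @ eps_p
gain = np.linalg.inv(r / (theta * dt) + C @ np.linalg.inv(Sm) @ C.T)
formula = eps @ Sm @ eps - (C @ eps) @ gain @ (C @ eps)               # (5.37)

print(direct, float(formula))                 # the two values coincide
```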
As in Section 3.5, we need to write the prediction-correction Riccati equations in a different time scale ($\tau$), so that we can bound the Riccati matrix independently of $\theta$. We consider $d\tau = \theta(t)dt$, or equivalently $\tau = \int_0^t\theta(v)dv$, and keep the notation $\bar x(\tau) = \tilde x(t)$.

$$\begin{cases}
\dfrac{d\bar S}{d\tau} = -\left(A(u) + \dfrac{\tilde b^*(z,u)}{\theta}\right)'\bar S - \bar S\left(A(u) + \dfrac{\tilde b^*(z,u)}{\theta}\right) - \bar S Q\bar S\\[6pt]
\bar S_k(+) = \bar S_k(-) + \theta\delta t\, C' r^{-1} C.
\end{cases} \tag{5.38}$$

Since $\theta(t)$ varies within an interval of the form $[1, \theta_{max}]$, the instants $t_k = k\delta t$, $k \in \mathbb N$, are difficult to track in the $\tau$ time scale. For convenience, $t_k = k\delta t$ is denoted $\tau_k$ in the $\tau$ time scale.

With the help of this representation we are able to derive the following lemma:

Lemma 66
Let us consider the prediction-correction Riccati equations (5.38), and the assumptions:

- the functions $a_i(u(t))$, $\left|\tilde b^*_{i,j}(z, u)\right|$, are smaller than $a_M > 0$,
- $a_i(u(t)) \ge a_m > 0$, $i = 2, \ldots, n$,
- $\theta(0) = 1$, and
- $S(0)$ is a symmetric positive definite matrix taken from a compact subset of the form $a\,Id \le S(0) \le b\,Id$.

Then, there exist a constant $\mu$ and two scalars $0 < \alpha <$ [...]

[...] $T^* > 0$, and $\varepsilon^*$ as in Theorem 62. Let us now set a time $T$ such that $0 < T < T^*$. Let $\alpha$ and $\beta$ be the bounds of Lemma 66.

For $t \in [k\delta t;\ (k+1)\delta t[$, inequality (5.31) can be written (i.e. using $\alpha\,Id \le \tilde S$)

$$\frac{d\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t)}{dt} \le -\alpha q_m\theta\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t) + 2\,\tilde\varepsilon'\tilde S\left(\tilde b(\tilde z) - \tilde b(\tilde x) - \tilde b^*(\tilde z)\tilde\varepsilon\right) \tag{5.39}$$
with $q_m > 0$ such that $q_m Id < Q$ (and omitting to write the control variable $u$).

From (5.39) we can deduce two bounds. The first, the local bound, will be useful when $\tilde\varepsilon'\tilde S\tilde\varepsilon(t)$ is small, independently of the value of $\theta$. The second, the global bound, will be useful mainly when $\tilde\varepsilon'\tilde S\tilde\varepsilon(t)$ is not in the neighborhood of 0.

Global bound: Starting from

$$\left\|\tilde b(\tilde z) - \tilde b(\tilde x) - \tilde b^*(\tilde z)\tilde\varepsilon\right\| \le 2L_b\left\|\tilde\varepsilon\right\|,$$

together with $\alpha\,Id \le \tilde S \le \beta\,Id$ (Lemma 66), (5.39) becomes

$$\frac{d\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t)}{dt} \le \left(-\alpha q_m\theta + 4\frac{\beta}{\alpha}L_b\right)\tilde\varepsilon'\tilde S\tilde\varepsilon(t). \tag{5.40}$$

Local bound: Using Lemma 56,

$$\left\|\tilde b(\tilde z) - \tilde b(\tilde x) - \tilde b^*(\tilde z)\tilde\varepsilon\right\| \le K\theta^{n-1}\left\|\tilde\varepsilon\right\|^2,$$

which, because $1 \le \theta \le 2\theta_1$, implies that

$$\frac{d\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t)}{dt} \le -\alpha q_m\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t) + 2K(2\theta_1)^{n-1}\left\|\tilde S\right\|\left\|\tilde\varepsilon\right\|^3.$$

The fact that $\|\tilde\varepsilon\|^3 = \left(\|\tilde\varepsilon\|^2\right)^{\frac{3}{2}} \le \left(\frac{1}{\alpha}\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t)\right)^{\frac{3}{2}}$ allows us to write

$$\frac{d\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t)}{dt} \le -\alpha q_m\,\tilde\varepsilon'\tilde S\tilde\varepsilon(t) + \frac{2K(2\theta_1)^{n-1}\beta}{\alpha^{\frac{3}{2}}}\left(\tilde\varepsilon'\tilde S\tilde\varepsilon(t)\right)^{\frac{3}{2}}. \tag{5.41}$$

Let us apply Lemma 55: if there exists $\xi$ such that

$$\tilde\varepsilon'\tilde S\tilde\varepsilon(\xi) \le \frac{\alpha^5 q_m^2}{16\,K^2(2\theta_1)^{2n-2}\beta^2},$$

then for any $k\delta t \le \xi \le t \le (k+1)\delta t$,

$$\tilde\varepsilon'\tilde S\tilde\varepsilon(t) \le 4\,\tilde\varepsilon'\tilde S\tilde\varepsilon(\xi)\,e^{-\alpha q_m(t-\xi)}.$$

If $\gamma \in \mathbb R$ is such that

$$\gamma \le \frac{1}{(2\theta_1)^{2n-2}}\min\left(\frac{\alpha\varepsilon^*}{4\beta},\ \frac{\alpha^5 q_m^2}{16\,K^2\beta^2}\right), \tag{5.42}$$

then $\tilde\varepsilon'\tilde S\tilde\varepsilon(\xi) \le \gamma$ implies

$$\tilde\varepsilon'\tilde S\tilde\varepsilon(t) \le \frac{\alpha\varepsilon^*}{\beta(2\theta_1)^{2n-2}}\,e^{-\alpha q_m(t-\xi)}. \tag{5.43}$$

Given any arbitrary value of $\delta t$, there exists $k_T \in \mathbb N$ such that $T \in [k_T\delta t;\,(k_T+1)\delta t[$. From the global bound (5.40), with $\theta_k \ge 1$ for all $k \in \mathbb N$:

$$\tilde\varepsilon'\tilde S\tilde\varepsilon(T) \le \tilde\varepsilon'\tilde S\tilde\varepsilon(k_T\delta t)\,e^{\left(-\alpha q_m + 4\frac{\beta}{\alpha}L_b\right)(T - k_T\delta t)}.$$
When we consider $t \in [k\delta t, (k+1)\delta t[$, this requirement means that

$$\left(\tilde\varepsilon'\tilde S\tilde\varepsilon\right)(k\delta t) = \left(\tilde\varepsilon'\tilde S\tilde\varepsilon\right)_k(+).$$

We know from (5.37) that, in full generality,

$$\left(\tilde\varepsilon'\tilde S\tilde\varepsilon\right)_k(+) \le \left(\tilde\varepsilon'\tilde S\tilde\varepsilon\right)_k(-). \tag{5.44}$$

Thus

$$\tilde\varepsilon'\tilde S\tilde\varepsilon(T) \le \left(\tilde\varepsilon'\tilde S\tilde\varepsilon\right)_{k_T}(-)\,e^{\left(-\alpha q_m + 4\frac{\beta}{\alpha}L_b\right)(T - k_T\delta t)},$$

and since $\left(\tilde\varepsilon'\tilde S\tilde\varepsilon\right)_{k_T}(-)$ is the end value of equation (5.31) for $t \in [(k_T-1)\delta t;\ k_T\delta t[$, then:

$$\left(\tilde\varepsilon'\tilde S\tilde\varepsilon\right)_{k_T}(-) \le \left(\tilde\varepsilon'\tilde S\tilde\varepsilon\right)_{k_T-1}(+)\,e^{\left(-\alpha q_m + 4\frac{\beta}{\alpha}L_b\right)\delta t}.$$

We can therefore, iteratively and independently of $\delta t$, obtain the inequality:

$$\tilde\varepsilon'\tilde S\tilde\varepsilon(T) \le \tilde\varepsilon'\tilde S\tilde\varepsilon(0)\,e^{\left(-\alpha q_m + 4\frac{\beta}{\alpha}L_b\right)T}. \tag{5.45}$$

We now suppose that $\theta \ge \theta_1$ for $t \in [T, T^*]$, $T^* \in [\tilde k\delta t, (\tilde k+1)\delta t[$, and use (5.40):

$$\tilde\varepsilon'\tilde S\tilde\varepsilon(T^*) \le \tilde\varepsilon'\tilde S\tilde\varepsilon\left(\tilde k\delta t\right)\,e^{\left(-\alpha q_m\theta_1 + 4\frac{\beta}{\alpha}L_b\right)(T^* - \tilde k\delta t)}. \tag{5.46}$$

As before, independently of $\delta t$, we obtain:

$$\begin{aligned}
\tilde\varepsilon'\tilde S\tilde\varepsilon(T^*) &\le \tilde\varepsilon'\tilde S\tilde\varepsilon(T)\,e^{\left(-\alpha q_m\theta_1 + 4\frac{\beta}{\alpha}L_b\right)(T^* - T)}\\
&\le \tilde\varepsilon'\tilde S\tilde\varepsilon(0)\,e^{\left(-\alpha q_m + 4\frac{\beta}{\alpha}L_b\right)T}\,e^{\left(-\alpha q_m\theta_1 + 4\frac{\beta}{\alpha}L_b\right)(T^* - T)}\\
&\le M_0\,e^{-\alpha q_m T}\,e^{4\frac{\beta}{\alpha}L_b T^*}\,e^{-\alpha q_m\theta_1(T^* - T)},
\end{aligned} \tag{5.47}$$

where $M_0 = \sup_{x,z\in\chi}\varepsilon' S\varepsilon(0)$.

Now, we choose $\theta_1$ and $\gamma$ such that

$$M_0\,e^{-\alpha q_m T}\,e^{4\frac{\beta}{\alpha}L_b T^*}\,e^{-\alpha q_m\theta_1(T^* - T)} \le \gamma \tag{5.48}$$

and (5.42) are satisfied simultaneously, which is possible because $e^{-cte\times\theta_1} < \frac{cte}{\theta_1^{2n-2}}$ for $\theta_1$ sufficiently large.

We check that the condition¹³ $2\theta_1\delta t <$ [...] $\theta_1$. The adaptation function must have the following features:

13 We arbitrarily set $\theta_{max} = 2\theta_1$.
- $\theta$ is such that $1 \le \theta_k \le 2\theta_1$ for all $k \in \mathbb N$,
- set $\gamma_1 = \lambda_d^0\frac{\gamma}{\beta}$ and $0 \le \gamma_0 \le \gamma_1$:
  - suppose that, for an arbitrary $j \in \mathbb N$, $I_{d,k} > \lambda_d^0\frac{\gamma}{\beta}$ on the time interval $[j\delta t,\ j\delta t + (T - d\delta t)]$; then $\theta\left(j\delta t + (k_T - d)\delta t\right) > \theta_1$,
  - when $I_{d,k} <$ [...]

Remark 67
[...] $\tilde\varepsilon'\tilde S\tilde\varepsilon(\xi)\,e^{-\alpha q_m(t-\xi)}$ (5.49)
Remark 68
As for the continuous case, the convergence proof can be extended to the multiple output case following the instructions of Section 5.1. The only difference lies in the usage of Lemma 64 with equation (5.36), where we cannot write $\frac{\theta\delta t}{r}$ since $R$ is no longer a scalar. Since $R$ is positive definite, we can assume the existence of a scalar $\tilde r$ such that $R > \tilde r\, Id$. Then (5.36) can be written:

$$S_k^{-1}(-)\,S_k(+)\,S_k^{-1}(-) \le S_k^{-1}(-) + \frac{\theta\delta t}{\tilde r}\, S_k^{-1}(-)\, C'\, C\, S_k^{-1}(-),$$

and Lemma 64 can be used.
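The matrix inequality used in this remark can be checked numerically on random matrices; the dimensions and values below are illustrative.

```python
import numpy as np

# Check of the bound in Remark 68: with R > rt * Id,
#   S^-1 S(+) S^-1  <=  S^-1 + (theta dt / rt) S^-1 C' C S^-1   (PSD order),
# where S(+) = S + theta dt C' R^-1 C. Sizes and values are illustrative.
rng = np.random.default_rng(2)
n, p, theta, dt = 4, 2, 3.0, 0.02
X = rng.standard_normal((n, n))
S = X @ X.T + n * np.eye(n)                 # S_k(-), symmetric positive definite
C = rng.standard_normal((p, n))             # p outputs
Y = rng.standard_normal((p, p))
R = Y @ Y.T + p * np.eye(p)                 # symmetric positive definite R
rt = 0.9 * np.linalg.eigvalsh(R).min()      # any scalar with R > rt * Id

Si = np.linalg.inv(S)
lhs = Si @ (S + theta * dt * C.T @ np.linalg.inv(R) @ C) @ Si
rhs = Si + theta * dt / rt * Si @ C.T @ C @ Si

print(np.linalg.eigvalsh(rhs - lhs).min())  # nonnegative: rhs - lhs is PSD
```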
Chapter 6<br />
Conclusion <strong>and</strong> Perspectives<br />
The work described in this thesis deals with the design of an observer of the <strong>Kalman</strong> type<br />
for nonlinear systems.<br />
More precisely, we considered the <strong>high</strong>-<strong>gain</strong> formalism <strong>and</strong> proposed an improvement of<br />
the <strong>high</strong>-<strong>gain</strong> <strong>extended</strong> <strong>Kalman</strong> <strong>filter</strong> in the form of an adaptive scheme for the parameter at<br />
the heart of the method. Indeed, although the <strong>high</strong>-<strong>gain</strong> approach allows us to analytically<br />
prove the convergence of the algorithm in the deterministic setting, it comes with an increased<br />
sensitivity to measurement noise. We propose to let the observer evolve between two end<br />
point configurations, one that rejects noise <strong>and</strong> one that makes the estimate converge toward<br />
the real trajectory. The strategy we developed here allowed us to analytically prove this<br />
convergence.<br />
Observability theory constitutes the framework of the present study. Thus, we began this<br />
thesis by providing a review <strong>and</strong> some insight into the main results of the theory of [57]. We<br />
also provided a review of similar adaptive strategies. In this introduction <strong>and</strong> background<br />
review, we stated that the main concern of the thesis would be theoretically proving that the<br />
observer is convergent.<br />
The observer has been described in Chapters 3 <strong>and</strong> 5. It was initially described in the<br />
continuous setting, <strong>and</strong> afterwards <strong>extended</strong> to the continuous-discrete setting. The adaptive<br />
strategy was also explained in those chapters. This strategy is composed of two elements:<br />
1. a measurement of the quality of the estimation, <strong>and</strong><br />
2. an adaptation equation.<br />
The quality measurement is called innovation, or innovation over a horizon of length d. It is slightly different from the usual concept of innovation. The major improvement provided by our definition is a proof that innovation places an upper bound on the past estimation error (the delay equals the parameter d mentioned above). This fact is a cornerstone of the overall convergence proof.
The second element of the strategy is the adaptation equation that drives the <strong>high</strong>-<strong>gain</strong><br />
parameter. A differential equation was used in the continuous setting, <strong>and</strong> a function in<br />
the continuous-discrete setting (i.e. the adaptation is performed at the end of the update<br />
procedure). The sets of requirements for these two settings have been proposed such that several adaptation functions can be conceived. The set of admissible functions is not empty, as we demonstrated by exhibiting an eligible function.
The full proof of convergence has been developed in the single output continuous setting,<br />
<strong>and</strong> afterwards <strong>extended</strong> to the multiple output <strong>and</strong> continuous-discrete settings.<br />
A second major concern of this work was the applicability of the observer. We therefore extensively described its implementation on a single output system: a series-connected DC machine. The time constraints were investigated via experiments performed on a real motor in a hard real-time environment. The testbed was described in Chapter 4 and the compatibility with real-time constraints assessed.
We conclude this work with some ideas for future investigations.<br />
The Luenberger Case<br />
It is much more direct to prove the convergence of a high-gain Luenberger observer, because of the absence of the Riccati equation. However, this absence also prevents us from providing a local result of convergence when θ = 1. Therefore the adaptation strategy has to be different from the one used here (cf. [11]), or restricted to a specific class of nonlinear systems (see for example [19]).
Automatic code generation<br />
The implementation procedure for this algorithm is now well known. It can be roughly divided into two parts: 1) coding specific to the model, and 2) coding related to the observer mechanisms. It would be interesting to create a utility that automatically generates the code of the observer once the model has been provided. We could save development time and avoid implementation errors caused by typos.
Dynamic output stabilization<br />
Dynamic output stabilization is considered in the second part of [57]. An extension of<br />
this work to a closed loop containing an adaptive observer is a natural development.<br />
This may be accomplished because the observer presented here is an exponential observer.<br />
Since θ is allowed to increase when convergence is not achieved, we can expect to<br />
deliver a good estimate to the control algorithm. The ability to quickly switch between<br />
modes will be important.<br />
Cascaded systems<br />
Let us consider an observable nonlinear cascaded system of the form:<br />
$$\begin{cases}
\dot x = f(x, u),\\
\dot\xi = g(x, \xi),\\
y = h(x, \xi).
\end{cases}$$
One could imagine a situation where the state variable x is well known or estimated,<br />
but not the variable ξ. Does the θ parameter of the observer really need to be <strong>high</strong><br />
for the part of the estimation that corresponds to x? We could consider a <strong>high</strong>-<strong>gain</strong><br />
observer with two varying <strong>high</strong>-<strong>gain</strong> parameters.<br />
Unscented <strong>Kalman</strong> <strong>filter</strong><br />
The unscented <strong>Kalman</strong> <strong>filter</strong> is a derivative free, nonlinear observer that has received a<br />
lot of attention recently [71]. This observer is based on the unscented transformation:<br />
a method to calculate the statistics of a random variable which undergoes a nonlinear transformation [112]. Continuous and continuous-discrete versions of this observer have been proposed in [108]. The idea is to study the extent to which the high-gain formalism, and consequently adaptive high-gain mechanisms, can be embedded into this observer.
Appendices<br />
Appendix A<br />
Mathematics Reminder<br />
Contents<br />
A.1 Resolvent of a System . . . . . . . . . . . . . . . . . . . . . . . . . . 122<br />
A.2 Weak-∗ Topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123<br />
A.3 Uniform Continuity of the Resolvent . . . . . . . . . . . . . . . 125<br />
A.4 Bounds on a Gramm Matrix . . . . . . . . . . . . . . . . . . . . . . 128<br />
A.1 Resolvent of a System<br />
In this subsection, we recall basic concepts from the theory of linear differential equations.<br />
Details can be found in [106].<br />
A first-order system of linear differential equations is given by:

$$\frac{dx}{dt} = A(t)x(t) + b(t) \tag{A.1}$$

where

1. $t \in I$, an interval of $\mathbb R$,
2. $x \in \mathbb R^n$,
3. $A(t)$ is a $t$-dependent matrix of dimension $(n\times n)$,
4. $b(t)$ is a $t$-dependent vector field of dimension $n$.
First, recall that, provided the maps $A(t)$ and $b(t)$ are continuous, for all $s \in I$ and all $x_0 \in \mathbb R^n$ this equation has a unique solution on $I$ such that $x(s) = x_0$. We now consider the associated homogeneous equation:

$$\frac{dx}{dt} = A(t)x(t). \tag{A.2}$$
Let us denote by $x(t, s, x_0)$ the solution of (A.2) at time $t$ with initial condition $x(s, s, x_0) = x_0$. We define the map

$$\xi \mapsto x(t, s, \xi), \tag{A.3}$$

which associates to any element $\xi$ of $\mathbb R^n$ the solution of (A.2) starting from $\xi$.

Let $c_1, c_2$ be two scalars and $\xi_1, \xi_2$ two elements of $\mathbb R^n$. Consider the trajectory $c_1 x(t, s, \xi_1) + c_2 x(t, s, \xi_2)$. For $t = s$, we have:

$$c_1 x(s, s, \xi_1) + c_2 x(s, s, \xi_2) = c_1\xi_1 + c_2\xi_2 = x(s, s, c_1\xi_1 + c_2\xi_2).$$

From the uniqueness of solutions, we conclude that

$$c_1 x(t, s, \xi_1) + c_2 x(t, s, \xi_2) = x(t, s, c_1\xi_1 + c_2\xi_2).$$

That is to say, for all $t$ and $s$ in $I$, (A.3) is linear. Therefore there is a $t$- and $s$-dependent matrix such that

$$x(t, s, x_0) = \phi(t, s)\,x_0,$$

with $\phi(s, s) = Id$. This matrix is called the resolvent¹ of (A.2). The resolvent has the following properties:
Theorem 69

1. $\phi(t, s)$ is continuous w.r.t. the variables $s$ and $t$,
2. $\phi(s, s) = Id$,
3. for all $\tau \in I$, $\phi(t, \tau)\phi(\tau, s) = \phi(t, s)$,
4. $\phi^{-1}(t, s) = \phi(s, t)$,
5. $\frac{\partial}{\partial t}\phi(t, s) = A(t)\phi(t, s)$ and $\frac{\partial}{\partial s}\phi(t, s) = -\phi(t, s)A(s)$,
6. in particular: $\frac{\partial}{\partial t}\,\phi^{-1}(t, s)' = -A'(t)\,\phi^{-1}(t, s)'$,
7. the solution of (A.2) at time $t$ with $x(s) = x_0$ is given by $x(t) = \phi(t, s)\,x_0$,
8. the solution of (A.1) at time $t$ with $x(s) = x_0$ is given by

$$x(t) = \phi(t, s)\,x_0 + \int_s^t \phi(t, v)\,b(v)\,dv.$$

1 $\phi(t, s)$ is also said to be the resolvent of the equation (A.1).
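The composition and inversion properties of the resolvent (items 3 and 4) can be checked numerically; the time-varying $A(t)$ below is an illustrative choice, and $\phi$ is approximated by Euler integration of the identity in item 5.

```python
import numpy as np

# Numerical check of phi(t, tau) phi(tau, s) = phi(t, s) and
# phi^{-1}(t, s) = phi(s, t), with phi obtained by Euler integration of
# d/dt phi(t, s) = A(t) phi(t, s), phi(s, s) = Id. A(t) is illustrative.
def A(t):
    return np.array([[0.0, 1.0], [-1.0 - 0.5 * np.sin(t), 0.0]])

def phi(t, s, steps=2000):
    P, h, v = np.eye(2), (t - s) / steps, s
    for _ in range(steps):
        P = P + h * (A(v) @ P)
        v += h
    return P

P1 = phi(1.0, 0.5) @ phi(0.5, 0.0)     # phi(t, tau) phi(tau, s)
P2 = phi(1.0, 0.0)                     # phi(t, s)
P3 = np.linalg.inv(phi(1.0, 0.0))      # phi^{-1}(t, s)
P4 = phi(0.0, 1.0)                     # phi(s, t), integrated backward in time

print(np.max(np.abs(P1 - P2)), np.max(np.abs(P3 - P4)))
```

Both differences shrink as the integration step is refined.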
A.2 Weak-∗ Topology
Let $E$ be a Banach space² and denote its norm by $\|.\|_E$. $E^*$ is the dual vector space³ of $E$, and $E^{**}$ is the dual vector space of $E^*$: the bi-dual space of $E$. A dual space is endowed with the dual norm, defined as:

$$\|f\|_{E^*} = \sup_{x\in E,\ \|x\|\le 1}|f(x)|.$$

Therefore $E^*$ has the topology induced by the dual norm. The object of this section is to define two additional topologies that can be built on the space $E^*$: the weak and the weak-∗ topologies. The topology described above, induced by the dual norm, is also called the strong topology.

For $f \in E^*$ and $x \in E$ we write $\langle f, x\rangle$ instead of $f(x)$; it is called the dual scalar product.
Topology Associated to a Family of Applications<br />
We consider a set X and a family of topological spaces (Yi)_{i∈I}. For all i ∈ I, let ϕi denote an application from X to Yi. We want to endow X with the coarsest topology that makes all the applications ϕi continuous. In the following this topology is denoted T.
Let i be an element of I and let ωi be an open subset of Yi. If the application ϕi is continuous then ϕi^{-1}(ωi) is an open subset of X. Consequently T shall contain the family of subsets of X defined by ϕi^{-1}(ωi) when ωi scans all the open subsets of Yi. This remark applies to all i ∈ I. We denote by (Oλ)_{λ∈Λ} the family composed of all the subsets of the form ϕi^{-1}(ωi), where ωi is an open subset of Yi, and i ∈ I.
² i.e. a normed vector space, complete with respect to the topology of its norm.
³ The space of all the continuous linear forms on E.
The problem “find the coarsest⁴ topology that makes the applications ϕi continuous” is transformed into: “find the coarsest family of subsets of X that contains (Oλ)_{λ∈Λ} and that is stable under any finite intersection and any infinite union of its elements”⁵.
Consider first the family of all the subsets of X obtained as the intersection of a finite number of elements of the form Oλ, λ ∈ Λ. It is a family of subsets of X stable under finite intersections.
Secondly, consider all the infinite unions of elements of this latter family⁶. We finally end up with the family
˜T = { ⋃_infinite [ ⋂_finite Oλ ] , λ ∈ Λ }.
˜T is a topology: stability under infinite unions is obvious, and the proof of the stability under finite intersections is left as an exercise in set theory.
Consider now any topology ˆT on X that makes all the applications ϕi : X → Yi, i ∈ I, continuous. Then all the subsets of the form ϕi^{-1}(ωi), ωi being an open subset of Yi, i ∈ I, are contained in ˆT. And since ˆT is a topology, it contains all the elements of ˜T. Thus any topology that makes all the (ϕi)_{i∈I} continuous contains ˜T: it is the coarsest topology we are searching for.
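The two-stage construction (finite intersections first, then unions) can be made concrete on a finite set, where every union is finite and the closure can be computed exhaustively. The sketch below is purely illustrative and not part of the thesis; the set X and the subbasis are arbitrary choices.

```python
from itertools import combinations

def topology_from_subbasis(X, subbasis):
    """Coarsest topology on the finite set X containing `subbasis`:
    step 1 closes under finite intersections, step 2 closes under unions."""
    # step 1: all finite intersections (the empty intersection gives X itself)
    inters = {frozenset(X)}
    for r in range(1, len(subbasis) + 1):
        for combo in combinations(subbasis, r):
            inters.add(frozenset(X).intersection(*combo))
    # step 2: close under unions (on a finite set, every union is finite)
    topology = set(inters) | {frozenset()}
    changed = True
    while changed:
        changed = False
        for a in list(topology):
            for b in list(topology):
                if (a | b) not in topology:
                    topology.add(a | b)
                    changed = True
    return topology

X = {1, 2, 3}
T = topology_from_subbasis(X, [frozenset({1, 2}), frozenset({2, 3})])
```

Here T = {∅, {2}, {1, 2}, {2, 3}, X}, and one can check mechanically that T is stable under unions and finite intersections, in line with footnote 6's warning that performing the two closure operations in the other order need not produce a topology.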
Definition of the Weak Topology<br />
For a fixed f ∈ E∗, we define the application ϕf : E → R by ϕf(x) = < f, x >. When f ranges over the set E∗, we obtain a family of applications (ϕf)_{f∈E∗}.
Definition 70
The weak topology is the coarsest topology that renders all the applications (ϕf)_{f∈E∗} continuous.
A weak topology is built on the dual space E∗ by considering the applications (ϕξ)_{ξ∈E∗∗}.
Definition of the Weak-∗ Topology<br />
We define a canonical injection from E to E∗∗:
− let x ∈ E be fixed,
− the map E∗ → R, f ↦ < f, x >, is a continuous linear form on E∗, that is to say an element of E∗∗, denoted Jx,
− we have:
< Jx, f >_{(E∗∗, E∗)} = < f, x >_{(E∗, E)},   ∀x ∈ E, ∀f ∈ E∗.
The application x ↦ Jx is a linear injection.
⁴ The coarsest topology is the one that contains “the least” number of open sets.
⁵ Obviously ∅ and X are contained in (Oλ)_{λ∈Λ}. Therefore the stability under infinite unions and the stability under finite intersections define a topology.
⁶ If we perform the infinite union first and then the finite intersection, we obtain a family that is not stable under infinite union anymore.
– Let x and y be two elements of E; J(x + y) is the associated element of E∗∗. For all f ∈ E∗: < J(x + y), f > = < f, x + y > = < f, x > + < f, y >. Thus J(x + y) = Jx + Jy.
– If Jx is the identically null application, then < f, x > = 0 for all f ∈ E∗, which means that x = 0. The kernel reduces to the null element.
It may happen that J is not surjective: E is therefore identified with a subspace of E∗∗.
We now consider a fixed x ∈ E and the application ϕx : E∗ → R defined as f ↦ ϕx(f) = < f, x >. Since x can be any element of E, we obtain a family of applications: (ϕx)_{x∈E}.
Definition 71<br />
The weak-∗ topology is the coarsest topology that renders all the applications (ϕx)_{x∈E} continuous.
The weak-∗ topology makes continuous the family of applications F1 = (ϕx)_{x∈E} and the weak topology makes continuous the family of applications F2 = (ϕξ)_{ξ∈E∗∗}. Since E can be seen as a subset of E∗∗, the family F1 can be seen as a subfamily of F2. Therefore the topology constructed using the family F1 is coarser than the one obtained from the family F2.
More details concerning those topologies can be found in [33] (in Molière’s mother tongue) or [107] (in Shakespeare’s mother tongue).
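Since L∞ is the dual of L¹ (see the next section), a bounded oscillating sequence such as un(s) = sin(ns) illustrates the gap between the strong and the weak-∗ topologies: ‖un‖∞ = 1 for every n, so un does not converge strongly to 0, yet the dual pairings ∫ un g vanish for every integrable g. The following numerical sketch is illustrative only (the test function g is an arbitrary choice, not from the thesis):

```python
import math

def pairing(n, g, steps=20000):
    """Midpoint-rule approximation of <u_n, g> = ∫_0^1 sin(n s) g(s) ds."""
    h = 1.0 / steps
    return h * sum(math.sin(n * (i + 0.5) * h) * g((i + 0.5) * h)
                   for i in range(steps))

g = lambda s: s * s                       # an element of L^1([0, 1])
vals = [abs(pairing(n, g)) for n in (1, 10, 100, 1000)]
# the pairings shrink toward 0 while sup |u_n| = 1 never does
```

The decay of `vals` is the weak-∗ convergence un ⇀∗ 0; no subsequence of (un) converges in the strong (dual-norm) topology.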
A.3 Uniform Continuity of the Resolvent
In the proof of some lemmas⁷ we use the fact that “the weak-∗ topology on a bounded set implies uniform continuity of the resolvent”. We explain how this works in the present subsection. We consider a control affine system of the form:
(Σ)  ẋ = f(x) + Σ_{i=1}^{p} gi(x) ui,   x(0) = x0.
Let PΣ : Dom(PΣ) ⊂ L∞([0; T], R^{nu}) → C⁰([0; T], X) be the “input to state” mapping of Σ, i.e., the mapping that, to any u(·), associates the corresponding state trajectory (t ∈ [0; T] ↦ x(t)).
We endow L∞([0; T], R^{nu}) with the weak-∗ topology, and C⁰([0; T], X) with the topology of uniform convergence. The following lemma is proven in [57].
Lemma 72
PΣ has open domain and is continuous on bounded sets (w.r.t. the topologies above).
The proof of this lemma is reproduced below (refer to [57] for the complete statement). In order to ease the understanding of this proof we first recall a few facts concerning the properties of the weak-∗ topology.
⁷ Lemmas 33 and 38, and their equivalent in Chapter 5.
The weak-∗ topology on L∞ sets
According to what is said in Section A.2, we can endow L∞ with either the weak or the weak-∗ topology. We choose the weak-∗ topology, the coarser one. The following theorem, stated with the same notations as before, gives us an important property of this topology:
Theorem 73 ([33], III-25, Pg 48)<br />
Let E be a separable Banach space; then BE∗ = { f ∈ E∗ ; ‖f‖ ≤ 1 } is metrizable for the weak-∗ topology.
Conversely, if BE∗ = { f ∈ E∗ ; ‖f‖ ≤ 1 } is metrizable for the weak-∗ topology, then E is separable.
Now, recall that (L¹)∗ = L∞ [33, 107], and consider the three theorems below.
Theorem 74 ([33])<br />
I - IV-7, Pg 57: Lᵖ is a normed vector space for 1 ≤ p ≤ ∞.
II - (Fischer-Riesz) - IV-8, Pg 57: Lᵖ is a Banach space for 1 ≤ p ≤ ∞.
III - IV-13, Pg 62: Lᵖ is separable for 1 ≤ p < ∞.
Here G(x(s)) denotes a column vector field whose elements are the gi’s of Σ, and u denotes the row vector composed of the elements ui.
Proof.
First of all, let us write:
x(t) = x0 + ∫_0^t [ f(x(s)) + Σ_{i=1}^{nu} gi(x(s)) ui(s) ] ds = x0 + ∫_0^t [ f(x(s)) + u(s) G(x(s)) ] ds.
Then
‖xn(t) − x(t)‖ ≤ ‖ ∫_0^t [ f(xn(s)) + un(s) G(xn(s)) + un(s) G(x(s)) − un(s) G(x(s)) − f(x(s)) − u(s) G(x(s)) ] ds ‖
≤ ‖ ∫_0^t [ f(xn(s)) + un(s) G(xn(s)) − f(x(s)) − un(s) G(x(s)) ] ds ‖ + ‖ ∫_0^t ( un(s) − u(s) ) G(x(s)) ds ‖ = A + B.
The terms A and B are such that:
B ≤ sup_{θ∈[0;T]} ‖ ∫_0^θ ( un(s) − u(s) ) G(x(s)) ds ‖,
A ≤ ∫_0^t ‖ f(xn(s)) − f(x(s)) ‖ ds + ∫_0^t ‖ G(xn(s)) − G(x(s)) ‖ |un(s)| ds.
Because un converges ∗-weakly, the sequence ( ‖un‖∞ : n ∈ N ) is bounded. The Lipschitz properties of f and G give A ≤ m ∫_0^t ‖ xn(s) − x(s) ‖ ds.
Therefore
‖xn(t) − x(t)‖ ≤ sup_{θ∈[0;T]} ‖ ∫_0^θ ( un(s) − u(s) ) G(x(s)) ds ‖ + m ∫_0^t ‖ xn(s) − x(s) ‖ ds.
The result is given by Gronwall’s lemma.
For all θ ∈ [0,T], for all δ > 0, let us consider a subdivision {tj} of [0,T] such that θ ∈ [ti, ti+1] and tj+1 − tj < δ for all j. We have
‖ ∫_0^θ ( un(s) − u(s) ) G(x(s)) ds ‖ ≤ ‖ ∫_0^{ti} ( un(s) − u(s) ) G(x(s)) ds ‖ + ‖ ∫_{ti}^θ ( un(s) − u(s) ) G(x(s)) ds ‖.
With ε > 0 being given, define ε∗ = ε/k, where k comes from the preceding lemma. We take δ ≤ ε∗/(2γm), where γ = sup_{x∈R^n} ‖G(x)‖ and m = sup_n ‖un − u‖∞. We get, for all θ ∈ [0,T]:
‖ ∫_{ti}^θ ( un(s) − u(s) ) G(x(s)) ds ‖ ≤ mγδ ≤ ε∗/2.
By the weak-∗ convergence of (un), there exists N ∈ N∗ such that for all n > N,
‖ ∫_0^{ti} ( un(s) − u(s) ) G(x(s)) ds ‖ ≤ ε∗/2.
By the lemma, there exists N ∈ N∗ such that for all n > N, for all t ∈ [0,T], ‖xn(t) − x(t)‖ ≤ ε, which proves the sequential continuity.
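The content of Lemma 72 can be observed numerically on a scalar control-affine system: with the inputs un(s) = sin(ns), which converge weakly-∗ to 0, the trajectories of ẋ = −x + un converge uniformly to the trajectory driven by u = 0. This is an illustrative sketch under arbitrary choices (dynamics f(x) = −x, g(x) = 1, horizon T = 1), not an example from the thesis:

```python
import math

def trajectory(u, x0=1.0, T=1.0, steps=20000):
    """Explicit Euler integration of dx/dt = -x + u(t), i.e. f(x) = -x, g(x) = 1."""
    h = T / steps
    x, xs = x0, [x0]
    for i in range(steps):
        x += h * (-x + u(i * h))
        xs.append(x)
    return xs

ref = trajectory(lambda t: 0.0)                    # trajectory of the limit input u = 0

def sup_dev(n):                                    # sup_t |x_n(t) - x(t)|
    xn = trajectory(lambda t: math.sin(n * t))
    return max(abs(a - b) for a, b in zip(xn, ref))

devs = [sup_dev(n) for n in (1, 10, 100)]          # uniform deviation shrinks with n
```

The supremum deviations decrease as n grows, although ‖un − 0‖∞ = 1 for every n: continuity holds for the weak-∗ topology on the inputs but not for the strong one.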
A.4 Bounds on a Gram Matrix
This section’s material is taken from [57], Chapter 6, Section 2.4.2. We present a lemma useful to investigate the properties of the Riccati matrix of extended Kalman filters, both in the continuous and continuous-discrete settings. This lemma is used in Appendix B.
Let Σ be a system¹² on R^n of the form
(Σ)  dx/dτ = A(u)x + Σ_{k,l=1; l≤k}^{n} u_{k,l} e_{k,l} x,   y = C(u)x,   (A.4)
where
− A(u) is an anti-shift matrix whose elements never equal zero and are bounded,
− C(u) = ( α(u) 0 … 0 ), where α never equals zero and is bounded,
− e_{i,j} is such that e_{i,j} x_k = δ_{jk} v_i, where {v_k} denotes the canonical basis of R^n.
The term on the right of the “+” sign is a lower triangular (n × n) matrix. For (u_{i,j}), a measurable bounded control function defined on [0,T], we denote by ψu(t, s) the associated resolvent matrix (see Section A.1 above). We define the Gram observability matrix of Σ by
Gu = ∫_0^T ψu′(v, T) C′ C ψu(v, T) dv.
The matrix Gu is symmetric and positive semi-definite. The system (A.4) is observable for all u(·) measurable and bounded.
¹² In order to make this system easier to understand, we only describe it as single output. The result of the lemma is, though, valid for multi-output systems.
Lemma 76
If a bound B is given on the controls u_{i,j}, then there exist two positive scalars 0 < α < β such that α Id ≤ Gu ≤ β Id for any measurable control u(·) bounded by B.
Appendix B<br />
Proof of Lemmas<br />
Contents<br />
B.1 Bounds on the Riccati Equation . . . . . . . . . . . . . . . . . . . 131<br />
B.1.1 Part One: the Upper Bound . . . . . . . . . . . . . . . . . . . . . . . 133<br />
B.1.2 Part Two: the Lower Bound . . . . . . . . . . . . . . . . . . . . . . 142<br />
B.2 Proofs of the Technical Lemmas . . . . . . . . . . . . . . . . . . . 151<br />
B.1 Bounds on the Riccati Equation<br />
This section deals with the properties of the Riccati matrix S. We consider the continuous-discrete framework. The proof follows the ideas of [57], where those properties are investigated in detail for continuous-time systems.
Lemma 77
Let us consider the prediction-correction equations:
dS̄/dτ = −( A(ū) + b̄(z̄, ū)/θ )′ S̄ − S̄ ( A(ū) + b̄(z̄, ū)/θ ) − S̄ Q S̄,
S̄k(+) = S̄k(−) + θ δt C′ R⁻¹ C,   (B.1)
with the notations of Section 5.2.1, and the set of assumptions:
− Q and R are fixed symmetric positive definite matrices,
− the functions ai(u(t)) and |b̃∗_{i,j}(z, u)| are smaller than aM > 0,
− ai(u(t)) ≥ am > 0,
− θ(0) = 1, and
− S(0) is a symmetric positive definite matrix taken in a compact set of the form a Id ≤ S(0) ≤ b Id.
Then, there exist a constant µ > 0 and two scalars 0 < α < β such that, for any subdivision {τi}_{i∈N} with τi − τi−1 ≤ µ, α Id ≤ S̄k(+) ≤ β Id for all k ∈ N.
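Before entering the proof, the mechanism of the lemma can be illustrated on a scalar analogue of (B.1), with θ frozen at 1 and arbitrary constants (this is a sketch, not the thesis’s matrix equation): the prediction flow pushes S down, the correction step pushes it up by θδt·c, and S remains trapped in a fixed interval.

```python
import math

def riccati_cd(S0, Q=1.0, c=2.0, dt=0.05, n_corr=400, substeps=20):
    """Scalar sketch of (B.1): prediction dS/dτ = -2 a(τ) S - Q S²,
    followed by the correction S ← S + dt * c, where c plays the role
    of C' R^{-1} C and a(τ) is a bounded positive coefficient."""
    S, tau, traj = S0, 0.0, [S0]
    h = dt / substeps
    for _ in range(n_corr):
        for _ in range(substeps):              # Euler on the prediction ODE
            a = 1.0 + 0.5 * math.sin(tau)      # a(τ) ∈ [0.5, 1.5]: a_m, a_M
            S += h * (-2.0 * a * S - Q * S * S)
            tau += h
        S += dt * c                            # correction step
        traj.append(S)
    return traj

traj = riccati_cd(5.0)
lo, hi = min(traj), max(traj)                   # 0 < α ≤ S ≤ β along the run
```

The trajectory stays bounded above by its initial value and bounded away from zero, which is the scalar shadow of α Id ≤ S̄k(+) ≤ β Id.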
− Sn is the set of (n × n) symmetric matrices having their values in R,
− Sn(+) is the set of positive definite matrices of Sn,
− Tr(S) denotes the trace of the matrix S,
− for any matrix S, |S| = √(Tr(S′S)) is the Frobenius norm; when S ∈ Sn, |S| = √(Tr(S²)),
− for any matrix S, ‖S‖₂ = sup_{‖x‖₂=1} ‖Sx‖₂ is the norm induced by the Euclidean norm; we also write it ‖S‖ by omission,
− we keep the τ time scale notation, but we use S instead of S̄ to ease the reading of the equations,
− τk = ∫_0^{kδt} θ(v) dv. Since θ is fixed during prediction periods, the time elapsed between two correction steps is (τk − τk−1) = θ_{k−1} δt,
− A stands for the matrix ( A(ū) + b̄(z̄, ū)/θ ); we omit to write the dependencies on u, z and θ.
Matrix facts
Notice that according to equation (B.1), if S(0) is symmetric then S(t) is symmetric.
1. On the Frobenius norm (cf. [67], Section 5.6):
(a) if U and V are orthogonal matrices then |UAV| = |A|,
(b) if A is symmetric semi positive then |A| = |DA|, where DA is the diagonal form of A,
(c) if A is as in (b), |A| = ( Σ λi² )^{1/2}, where λi ≥ 0 denotes the eigenvalues of A,
(d) ‖A‖₂ ≤ |A| ≤ √n ‖A‖₂,
(e) if A is symmetric, A ≤ ‖A‖₂ Id ≤ |A| Id.
2. On the trace of square matrices (cf. [67]):
(a) if A, B are (n × n) matrices, then |Tr(AB)| ≤ √(Tr(A′A)) √(Tr(B′B)),
(b) if S is (n × n) and symmetric semi positive, then Tr(S²) ≤ Tr(S)²,
(c) if S is as in (b), then Tr(S²) ≥ (1/n) Tr(S)²,
(d) if S is as in (b), then Tr(SQS) ≥ (q/n) Tr(S)², with q = min_{‖x‖=1} x′Qx, and Q symmetric positive definite,
(e) as a consequence of (b) and (c), if S is as in (b), and ‖·‖ denotes any norm on (n × n) matrices, then there exist l, m > 0 such that l ‖S‖ ≤ Tr(S) ≤ m ‖S‖.
3. On inequalities between semi positive matrices (cf. [67], Sections 7.7 and 7.8); A and B are symmetric in this paragraph:
(a) if A ≥ B ≥ 0, and U is orthogonal, then U′AU ≥ U′BU,
(b) if A and B are as in (a), and A is invertible, then ρ(BA⁻¹) ≤ 1, where ρ denotes the spectral radius,
(c) if A and B are as in (a), then λi(A) ≥ λi(B), where λi(M) denotes the i-th eigenvalue of the matrix M sorted in ascending order,
(d) if A and B are as in (a), and both are invertible, then B⁻¹ ≥ A⁻¹ ≥ 0,
(e) if A and B are as in (a), then det(A) ≥ det(B), and Tr(A) ≥ Tr(B),
(f) if A1 ≥ B1 ≥ 0 and A2 ≥ B2 ≥ 0, then A1 + A2 ≥ B1 + B2 ≥ 0.
4. From facts 1.(c) and 3.(c) we deduce that:
(A ≥ B ≥ 0) ⇒ (|A| ≥ |B| ≥ 0).
B.1.1 Part One: the Upper Bound
This first part of the proof is decomposed into three steps. Let T∗ be a fixed, positive scalar.
1. We prove that there is β1 such that for all τk ≤ T∗, k ∈ N, Sk(+) ≤ β1 Id,
2. we show that there exists β2 such that for all T∗ ≤ τk, k ∈ N, Sk(+) ≤ β2 Id,
3. we deduce the result for all times.
The first fact can be directly proven as follows.<br />
Lemma 78
Consider equation (B.1) and the assumptions of Lemma 77. Let T∗ > 0 be fixed. There exists β1 > 0 such that
Sk(+) ≤ β1 Id,
for all τk ≤ T∗, k ∈ N, independently from the subdivision {τi}_{i∈N}.
Proof.
For all τ ∈ [τk−1; τk], equation (B.1) gives:
S(τ) = Sk−1(+) + ∫_{τk−1}^{τ} (dS/dτ)(v) dv
= Sk−1(+) + ∫_{τk−1}^{τ} [ −( A(u) + b̃∗(z, u)/θ )′ S − S ( A(u) + b̃∗(z, u)/θ ) − SQS ] dv.
Since SQS is symmetric positive definite, we can write
S(τ) ≤ Sk−1(+) + ∫_{τk−1}^{τ} [ −( A(u) + b̃∗(z, u)/θ )′ S − S ( A(u) + b̃∗(z, u)/θ ) ] dv,
and fact 4 gives us
|S(τ)| ≤ |Sk−1(+)| + ∫_{τk−1}^{τ} 2s |S| dv,   (B.2)
where AM = sup_{[0;T∗]} (|A(u(τ))|), |b̃∗(z, u)| ≤ Lb, and s = AM + Lb.
Gronwall’s lemma gives
|S(τ)| ≤ |Sk−1(+)| e^{2s(τ − τk−1)}.   (B.3)
Therefore, with c = |C′ R⁻¹ C|,
|Sk(+)| ≤ |Sk−1(+)| e^{2s(τk − τk−1)} + c (τk − τk−1).   (B.4)
Consider a subdivision {τk}_{k∈N} such that τ0 = 0. From equation (B.4):
|S1(+)| ≤ |S0| e^{2sτ1} + c τ1.
Since we set θ(0) = 1, then S̄0 = S0 and there is no ambiguity in the notation. We iterate to obtain
|S2(+)| ≤ |S0| e^{2sτ2} + c τ1 e^{2s(τ2 − τ1)} + c (τ2 − τ1),
and for all k ∈ N
|Sk(+)| ≤ |S0| e^{2sτk} + c Σ_{i=1}^{k} (τi − τi−1) e^{2s(τk − τi)} ≤ |S0| e^{2sτk} + c e^{2sτk} Σ_{i=1}^{k} (τi − τi−1) e^{−2sτi}.
Since e^{−2sτ} is a decreasing function of τ, the sum on the right hand side of the inequality is smaller than the integral of e^{−2sτ} over the interval [0, τk]. Therefore
|Sk(+)| ≤ |S0| e^{2sτk} + c e^{2sτk} ∫_0^{τk} e^{−2sτ} dτ ≤ |S0| e^{2sτk} + (c/2s) ( e^{2sτk} − 1 ).
From this last equation we conclude that for any subdivision and any k ∈ N such that τk ≤ T∗:
|Sk(+)| ≤ ( |S0| + c/(2s) ) e^{2sT∗} = β1.
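The recursion (B.4) and its closed-form consequence can be checked directly on numbers; the constants and the irregular subdivision below are arbitrary choices for illustration:

```python
import math

def check_bound(y0, s, c, deltas):
    """Iterate y_k = y_{k-1} e^{2 s Δ_k} + c Δ_k (inequality (B.4) taken with
    equality) and compare with the bound (y_0 + c/(2s)) e^{2 s τ_k} of Lemma 78."""
    y, tau, ok = y0, 0.0, True
    for d in deltas:
        y = y * math.exp(2.0 * s * d) + c * d
        tau += d
        ok = ok and (y <= (y0 + c / (2.0 * s)) * math.exp(2.0 * s * tau) + 1e-12)
    return y, ok

y_end, ok = check_bound(1.0, s=0.7, c=0.5, deltas=[0.1, 0.03, 0.2, 0.05, 0.12])
```

The flag stays true for any choice of step lengths, reflecting that β1 does not depend on the subdivision.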
In order to prove the result for times greater than T∗, consider a symmetric semi positive matrix S. We have S ≤ |S| Id = √(Tr(S²)) Id, and according to fact 2.(b): S ≤ Tr(S) Id. Therefore, investigations on the upper bound are done in the form of investigations on the trace of S.
Lemma 79
If S : [0,T[ → Sn is a solution to dS/dτ = −A′S − SA − SQS, then for almost all τ ∈ [0,T[,
(d/dτ) Tr(S) ≤ −a ( Tr(S(τ)) )² + 2b Tr(S(τ)),
where:
a = λmin(Q)/n,   b = sup_τ Tr( A′(τ) A(τ) )^{1/2}.
Proof.
(d/dτ) Tr(S(τ)) = −Tr(SQS) − Tr(A′S) − Tr(SA) ≤ −Tr(SQS) + 2 |Tr(A′S)|.
From the matrix facts: Tr(SQS) ≥ (q/n) ( Tr(S) )², with q = min_{‖x‖=1} x′Qx, and
|Tr(A′S)| ≤ √(Tr(A′A)) √(Tr(S′S)) ≤ Tr(A′A)^{1/2} ( Tr(S)² )^{1/2} ≤ ( sup_t Tr(A′A)^{1/2} ) Tr(S).
Therefore
(d/dτ) Tr(S) ≤ −a Tr(S)² + 2b Tr(S),
where a and b are as in the lemma.
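The inequality of Lemma 79 can be spot-checked on a concrete 2 × 2 example; the matrices below are arbitrary choices (S and Q symmetric positive definite, A unconstrained), and the check is purely numerical:

```python
import math

def tr(M): return M[0][0] + M[1][1]
def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]
def transpose(M): return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]

def lemma79_sides(S, Q, A):
    """Right-hand side of d/dτ Tr(S) = Tr(-SQS - A'S - SA) versus the bound
    -a Tr(S)² + 2b Tr(S), with n = 2, a = λ_min(Q)/n and b = Tr(A'A)^{1/2}."""
    lhs = -tr(mul(mul(S, Q), S)) - tr(mul(transpose(A), S)) - tr(mul(S, A))
    trQ = Q[0][0] + Q[1][1]
    detQ = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
    lam_min = trQ / 2.0 - math.sqrt(trQ * trQ / 4.0 - detQ)
    a, b = lam_min / 2.0, math.sqrt(tr(mul(transpose(A), A)))
    return lhs, -a * tr(S) ** 2 + 2.0 * b * tr(S)

S = [[2.0, 0.5], [0.5, 1.0]]     # symmetric positive definite
Q = [[1.0, 0.2], [0.2, 2.0]]     # symmetric positive definite
A = [[0.3, -1.0], [0.7, 0.1]]    # arbitrary
lhs, bound = lemma79_sides(S, Q, A)
```

By facts 2.(a), 2.(b) and 2.(d), the inequality lhs ≤ bound holds for every symmetric semi positive S, not only for this instance.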
Useful bounds for Tr(S(τ)) can be derived from Lemma 79. In the following, x stands for Tr(S).
Lemma 80
Let a, b be two positive constants. Let x : [0,T[ → R⁺ (possibly T = +∞) be an absolutely continuous function satisfying, for almost all 0 < τ < T,
ẋ(τ) ≤ −a x²(τ) + 2b x(τ).
The solution x(τ) is such that:
− x(τ) ≤ max( x(0), 2b/a ) for all τ ∈ [0,T[,
− if x(0) > 2b/a, then for all τ > 0 in [0,T[ we have the two inequalities:
x(τ) ≤ 2b/a + (2b/a) · 1/( e^{2bτ} − 1 ),
x(τ) ≤ 2b x0 e^{2bτ} / ( a x0 ( e^{2bτ} − 1 ) + 2b ).   (B.5)
Proof.
− If x(τ) ≤ 2b/a for all τ ∈ [0,T[, trivially we have x(τ) ≤ max( x(0), 2b/a ).
− Otherwise, we define
E = { τ ∈ [0,T[ : x(τ) > 2b/a },
which is a non empty set. Take any connected component ]α, β[ ⊂ E such that α > 0. There are two situations:
1. β = T, and x(β) ≥ 2b/a, x(α) = 2b/a,
2. β < T, and x(α) = x(β) = 2b/a (⋆).
Consider the set F = { τ ∈ ]α, β[ : ẋ(τ) > 0 }: it has positive measure. Suppose it is not the case: therefore x decreases from time α to time β. This implies that x(α) > x(β), which is in contradiction with (⋆). Therefore the measure of F is strictly positive.
However for any τ ∈ ]α, β[ ⊂ E we have x(τ) > 2b/a, hence −a x(τ)² + 2b x(τ) < 0. Then ẋ(τ) < 0 and τ ∉ F, which means that F is not of positive measure. This gives us a contradiction: either α = 0 or E = ∅ (first item at the beginning of the proof).
Consider α = 0: E = [0, τ1[, τ1 > 0 and x(τ) ≤ max( x(0), 2b/a ) for all τ ≥ τ1. For τ ∈ [0, τ1[:
ẋ(τ) ≤ a ( 2b/a − x(τ) ) x(τ) < 0,
which means that x(τ) is decreasing on [0, τ1[ ( x(τ) < x(0) ). If x0 > 2b/a, then x(τ) > 2b/a for τ ∈ [0, τ1[ as above. It means that:
ẋ ≤ a ( 2b/a − x(τ) ) x(τ).
We rewrite it:
−ẋ(τ) / [ a ( x − 2b/a ) x ] ≥ 1.
Consider
d[ ln( x / ( x − 2b/a ) ) ] = −(2b/a) dx / [ x ( x − 2b/a ) ],
or in other terms
(a/2b) (d/dτ)[ ln( x / ( x − 2b/a ) ) ] = −ẋ / [ ( x − 2b/a ) x ] ≥ a.
Integration with respect to time gives:
ln( x / ( x − 2b/a ) ) ≥ 2bτ + ln( x0 / ( x0 − 2b/a ) ).   (B.6)
Therefore the inequality (B.6) is available for all τ < τ1, since x0 > 2b/a and x(τ) > 2b/a on this interval.
Since x(τ) > 2b/a, equation (B.6) implies:
x / ( x − 2b/a ) ≥ e^{2bτ},
therefore
x ≥ ( x − 2b/a ) e^{2bτ},
which gives the first inequality:
x ≤ 2b/a + (2b/a) · 1/( e^{2bτ} − 1 ).
We obtain the second inequality simply from (B.6):
x(τ) ≤ 2b x0 e^{2bτ} / ( a x0 ( e^{2bτ} − 1 ) + 2b ).
Remark 81
1. The first inequality of (B.5) can be used for any initial value S(0), since it tends toward +∞ when τ → 0,
2. the second one requires some knowledge on S(0) in order to be useful,
3. the two bounds obtained are higher than 2b/a, therefore they are true for any τ ∈ [0,T].
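Both bounds of (B.5) can be verified numerically on the worst case, the ODE taken with equality; the constants below are arbitrary choices for illustration:

```python
import math

a, b, x0 = 1.0, 0.5, 5.0                  # 2b/a = 1 and x0 > 2b/a
h, ok1, ok2 = 1e-4, True, True
x, t = x0, 0.0
for _ in range(20000):                     # Euler integration of ẋ = -a x² + 2b x up to T = 2
    x += h * (-a * x * x + 2.0 * b * x)
    t += h
    e = math.exp(2.0 * b * t)
    bound1 = 2*b/a + (2*b/a) / (e - 1.0)                    # first bound of (B.5)
    bound2 = 2*b*x0*e / (a * x0 * (e - 1.0) + 2.0 * b)      # second bound of (B.5)
    ok1 = ok1 and x <= bound1 + 1e-6
    ok2 = ok2 and x <= bound2 + 1e-6
final = x
```

The second bound is in fact the exact solution of the comparison equation, so the trajectory hugs it while staying below the coarser, x0-independent first bound; both settle just above 2b/a = 1.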
Let us denote r = sup( Tr( C′ R⁻¹ C ) ). According to equation (B.1), the problem turns into proving that xk(+), the solution of:
dx/dτ = −a x² + 2b x,   xk(+) ≤ xk(−) + ( τk − τk−1 ) r,   (B.7)
is upper bounded for all T∗ ≤ τk, k ∈ N, independently from the subdivision {τi}, i ∈ N.
The bounds of Lemma 80 are valid on intervals of the form² ]τi−1, τi]. Let us find an expression that we can use in order to upper bound x at any time.
Lemma 82
The solution of (B.7) is such that:
x(τ) ≤ 2b/a + (2b/a) · 1/( e^{2bτ} − 1 ) + rτ,
for any τ > 0, before or after an update.
² Or of the form [τi−1, τi]; it depends on which bound one considers.
Proof.
The first bound of (B.5) gives:
x1(+) ≤ 2b/a + (2b/a) · 1/( e^{2bτ1} − 1 ) + rτ1,
and the second is rewritten:
x2(−) ≤ 2b/a + (2b/a) · ( x1(+) − 2b/a ) / [ x1(+) ( e^{2b(τ2−τ1)} − 1 ) + 2b/a ].
We want to replace x1(+) by the upper bound found above. Let us define the function
h(x) = 2b x e^{2bτ} / ( a x ( e^{2bτ} − 1 ) + 2b ).
Its derivative w.r.t. x is
h′(x) = [ e^{2bτ} 2b ( a x ( e^{2bτ} − 1 ) + 2b ) − a ( e^{2bτ} − 1 ) x e^{2bτ} 2b ] / [ a x ( e^{2bτ} − 1 ) + 2b ]² = e^{2bτ} (2b)² / [ a x ( e^{2bτ} − 1 ) + 2b ]².
It is positive for all τ > 0, and we can therefore replace x1(+) by its upper bound:
x2(−) ≤ 2b/a + (2b/a) · [ (2b/a) · 1/( e^{2bτ1} − 1 ) + rτ1 ] / { [ 2b/a + (2b/a) · 1/( e^{2bτ1} − 1 ) + rτ1 ] ( e^{2b(τ2−τ1)} − 1 ) + 2b/a }.
We bound the denominator of the rτ1 term from below by 2b/a, and the denominator of the other term from below by
[ 2b/a + (2b/a) · 1/( e^{2bτ1} − 1 ) ] ( e^{2b(τ2−τ1)} − 1 ) + 2b/a.
We also simplify (2b/a) in those two terms:
x2(−) ≤ 2b/a + (2b/a) · [ 1/( e^{2bτ1} − 1 ) ] / { [ 1 + 1/( e^{2bτ1} − 1 ) ] ( e^{2b(τ2−τ1)} − 1 ) + 1 } + rτ1
= 2b/a + (2b/a) · 1/( e^{2bτ2} − e^{2bτ1} + e^{2bτ1} − 1 ) + rτ1
= 2b/a + (2b/a) · 1/( e^{2bτ2} − 1 ) + rτ1.
Thus we have:
x2(+) ≤ 2b/a + (2b/a) · 1/( e^{2bτ2} − 1 ) + rτ2.
This last inequality is of the same form as the first bound of (B.5), and is independent from the values of τ1 and τ2. We can therefore generalize it to any i ∈ N∗:
xi(+) ≤ 2b/a + (2b/a) · 1/( e^{2bτi} − 1 ) + rτi.
Moreover, we can generalize this inequality to any τ > 0, before or after the update (i.e. the correction step):
x(τ) ≤ 2b/a + (2b/a) · 1/( e^{2bτ} − 1 ) + rτ.
Remark 83
This last bound cannot be used to obtain an upper bound for x for all times τ ≥ T∗ because it is not a decreasing function of time (this is proven later on). In other words, suppose we have two times 0 < ξ1 < ξ2; then there exist two positive scalars β1 and β2 such that x(ξ1) < β1 and x(ξ2) < β2, but we do not know whether β2 is higher than β1.
In the following we will see that such a relation can be derived provided that the length between two samples is bounded.
Lemma 84
Let us define the functions
φ(τ) = 2b/a + (2b/a) · 1/( e^{2bτ} − 1 ) + rτ,
ψ_{x0}(τ) = 2b x0 e^{2bτ} / ( a x0 ( e^{2bτ} − 1 ) + 2b ) + rτ.
There exist µφ > 0 and µψ(x0) > 0 such that φ(τ), respectively ψ_{x0}(τ), is a decreasing function for τ ∈ ]0, µφ], respectively for τ ∈ [0, µψ(x0)].
Moreover, µψ(x0) is an increasing function of x0.
Proof.
The proof basically consists in the computation of the variation tables of the two functions.
1. A computation gives
φ′(τ) = [ ra ( e^{2bτ} )² − ( 4b² + 2ar ) e^{2bτ} + ra ] / [ a ( e^{2bτ} − 1 )² ].
Let us consider the polynomial
P(X) = ra X² − ( 4b² + 2ar ) X + ra.
Its discriminant ( (2b)² + 2ar )² − 4a²r² is positive. Therefore P(X) has two real roots whose product and sum are both positive. Thus both roots are positive. We denote them 0 < X∗ ≤ Xµ.
Those two roots are:
X∗ = [ ( 4b² + 2ar ) − √( ( 4b² + 2ar )² − 4a²r² ) ] / ( 2ar ) ≤ 1,
and
Xµ = [ ( 4b² + 2ar ) + √( ( 4b² + 2ar )² − 4a²r² ) ] / ( 2ar ) > 1.
We conclude that there exists µφ > 0 such that φ(τ) is a decreasing function over the interval ]0, µφ].
2. Let us remark first that ψ_{x0}(0) = x0. A few computations give us:
ψ′_{x0}(τ) = [ ra² x0² ( e^{2bτ} )² − x0 ( a x0 − 2b )( 4b² + 2ra ) e^{2bτ} + r ( a x0 − 2b )² ] / [ a x0 ( e^{2bτ} − 1 ) + 2b ]².
As before we consider the polynomial
P(X) = ra² x0² X² − x0 ( a x0 − 2b )( 4b² + 2ra ) X + r ( a x0 − 2b )².
Its discriminant is
x0² ( a x0 − 2b )² ( (2b)² + 2ra )² − 4 r² a² x0² ( a x0 − 2b )².
It is positive. Again both roots are positive, 0 < X∗ ≤ Xµ, and there exists³ µψ(x0) > 0, a function of x0, such that ψ_{x0}(τ) is a decreasing function over the interval [0, µψ(x0)].
In order to check that µψ(x0) increases with x0, we show that Xµ is an increasing function of x0:
Xµ = [ x0 ( a x0 − 2b )( 4b² + 2ar ) + √( x0² ( a x0 − 2b )² ( 4b² + 2ar )² − 4 r² a² x0² ( a x0 − 2b )² ) ] / ( 2 r a² x0² )
= [ ( a − 2b/x0 )( 4b² + 2ar ) + √( ( a − 2b/x0 )² ( 4b² + 2ar )² − 4 r² a² ( a − 2b/x0 )² ) ] / ( 2 r a² ) = ( A + √B ) / D.
³ Since the coefficient of the second order term of P(X) is positive, the polynomial is negative for X ∈ ]X∗, Xµ[. We only need to check that P(1) ≤ 0.
− from the definition of r, D > 0,
− (a − 2b/x0) is an increasing function of x0, so is A,
− B can be rewritten (a − 2b/x0)²[(4b² + 2ar)² − 4r²a²]; the right part is positive, and the left one is an increasing function of x0, so is B.

X^µ(x0) is therefore an increasing function of x0, and so is µψ(x0) (recall that P(X) was obtained from the change of variables X = e^{2bτ}).

Remark 85
1. Notice that since ψ(0) = x0, for all τ ∈ [0, µψ(x0)], ψ(τ) ≤ ψ(0) = x0.
2. Let x0* > 0 be fixed, and denote ψx0*(τ) the associated ψ function. It is a decreasing function over [0, µψ(x0*)]. The fact that µψ(x0) is an increasing function of x0 tells us that for any x0 > x0*, ψx0(τ) is a decreasing function of τ over the interval [0, µψ(x0*)] ⊂ [0, µψ(x0)].
In other words, ψx0(µψ(x0*)) < ψx0(0) = x0.

Lemma 86
Consider equation (B.1) and the assumptions of Lemma 77. Let T* > 0 be fixed. There exist two scalars β2 > 0 and µ > 0 such that

Sk(+) ≤ β2 Id,

for all T* ≤ τk, k ∈ N, for all subdivisions {τi}i∈N with τi − τi−1 ≤ µ.

Proof.
Let T* > 0 be arbitrarily fixed. We define B(T*) = 2b/a + (2b/a) · 1/(e^{2bT*} − 1) + rT*. From Lemma 82, x(T*) ≤ B(T*) independently from the subdivision {τi}i∈N.
Let us define µ as the maximum value such that the functions φ(τ) and ψB(T*)(τ) of Lemma 84 are both decreasing over the interval [0, µ]. Notice that µ depends on B(T*).
We claim that β2 = B(T*) + rµ solves the problem.
Let us consider a time subdivision {τi}i∈N such that τi − τi−1 ≤ µ for all i ∈ N.
We show first that if xk(+) ≤ β2, then xk+1(+) ≤ β2.
1. If xk(+) ≤ 2b/a then, from Lemma 80, xk+1(−) ≤ max(2b/a, xk(+)) = 2b/a. Therefore xk+1(+) ≤ 2b/a + rµ ≤ β2;
2. if xk(+) ≥ 2b/a and xk(+) ≤ B(T*), we use Lemma 80 again to deduce xk+1(−) ≤ max(2b/a, xk(+)) ≤ B(T*). Finally we have xk+1(+) ≤ B(T*) + rµ = β2;
3. if xk(+) ≥ 2b/a and xk(+) ≥ B(T*), then Lemma 80 tells us that xk+1(−) ≤ ψxk(τk+1 − τk). From Lemma 84, we know that it is a decreasing function, and according to Remark 85 we have xk+1(+) ≤ ψxk(0) ≤ β2.
Let us now define k such that τk−1 < T* ≤ τk. Since T* > 0, we have x(T*) ≤ B(T*) by definition. From equation (B.7) and Lemma 80:
− if x(T*) ≤ 2b/a, then xk(−) ≤ 2b/a ≤ B(T*),
− if x(T*) ≥ 2b/a, then dx/dτ is negative according to equation (B.7), x is decreasing and xk(−) ≤ B(T*).
In both cases xk(+) ≤ B(T*) + µr = β2.
We finally conclude that there are two positive constants β2 and µ such that Sk(+) ≤ β2 Id for all τk ≥ T*, k ∈ N. This holds for every subdivision {τi}i∈N such that τi − τi−1 ≤ µ, ∀i.
Remark 87
The constraint τi − τi−1 ≤ µ, interpreted in the original time scale, gives θiδt ≤ µ, ∀i ∈ N.
We conclude this subsection by giving the upper bound property for all times τ.
Lemma 88
Consider the prediction correction equation (B.1) and the hypothesis of Lemma 77. There exist two positive constants µ > 0 and β such that S(τ) ≤ βId, for all k ∈ N and all τ ∈ [τk, τk+1[, for every subdivision {τi}i∈N such that (τi − τi−1) ≤ µ.
This inequality is then also true in the t time scale.

Proof.
Let us define β̂ = max(β2, β1). Lemmas 86 and 78 give us Sk(+) ≤ β̂ Id, for all k ∈ N.
Then Lemmas 79 and 80 imply the existence of β̂* such that S(τ) ≤ β̂* Id for any τ ∈ [τk, τk+1[ and any k ∈ N.
We define β as the maximum of β̂* and β̂.
B.1.2 Part Two: the Lower Bound
As in the previous subsection, the proof is separated into three parts. Let T∗ be a fixed positive scalar, possibly distinct from T* (cf. the previous subsection).
1. We prove that there is α1 such that α1 Id ≤ Sk(+) for all k ∈ N with τk ≤ T∗,
2. we show that there exists α2 such that α2 Id ≤ Sk(+) for all k ∈ N with T∗ ≤ τk,
3. we give the result for all times.
As before, the first fact is proven rather directly.
Lemma 89
Consider equation (B.1) and the assumptions of Lemma 77. Let T∗ > 0 be fixed. There exists α1 > 0 such that

α1 Id ≤ Sk(+),

for all τk ≤ T∗, k ∈ N, independently from the subdivision {τi}i∈N.
Proof.
We denote P = S⁻¹. For all τ ∈ [τk−1, τk[, equation (B.1) gives:

P(τ) = Pk−1(+) + ∫_{τk−1}^{τ} (dP(v)/dτ) dv,

P(τ) = Pk−1(+) + ∫_{τk−1}^{τ} [ P (A(u) + b̃*(z, u)/θ)′ + (A(u) + b̃*(z, u)/θ) P + Q ] dv.

Computations performed as in Lemma 78 lead to

|Pk(−)| ≤ (|Pk−1(+)| + |Q|(τk − τk−1)) e^{2s(τk−τk−1)}      (B.8)

where s > 0 is as before. From (B.1) and Lemma 64:

Pk(+) = (Sk(−) + θδt C′r⁻¹C)⁻¹
      = Sk(−)⁻¹ (Sk(−)⁻¹ + θδt Sk(−)⁻¹ C′r⁻¹C Sk(−)⁻¹)⁻¹ Sk(−)⁻¹
      = Sk(−)⁻¹ [ Sk(−) − C′ (r/(θδt) + C Sk(−)⁻¹ C′)⁻¹ C ] Sk(−)⁻¹      (B.9)
      ≤ Sk(−)⁻¹ Sk(−) Sk(−)⁻¹
      ≤ Pk(−).

We consider a subdivision {τk}k∈N such that τ0 = 0. Since P̄0 = P0:

|P1(+)| ≤ (|P0| + |Q|τ1) e^{2sτ1}.

We iterate to obtain

|P2(+)| ≤ |P0| e^{2sτ2} + |Q| (τ1 e^{2sτ2} + (τ2 − τ1) e^{2s(τ2−τ1)}),

and for all k ∈ N

|Pk(+)| ≤ |P0| e^{2sτk} + |Q| Σ_{i=1}^{k} (τi − τi−1) e^{2s(τk−τi)}.      (B.10)
In the same manner as in Lemma 78, we conclude that for all τk ≤ T∗, k ∈ N:

|Pk(+)| ≤ (|P0| + |Q|/(2s)) e^{2sT∗} = 1/α1.

Therefore Pk(+) ≤ (1/α1) Id, for all τk ≤ T∗, k ∈ N, for all subdivisions {τi}i∈N.
Equivalently, from matrix fact 3.(c): α1 Id ≤ Sk(+).

We now proceed with the proof of the second part of this subsection. We need a different set of tools from those used in the preceding section: a series of lemmas adapted from [57]. As we will see, in addition to proving the existence of a lower bound for S, we also prove that it is a symmetric positive definite matrix for all times.
Lemma 90
For any λ ∈ R*, any solution S : [0, T[ → Sm (possibly T = +∞) of

dS/dτ = −A′(τ)S − SA(τ) − SQS

satisfies, for all τ ∈ [0, T[:

S(τ) = e^{−λτ} φu(τ, 0) S0 φ′u(τ, 0) + λ ∫₀^τ e^{−λ(τ−v)} φu(τ, v) [S(v) − S(v)QS(v)/λ] φ′u(τ, v) dv      (B.11)

where φu(τ, s) is such that:

dφu(τ, s)/dτ = −A′(τ) φu(τ, s),
φu(s, s) = Id.
Remark 91
Notice that S(τ) is not λ-dependent: this scalar is used only to provide a convenient way to express S(τ).
Proof.
Let us consider an equation of the form

dΛ(τ)/dτ = −A′Λ(τ) − Λ(τ)A + F(τ).      (B.12)

φu(τ, s) denotes the resolvent of the system dx/dτ = −A′x:

dφu(τ, s)/dτ = −A′ φu(τ, s),
φu(s, s) = Id.

We search for a solution of the form Λ(τ) = φu(τ, s) h(τ) φ′u(τ, s). Then

dΛ/dτ = (d/dτ φu(τ, s)) h(τ) φ′u(τ, s) + φu(τ, s) h(τ) (d/dτ φ′u(τ, s)) + φu(τ, s) (d/dτ h(τ)) φ′u(τ, s)
      = −A′ φu(τ, s) h(τ) φ′u(τ, s) − φu(τ, s) h(τ) φ′u(τ, s) A(τ) + φu(τ, s) (d/dτ h(τ)) φ′u(τ, s)
      = −A′ Λ(τ) − Λ(τ) A + φu(τ, s) (d/dτ h(τ)) φ′u(τ, s),
therefore φu(τ, s) (d/dτ h(τ)) φ′u(τ, s) = F(τ), and

h(τ) = h(0) + ∫₀^τ φu(s, v) F(v) φ′u(s, v) dv.

Since Λ(τ) = φu(τ, s) h(τ) φ′u(τ, s), we have φu(s, 0) Λ(0) φ′u(s, 0) = h(0). Therefore:

Λ(τ) = φu(τ, s) h(0) φ′u(τ, s) + φu(τ, s) [ ∫₀^τ φu(s, v) F(v) φ′u(s, v) dv ] φ′u(τ, s)
     = φu(τ, s) φu(s, 0) Λ(0) φ′u(s, 0) φ′u(τ, s) + ∫₀^τ φu(τ, v) F(v) φ′u(τ, v) dv
     = φu(τ, 0) Λ0 φ′u(τ, 0) + ∫₀^τ φu(τ, v) F(v) φ′u(τ, v) dv.
Let us take λ ∈ R* and define Ŝ = e^{λτ} S. The differential equation satisfied by Ŝ is

dŜ(τ)/dτ = λ e^{λτ} S + e^{λτ} dS/dτ = λ Ŝ(τ) − A′Ŝ(τ) − Ŝ(τ)A − e^{−λτ} Ŝ(τ)QŜ(τ).

This equation is of the form (B.12), with F(τ) = λŜ(τ) − e^{−λτ} ŜQŜ. According to the computation above, we have:

Ŝ(τ) = φu(τ, 0) Ŝ(0) φ′u(τ, 0) + ∫₀^τ φu(τ, v) [λŜ(v) − e^{−λv} ŜQŜ] φ′u(τ, v) dv,

and consequently

S(τ) = e^{−λτ} φu(τ, 0) S0 φ′u(τ, 0) + λ ∫₀^τ e^{−λ(τ−v)} φu(τ, v) [S − SQS/λ] φ′u(τ, v) dv.

Lemma 92
Let S : [0; e(S)[ → Sm be a maximal semi solution of

dS/dτ = −A′S − SA − SQS.

If S(0) = S0 is positive definite then

e(S) = +∞ and S(τ) is positive definite for all τ ≥ 0.

Thus, for any arbitrary time subdivision {τi}i∈N* the solution to the continuous discrete Riccati equation (B.1) is positive definite for all times provided that S0 is positive definite.
Proof.
Assume that S is not always positive definite. Let us define

θ = inf (τ | S(τ) ∉ Sm(+)).

In other words S(τ) ∈ Sm(+) for all τ ∈ [0; θ[. From Lemma 79:

d/dτ Tr(S) ≤ −a Tr(S)² + 2b Tr(S)

which, in combination with Lemma 80, gives Tr(S) ≤ max(Tr(S0), 2b/a) and

|S| = √(Tr(S²)) ≤ √(Tr(S)²) = Tr(S) ≤ max(Tr(S0), 2b/a).

Choose λ > |Q| max(Tr(S0), 2b/a) and apply Lemma 90:

S(τ) = e^{−λτ} φu(τ, 0) S0 φ′u(τ, 0) + λ ∫₀^τ e^{−λ(τ−v)} φu(τ, v) [S(v) − S(v)QS(v)/λ] φ′u(τ, v) dv.

Let τ be equal to θ; then S(θ) = (I) + (II) with

(I) = e^{−λθ} φu(θ, 0) S0 φ′u(θ, 0),
(II) = λ ∫₀^θ e^{−λ(θ−v)} φu(θ, v) [S − SQS/λ] φ′u(θ, v) dv.

(I) is positive definite.
(II) depends on S − SQS/λ, which we can rewrite √S (Id − √S Q √S / λ) √S. Since |S| ≤ max(Tr(S0), 2b/a) and λ > |Q| max(Tr(S0), 2b/a), the matrix Id − √S Q √S / λ is positive definite for 0 < τ < θ, so (II) is positive semidefinite. Hence S(θ) = (I) + (II) is positive definite, which contradicts the definition of θ. Therefore S(τ) remains positive definite as long as it exists, and since it is also bounded (Lemma 80), e(S) = +∞.

Lemma 93
Let m(τ) be a continuously differentiable n × n matrix valued function with bounded derivative, and let {τi}i∈{0,...,k} be a subdivision of [0, T] such that τi − τi−1 ≤ µ for all i. Then

∫₀^T m(v) dv − Σ_{i=1}^{k} m(τi)(τi − τi−1) ≤ (n/2) µ T (max_{k,l} sup |m′_{kl}|) Id.
Proof.
Let M(t) be a primitive matrix of m(t), that is to say a matrix whose elements are the primitives of the elements of m(t). We have the identity

∫₀^T m(v) dv = M(T) − M(0) = Σ_{i=1}^{k} [M(τi) − M(τi−1)].

We can apply the Taylor-Lagrange expansion to all elements M_{kl}:

M_{kl}(τi−1) = M_{kl}(τi) + (τi−1 − τi) m_{kl}(τi) + ((τi−1 − τi)²/2) m′_{kl}(ξ_{kl,i}),

where ξ_{kl,i} ∈ [τi−1, τi]. We thus have the relation

Σ_{i=1}^{k} M(τi−1) = Σ_{i=1}^{k} M(τi) + Σ_{i=1}^{k} m(τi)(τi−1 − τi) + Σ_{i=1}^{k} ((τi−1 − τi)²/2) Ri,

where (R_{kl})_i = m′_{kl}(ξ_{kl,i}), with ξ_{kl,i} ∈ [τi, τi−1]. Therefore

∫₀^T m(v) dv − Σ_{i=1}^{k} m(τi)(τi − τi−1) = Σ_{i=1}^{k} [M(τi) − M(τi−1)] − Σ_{i=1}^{k} (τi − τi−1) m(τi)
= Σ_{i=1}^{k} ((τi−1 − τi)²/2) Ri.

We now use the definition of the matrix inequality. Let x be a non zero element of Rⁿ:

x′ ( Σ_{i=1}^{k} ((τi−1 − τi)²/2) Ri ) x = (1/2) Σ_{i=1}^{k} (τi−1 − τi)² x′ Ri x
≤ (1/2) Σ_{i=1}^{k} ( (τi−1 − τi)² Σ_{k,l} |x_k| |R_{k,l}^i| |x_l| )
≤ (1/2) µ (max_{k,l,i} |R_{k,l}^i|) ( Σ_{i=1}^{k} (τi − τi−1) ) ( Σ_{k,l} |x_k| |x_l| )
≤ (1/2) µ (max_{k,l,i} |R_{k,l}^i|) T (1/2) Σ_{k,l} (|x_k|² + |x_l|²)
≤ (1/2) µ (max_{k,l,i} |R_{k,l}^i|) T (1/2) (2n ‖x‖²).

Thus giving the result.
Remark 94
Suppose that m = ψ′(v, T) C′C ψ(v, T) where ψ(v, T) is a resolvent matrix as in Lemma 90. From the assumptions we put on our continuous discrete system, and the fact that the observation matrix is fixed (i.e. not u-dependent), the derivative of such an m matrix is bounded.
If we want to consider C matrices defined with a function α1 as in the continuous case, this function must have a bounded time derivative. The consequence is that bang-bang like control inputs become inadmissible.
We have now gathered all the elements necessary to prove the second half of this subsection's result.
Lemma 95
Consider equation (B.1) and the assumptions of Lemma 77. Let T∗ > 0 be fixed. There exist α2 > 0 and µ > 0 such that for all subdivisions {τi}i∈N with τi − τi−1 ≤ µ,

α2 Id ≤ Sk(+) for all τk ≥ T∗, k ∈ N.⁴

Proof.
We apply Lemma 90 on each interval between sampling times, with λ > β|Q|, where β is the upper bound of Lemma 88:

S1(+) = e^{−λτ1} φu(τ1, 0) S0 φ′u(τ1, 0) + λ ∫₀^{τ1} e^{−λ(τ1−v)} φu(τ1, v) [S(v) − S(v)QS(v)/λ] φ′u(τ1, v) dv + C′R⁻¹C τ1.

In the same manner we obtain at time τ2:

S2(+) = e^{−λ(τ2−τ1)} φu(τ2, τ1) S1(+) φ′u(τ2, τ1) + λ ∫_{τ1}^{τ2} e^{−λ(τ2−v)} φu(τ2, v) [S(v) − S(v)QS(v)/λ] φ′u(τ2, v) dv + C′R⁻¹C (τ2 − τ1).

A few computations (see Appendix A.1 for the resolvent's properties) lead to:

S2(+) = e^{−λτ2} φu(τ2, 0) S0 φ′u(τ2, 0) + λ ∫₀^{τ2} e^{−λ(τ2−v)} φu(τ2, v) [S(v) − S(v)QS(v)/λ] φ′u(τ2, v) dv + e^{−λ(τ2−τ1)} φu(τ2, τ1) C′R⁻¹C φ′u(τ2, τ1) τ1 + C′R⁻¹C (τ2 − τ1).

⁴ The matrix inequality we want to achieve implies, in particular, the invertibility of the matrices Sk(+), k ∈ N. It implies the invertibility of the sum that appears in (B.13). This matrix cannot be inverted if there are fewer than n summation terms.
As is explained at the end of the proof, the result is achieved provided that the step size of the subdivision used in the summation is small enough. Since we have the freedom to make µ as small as we want, the summation mentioned above can be performed with sufficiently many points whatever the value of T∗. This makes the requirement T∗ ≥ nµ redundant.
Recall now that µ represents the maximum sampling time that allows us to use the observer. Lemma 89 provides us with a first value for µ, and it is pretty much system dependent. By removing the assumption, we probably have to shorten µ for the sake of mathematical elegance only. We think that the proof is better that way.
For any k, we compute Sk(+) iteratively:

Sk(+) = e^{−λτk} φu(τk, 0) S0 φ′u(τk, 0) + λ ∫₀^{τk} e^{−λ(τk−v)} φu(τk, v) [S(v) − S(v)QS(v)/λ] φ′u(τk, v) dv + Σ_{i=1}^{k} e^{−λ(τk−τi)} φu(τk, τi) C′R⁻¹C φ′u(τk, τi) (τi − τi−1).      (B.13)
We consider this last equation for any k ∈ N* such that τk ≥ T∗. It is of the form Sk(+) = (I) + (II) + (III), in the same order as before.
− Since S0 is positive definite, (I) is positive definite,
− from the definition of λ, [S(v) − S(v)QS(v)/λ] is positive definite, and the same goes for (II),
− let us define l < k as the maximal element of N such that τk − τl ≥ T∗. Since we consider (B.13) for times τk ≥ T∗, such an element always exists⁵. Note that because the subdivisions we use have a step less than µ, τk − τl ≤ T∗ + µ. All the terms of the sum (III) are symmetric positive definite matrices and thus

(III) ≥ Σ_{i=l+1}^{k} e^{−λ(τk−τi)} φu(τk, τi) C′R⁻¹C φ′u(τk, τi) (τi − τi−1).

From the properties of the resolvent, the right hand side expression can be rewritten, with ū(τ) = u(τ + τl):

Σ_{i=l+1}^{k} e^{−λ(τk−τi)} φū(τk − τl, τi − τl) C′R⁻¹C φ′ū(τk − τl, τi − τl) (τi − τi−1).      (B.14)

For all i ∈ {l+1, ..., k} we have e^{−λ(τk−τi)} ≥ e^{−λ(T∗+µ)}, and R⁻¹ is a fixed symmetric positive definite matrix. Therefore, in order to lower bound (B.14), we have to find a lower bound for

Σ_{i=l+1}^{k} φū(τk − τl, τi − τl) C′C φ′ū(τk − τl, τi − τl) (τi − τi−1).      (B.15)

Note that this sum is now computed for a subdivision of the interval [0, τk − τl] that has the very specific property T∗ ≤ τk − τl ≤ T∗ + µ. Let us redefine this subdivision as follows:
– let k* be equal to k − l; the subdivision has k* + 1 elements,
– we denote τ̃i = τi+l − τl, i.e. τ̃0 = 0 and τ̃k* = τk − τl.

⁵ It can happen that l = 0. Recall that τ0 = 0 for all subdivisions.
We can prove that (III) has a lower bound if we can prove that

Gū,d = Σ_{i=1}^{k*} φū(τ̃k*, τ̃i) C′C φ′ū(τ̃k*, τ̃i) (τ̃i − τ̃i−1)

is lower bounded, for all subdivisions {τ̃i}i∈{0,...,k*} such that τ̃i+1 − τ̃i ≤ µ and such that⁶ T∗ ≤ τ̃k* ≤ T∗ + µ.
Let us define ψū(t, s) = (φū⁻¹(t, s))′. Since φū is the resolvent of dx/dτ = −A′x, ψū is the resolvent of dx/dτ = Ax (cf. Appendix A.1).
Actually⁷, Gū,d is the Gram observability matrix of the continuous discrete version of a system of the form (A.4).
Let us denote Gū(T) the grammian of (A.4) when the integral is computed from 0 to T. Let us apply Lemma 76 to Gū(T∗). There is a > 0, independent from ū, such that

a Id ≤ Gū(T∗) ≤ Gū(τ̃k*).

From Lemma 93 and Remark 94 there is K > 0 such that

a Id ≤ Gū(τ̃k*) − Gū,d + Gū,d ≤ Gū,d + µK τ̃k* Id ≤ Gū,d + µK (T∗ + µ) Id.

Therefore

[a − µK (T∗ + µ)] Id ≤ Gū,d,

and µ can be shortened such that a − µK (T∗ + µ) > 0, independently from the subdivision {τ̃i}.
We conclude that there exists α2 > 0 such that α2 Id ≤ (III). Consequently α2 Id ≤ Sk(+), for all k such that τk ≥ T∗.
We finally state the equivalent of Lemma 88.

Lemma 96
Consider the prediction correction equation (B.1) and the hypothesis of Lemma 77. There exist two positive constants µ > 0 and α such that α Id ≤ S(τ), for all k ∈ N and all τ ∈ [τk, τk+1[, for all subdivisions {τi}i∈N such that (τi − τi−1) ≤ µ.
This inequality is then also true in the t time scale.

Lemma 77 is the sum of Lemmas 88 and 96.

⁶ This second hypothesis implies that k* + 1, the number of elements, can vary from one subdivision to the other.
⁷ Notice that with the ψū notation

Gū,d = Σ_{i=1}^{k*} ψ′ū(τ̃i, τ̃k*) C′C ψū(τ̃i, τ̃k*) (τ̃i − τ̃i−1).
B.2 Proofs of the Technical Lemmas
Lemma 97 ([38]) Let {x(t) > 0, t ≥ 0} ⊂ R be absolutely continuous and satisfy:

dx(t)/dt ≤ −k1 x + k2 x √x,

for almost all t > 0, with k1, k2 > 0. Then, if x(0) < k1²/(4k2²), x(t) ≤ 4 x(0) e^{−k1 t}.
Proof.
We make three successive changes of variables: y = √x, z = 1/y and w(t) = e^{−k1t/2} z(t). Then all the quantities y(t), z(t), w(t) are positive and absolutely continuous on any finite time interval [0, T]. We have

ẏ = ẋ/(2√x) ≤ −(k1/2) y + (k2/2) y²,
ż = −ẏ/y² ≥ (k1/2) z − k2/2,
ẇ = −(k1/2) e^{−k1t/2} z(t) + e^{−k1t/2} ż ≥ −(k2/2) e^{−k1t/2}.

Moreover, w(0) = 1/√x(0). Then, for almost all t > 0,

w(t) ≥ 1/√x(0) − (k2/k1)(1 − e^{−k1t/2}).

If 1/√x(0) − k2/k1 > 0, then w(t) > 0 and we can go backwards in the previous inequalities:

z(t) ≥ e^{k1t/2} (1/√x(0) − k2/k1) + k2/k1,
y(t) ≤ 1 / [e^{k1t/2} (1/√x(0) − k2/k1) + k2/k1],
x(t) ≤ x(0) e^{−k1t} / (1 − √x(0) k2/k1)².

Hence, if x(0) ≤ k1²/(4k2²), or 1 − √x(0) k2/k1 ≥ 1/2, then:

x(t) ≤ 4 x(0) e^{−k1t}.
Lemma 98 ([38])
Consider b̃(z̃) − b̃(x̃) − b̃*(z̃)ε̃, which appears in inequality (3.12) (omitting to write u in b̃), and suppose θ ≥ 1. Then |b̃(z̃) − b̃(x̃) − b̃*(z̃)ε̃| ≤ K θ^{n−1} |ε̃|², for some K > 0.

Proof.
Let us denote ε = z − x and consider a smooth expression E(z, x) of the form:

E(z, x) = f(z) − f(x) − df(z) ε,
where f : R^p → R is compactly supported.
We have, for t > 0:

f(z − tε) = f(z) − Σ_{i=1}^{p} εi ∫₀^t (∂f/∂xi)(z − τε) dτ,

and:

(∂f/∂xi)(z − τε) = (∂f/∂xi)(z) − Σ_{j=1}^{p} εj ∫₀^τ (∂²f/∂xi∂xj)(z − θε) dθ.

Hence,

f(z − ε) = f(z) − Σ_{i=1}^{p} εi (∂f/∂xi)(z) + Σ_{i,j=1}^{p} εi εj ∫₀^1 ∫₀^τ (∂²f/∂xi∂xj)(z − θε) dθ dτ.

Since f is compactly supported, we get:

|f(z) − f(z − ε) − df(z) ε| ≤ (M/2) Σ_{i,j=1}^{p} |εi εj|,

where M = sup_x |(∂²f/∂xi∂xj)(x)|.
Now we take f = b̃k, and we use the facts that b̃k depends only on x1, ..., xk, and that θ ≥ 1:

|(∂²b̃k/∂xi∂xj)(x)| ≤ θ^{k−1} |(∂²bk/∂xi∂xj)(Δ⁻¹x)|.

This gives the result.
Appendix C

Source Code for Realtime Implementation

Contents
C.1 Replacement Code for the File: rtai4_comedi_datain.sci . . . 154
C.2 Computational Function for the Simulation of the DC Machine . . . 155
C.3 AEKF Computational Functions C Code . . . 157
C.4 Innovation Computational Functions C Code . . . 159
C.5 Ornstein-Uhlenbeck Process . . . 162
C.1 Replacement Code for the File: rtai4_comedi_datain.sci
function [x, y, typ] = rtai4_comedi_datain(job, arg1, arg2)
  x = []; y = []; typ = [];
  select job
  case 'plot' then
    exprs = arg1.graphics.exprs;
    ch = exprs(1)
    name = exprs(2)
    standard_draw(arg1)
  case 'getinputs' then
    [x, y, typ] = standard_inputs(arg1)
  case 'getoutputs' then
    [x, y, typ] = standard_outputs(arg1)
  case 'getorigin' then
    [x, y] = standard_origin(arg1)
  case 'set' then
    x = arg1
    model = arg1.model; graphics = arg1.graphics;
    exprs = graphics.exprs;
    while %t do
      [ok, ch, name, range, aref, exprs] = getvalue('Set RTAI-..
          COMEDI DATA block parameters', ['Channel:'; 'Device:..
          '; 'Range:'; 'Aref:'], list('vec', -1, 'str', 1, 'vec'..
          , -1, 'vec', -1), exprs)
      if ~ok then break, end
      if exists('outport') then out = ones(outport, 1), in = [],
      else out = 1, in = [], end
      [model, graphics, ok] = check_io(model, graphics, in, out ..
          , 1, [])
      if ok then
        graphics.exprs = exprs;
        model.ipar = [ch;
                      range;
                      aref;
                      length(name);
                      ascii(name)'];
        model.rpar = [];
        model.dstate = [1];
        x.graphics = graphics; x.model = model
        break
      end
    end
  case 'define' then
    ch = 0
    name = 'comedi0'
    range = 0
    aref = 0
    model = scicos_model()
    model.sim = list('rt_comedi_datain', 4)
    if exists('outport') then model.out = ones(outport, 1), ..
        model.in = [],
    else model.out = 1, model.in = [], end
    model.evtin = 1
    model.rpar = []
    model.ipar = [ch;
                  range;
                  aref;
                  length(name);
                  ascii(name)']
    model.dstate = [1];
    model.blocktype = 'd'
    model.dep_ut = [%t %f]
    exprs = [sci2exp(ch), name, sci2exp(range), sci2exp(aref)]
    gr_i = ['xstringb(orig(1),orig(2),[''COMEDI A/D'';name+'' ..
        CH-''+string(ch)],sz(1),sz(2),''fill'');']
    x = standard_define([3 2], model, exprs, gr_i)
  end
endfunction
C.2 Computational Function for the Simulation of the DC Machine

This first piece of code is given only to provide a complete picture with regard to the implementation; the simulation itself presents no difficulties. Let us remark that an alternative to defining the parameters in the code itself is to pass them through the real parameter vector entry of the interfacing function, see [41].
function name             DCmachine   number of zero crossing surfaces   0
implicit                  n           initial discrete state             []
input port size           2           real parameter vector              []
output port size          2           integer parameter vector           []
input event port size     []          initial firing vector              []
output event port size    []          direct feed through                y
initial continuous state  ⊛           time dependence                    y

Table C.1: Arguments of the DC Simulation.
⊛ = [I(0), ωr(0)].
#include "scicos_block4.h"
#include <math.h>
#include <stdio.h>
void DCmachine(scicos_block *block, int flag)
{ /* this is the list of the fields we have to use:
     int block->nevprt;
     int block->nz;
     double *block->z;
     int block->nx;
     double *block->x;
     double *block->xd;
     double *block->res;
     int block->nin;
     int *block->insz;
     double **block->inptr;
     int block->nout;
     int *block->outsz;
     double **block->outptr;
     int block->nevout;
     int block->nrpar;
     double *block->rpar;
     int block->nipar;
     int *block->ipar;
     int block->ng;
     double *block->g;
     int *block->jroot;
     char block->label[41];
  */
  if (flag == 4) {        /* initialization */
    DCmachine_bloc_init(block, flag);
  } else if (flag == 1) { /* output computation */
    set_block_error(DCmachine_bloc_outputs(block, flag));
  } else if (flag == 0) { /* derivative or residual computation */
    set_block_error(DCmachine_bloc_deriv(block, flag));
  } else if (flag == 5) { /* ending */
    set_block_error(DCmachine_bloc_ending(block, flag));
  }
}

int DCmachine_bloc_init(scicos_block *block, int flag)
{ return 0; }

int DCmachine_bloc_outputs(scicos_block *block, int flag)
{ block->outptr[0][0] = block->x[0];
  block->outptr[0][1] = block->x[1];
  return 0; }

int DCmachine_bloc_deriv(scicos_block *block, int flag)
{ double L = 1.22, Res = 5.4183, Laf = 0.068;
  double J = 0.0044, B = 0.0026, p = 0;
  double I = block->x[0], wr = block->x[1];
  double V = block->inptr[0][0], Tl = block->inptr[0][1];
  block->xd[0] = (V - Res*I - Laf*wr*I)/L;
  block->xd[1] = (Laf*I*I - B*wr - p*pow(wr, 2.08) - Tl)/J;
  return 0; }
int DCmachine_bloc_ending(scicos_block *block, int flag)
{ return 0; }
C.3 AEKF Computational Functions C Code
In the following code, we used the notation:
− the array Ab[.] is composed of the elements of the matrix A(u) + b*(x, u),
− the variables of the form Pij are the elements of the matrix P,
− the tuning matrices of the EKF are set to r = 1 and Q = diag({q1, q2, q3}).
Those two matrices are organized as follows:

( Ab[0]  Ab[3]  0     )         ( P11  P21  P31 )
( Ab[1]  Ab[4]  Ab[7] )   and   ( P21  P22  P32 )
( Ab[2]  Ab[5]  Ab[8] )         ( P31  P32  P33 )
function name             AEKF   number of zero crossing surfaces   0
implicit                  n      initial discrete state             []
input port size           3      real parameter vector              []
output port size          4      integer parameter vector           []
input event port size     []     initial firing vector              []
output event port size    []     direct feed through                y
initial continuous state  ⊛      time dependence                    y

Table C.2: Arguments of the Main Function.
⊛ = [I(0); ωr(0); Tl(0); P(0); θ(0)]; the initial state is a dim 10 vector.
#include "scicos_block4.h"
#include <math.h>
#include <stdio.h>

/* input = [V; Y; I(t)] */
void AEKF(scicos_block *block, int flag)
{
    int n = 3;
    int N = n*(n+1)/2;
    if (flag == 4)
    { /* INITIALISATION */
      /* change of variables from original coordinates -> normal form coordinates */
tel-00559107, version 1 - 24 Jan 2011<br />
C.3 AEKF Computational Functions C Code<br />
block−>x [1]= block−>x [ 1 ] ∗ block−>x [ 0 ] ;<br />
block−>x [2]= block−>x [ 2 ] ∗ block−>x [ 0 ] ;<br />
}<br />
else if ( f l a g ==1)<br />
{/∗ OUTPUT ∗/<br />
/∗ change o f v a r i a b l e s , normal form c o o r d i n a t e s −> o r i g i n a l ..<br />
c o o r d i n a t e s ∗/<br />
block−>outptr [ 0 ] [ 0 ] = block−>x [ 0 ] ;<br />
block−>outptr [ 0 ] [ 1 ] = block−>x [ 1 ] / block−>x [ 0 ] ;<br />
/∗ Estimated t o r q u e ∗/<br />
block−>outptr [ 0 ] [ 2 ] = block−>x [ 2 ] / block−>x [ 0 ] ;<br />
/∗ High−<strong>gain</strong> parameter ∗/<br />
block−>outptr [ 0 ] [ 3 ] = block−>x [ n+N ] ;<br />
}<br />
else if ( f l a g ==0)<br />
{/∗ DERIVATIVE ∗/<br />
double p=0,K2=0.068 , J =0.0044 , B=0.0026;<br />
double L=1.22 , Res =5.4183 , K1=0.068;<br />
double DT=0.01 , theta1 =1.25 , lambda=100 , Beta =2000;<br />
double m1=0.005 ,m2=0.004 ,m=m1+m2;<br />
double V=block−>i n p t r [ 0 ] [ 0 ] ;<br />
double Y=block−>i n p t r [ 0 ] [ 1 ] ;<br />
double Inn=block−>i n p t r [ 0 ] [ 2 ] ;<br />
double SI , F0 ;<br />
double z1=block−>x [ 0 ] , z2=block−>x [ 1 ] ;<br />
double z3=block−>x [ 2 ] , t heta=block−>x [ n+N ] ;<br />
double e r r e u r=Y−z1 ;<br />
double Ab [ 9 ] ;<br />
double P11=block−>x [ 3 ] , P21=block−>x [ 4 ] , P31=block−>x [ 5 ] , P22=..<br />
block−>x [ 6 ] , P32=block−>x [ 7 ] , P33=block−>x [ 8 ] ;<br />
double q1=1, q2 =0.1 , q3 =0.01;<br />
if(Yxd [ 0 ] = (V−Res∗z1−K1∗ z2 ) /L+theta ∗ block−>x [ n+0]∗ e r r e u r ; ..<br />
// dz1 / dt =...+P11∗ e r r e u r<br />
block−>xd [ 1 ] = (V∗ z2 /z1−Res∗z2−K1∗ z2∗ z2 / z1 ) /L+(K2∗ z1∗ z1∗z1−B∗..<br />
z2−z3−p∗pow( z2 , 2 . 0 8 ) /pow( z1 , 1 . 0 8 ) ) /J+theta ∗ block−>x [ n..<br />
+1]∗ e r r e u r ; // dz2 / dt =...+P21∗ e r r e u r<br />
block−>xd [ 2 ] = (V∗ z3 /z1−Res∗z3−K1∗ z2∗ z3 / z1 ) /L+theta ∗ block−>x [ n..<br />
+2]∗ e r r e u r ; // dz3 / dt =...+P31∗ e r r e u r<br />
/∗ D e f i n i t i o n o f t h e matrix Ab=(A( u )+b s t a r ) ∗/<br />
Ab[0]=−Res/L ; // b ( 1 , 1 )<br />
Ab[ 1 ] = (K1∗ z2∗z2−V∗ z2 ) /(L∗ z1∗ z1 ) +(3∗K2∗ z1∗ z1 +1.08∗p∗pow( z2 /z1 ..<br />
, 2 . 0 8 ) ) /J ; // b ( 2 , 1 )<br />
Ab[ 2 ] = (K1∗ z2∗z3−V∗ z3 ) /(L∗ z1∗ z1 ) ; // b ( 3 , 1 )<br />
Ab[3]=−K1/L ;<br />
158
tel-00559107, version 1 - 24 Jan 2011<br />
C.4 Innovation Computational Functions C Code<br />
Ab[ 4 ] = ( (V−2∗K1∗ z2 ) /z1−Res ) /L−(B+2.08∗p∗pow( z2 /z1 , 1 . 0 8 ) ) /J ; //..<br />
b ( 2 , 2 )<br />
Ab[5]=−K1∗ z3 /(L∗ z1 ) ; // b ( 3 , 2 )<br />
//Ab [ 6 ] = 0 ;<br />
Ab[7]= −1/J ;<br />
Ab[ 8 ] = ( (V−K1∗ z2 ) /z1−Res ) /L ; // b ( 3 , 3 )<br />
/∗ Computation o f t h e R i c c a t i e q u a t i o n ∗/<br />
block−>xd [ 3 ] = 2 ∗ (Ab [ 0 ] ∗ P11+Ab [ 3 ] ∗ P21 )−theta ∗P11∗P11+theta ∗q1 ;<br />
block−>xd [4]=Ab [ 1 ] ∗ P11+Ab [ 4 ] ∗ P21+Ab [ 7 ] ∗ P31+Ab [ 0 ] ∗ P21+Ab [ 3 ] ∗ ..<br />
P22−t heta ∗P11∗P21 ;<br />
block−>xd [5]=Ab [ 2 ] ∗ P11+Ab [ 5 ] ∗ P21+Ab [ 8 ] ∗ P31+Ab [ 0 ] ∗ P31+Ab [ 3 ] ∗ ..<br />
P32−t heta ∗P11∗P31 ;<br />
block−>xd [ 6 ] = 2 ∗ (Ab [ 1 ] ∗ P21+Ab [ 4 ] ∗ P22+Ab [ 7 ] ∗ P32 )−theta ∗P21∗P21..<br />
+pow( theta , 3 ) ∗q2 ;<br />
block−>xd [7]=Ab [ 2 ] ∗ P21+Ab [ 5 ] ∗ P22+Ab [ 8 ] ∗ P32+Ab [ 1 ] ∗ P31+Ab [ 4 ] ∗ ..<br />
P32+Ab [ 7 ] ∗ P33−t heta ∗P31∗P21 ;<br />
block−>xd [ 8 ] = 2 ∗ (Ab [ 2 ] ∗ P31+Ab [ 5 ] ∗ P32+Ab [ 8 ] ∗ P33 )−theta ∗P31∗P31..<br />
+pow( theta , 5 ) ∗q3 ;<br />
/∗ Computation o f t h e a d a p t a t i o n f u n c t i o n ∗/<br />
SI=1/(1+exp(−Beta ∗( Inn−m) ) ) ;<br />
F0=(thetaxd [9]= SI ∗F0+lambda∗(1− SI ) ∗(1− theta ) ; }<br />
e l s e i f ( f l a g ==5)<br />
{/∗ ENDING ∗/}<br />
}<br />
C.4 Innovation Computational Functions C Code<br />
In the code below, the vector containing the past values of the system output and the vector containing the past values of the system input are not updated at the same place.

Let us suppose that we want to compute Id(t). The computation is done via a discrete process with sample time δ. Therefore we need d/δ + 1 values of the output signal (i.e. y(t−d), y(t−d+δ), ..., y(t)), and d/δ values of the input signal¹ (i.e. u(t−d), ..., u(t−δ)) in order to compute a trajectory².
Therefore the computation of the innovation at time t requires:
− updating the vector of past values of y from [y(t−d−δ), ..., y(t−δ)] to [y(t−d), ..., y(t)],
− computing a prediction of the state trajectory with the help of the vectors [y(t−d), ..., y(t)] and [u(t−d), ..., u(t−δ)],
− updating the vector of past values of u from [u(t−d), ..., u(t−δ)] to [u(t−d+δ), ..., u(t)].

¹ Knowing u(t) is useless here since we do not want to predict any trajectory for times greater than t.
² Remember that the initial point of this prediction is the estimated state at time t − d. It is fed to the function via a delay block.
function name              innovation   number of zero crossing surfaces   0
implicit                   n            initial discrete state             ⊛
input port size            5            real parameter vector              []
output port size           1            integer parameter vector           []
input event port size      [1]          initial firing vector              []
output event port size     []           direct feed through                y
initial continuous state   []           time dependence                    n

Table C.3: Arguments of the Innovation Function.
⊛ = a null vector of length 2*(d/Dt)+2 (i.e. 22).

#include <scicos_block4.h>   /* the three header names were lost in the   */
#include <math.h>            /* extraction; scicos_block4.h and math.h assumed */

/* input = [Y; V; z(t-d)] */
void function(double *, double *);
void innovation(scicos_block *block, int flag)
{
  double d = 1, Dt = 0.1;
  int n = 10; /* (d/Dt); */ /* Auto conversion double -> int */
  int nz, i, j;
  /* Remark that there is NO TEST: Dt has to divide d!! */
  nz = 2*n + 2;
  if (flag == 4)
  { /* INITIALISATION */
    /* Already done. It is impossible to change the dimension of
       the state space from here */
  }
  else if (flag == 1 || flag == 6)
  { /* OUTPUT */
    block->outptr[0][0] = block->z[nz-1];
  }
  else if (flag == 2)
  { /* DERIVATIVE */
    /* delayed input (index 2,3,4) */
    double I, wr = block->inptr[0][3], Tl = block->inptr[0][4];
    if (block->inptr[0][2] > 0.5) { I = block->inptr[0][2]; }
    else { I = 0.5; }
    double Y = block->inptr[0][0], V = block->inptr[0][1];
    double x[3] = {I, I*wr, Tl*I}, xh[4] = {0, 0, 0, 0};
    /* those ZEROs will be redefined later on */
    double preInn[2] = {pow((block->z[0] - x[0]), 2), 0};
    double Inn = 0; /* this is the INNOVATION */
    double hh = Dt*0.5, h6 = Dt/6;
    double dx1[3], dx2[3], dx3[3], dx4[3];
    int neq = 3; /* number of ODEs to solve */
    /* Update of the output signal stack */
    for (i = 1; i < n+1; i++) { block->z[i-1] = block->z[i]; }
    block->z[n] = Y;
    /* (a loop over j involving the input signal stack block->z[n+j]
       was truncated in the extraction) */
    xh[3] = V;
    /* BEGIN RK4 */
    /* (the RK4 integration loop, the accumulation of the innovation and
       the update of the input signal stack were lost in the extraction) */
  }
}

/* Dynamics in normal form coordinates. The function header and the first
   lines were lost in the extraction; the declarations below are
   reconstructed from the constants used elsewhere in this appendix. */
void function(double *xc, double *xcdot)
{
  double L = 1.22, Res = 5.4183, K1 = 0.068, K2 = 0.068;
  double J = 0.0044, B = 0.0026, p = 0;
  double z1 = *(xc);
  double z2 = *(xc+1);
  double z3 = *(xc+2);
  double V = *(xc+3);
  *(xcdot)   = (V - Res*z1 - K1*z2)/L;
  *(xcdot+1) = (V*z2/z1 - Res*z2 - K1*z2*z2/z1)/L
             + (K2*z1*z1*z1 - B*z2 - z3 - p*pow(z2, 2.08)/pow(z1, 1.08))/J;
  *(xcdot+2) = (V*z3/z1 - Res*z3 - K1*z2*z3/z1)/L;
  return;
}
C.5 Ornstein-Uhlenbeck Process
The facts compiled in this section are mainly taken from the book [28] and the article [59]. A note from S. Finch ([1]) and the book [100] were also convenient sources of information. Introductory books on probability and stochastic processes can also be useful (e.g. [29, 49, 81])³.

A stochastic process represents the state of a system that depends both on time and on random events. It is represented as a collection of random variables indexed by time. We denote it {X_t : t ≥ 0}.

Definition 99

− A Brownian motion or Wiener process with variance parameter σ², starting at 0, is a stochastic process {B_t : t ≥ 0} taking values in R, and having the properties:
1. B_0 = 0,
2. for any t_1 < t_2 < ... < t_k, the increments B_{t_2} − B_{t_1}, ..., B_{t_k} − B_{t_{k−1}} are independent,
3. for s < t, the increment B_t − B_s is a gaussian variable with zero mean and variance σ²(t − s).

− The Ornstein-Uhlenbeck process, with positive constant parameters ρ and α, is the solution of the stochastic differential equation

$$dX_t = -\rho X_t\, dt + \alpha\, dB_t. \qquad (C.1)$$
Figure C.1: Simulation of colored noise via block-wise programming. A normally distributed random generator N(0, 1) produces z_n, which feeds the update block X_n = µX_{n−1} + σ√(1 − µ²) z_n; its output X_n is the colored noise.
The law of the random variable at time 0 is supposed to be the invariant probability associated to (C.1), such that X_t is a stationary process. Equation (C.1) can be rewritten

$$d\left(e^{\rho t} X_t\right) = \alpha e^{\rho t}\, dB_t,$$

and then

$$X_t = e^{-\rho t}\left(X_0 + \alpha \int_0^t e^{\rho s}\, dB_s\right).$$
The stochastic process X_t is such that [28]:
1. If X_0 is a gaussian variable with zero mean and variance equal to α²/(2ρ), then X(t) is gaussian and stationary with covariance
$$E[X(t)X(s + t)] = \frac{\alpha^2}{2\rho}\, e^{-\rho|s|}.$$
2. X(t) is a markovian process.
3. When X(0) = c, the law of X(t) is a normal law with mean e^(−ρt)c and variance (α²/(2ρ))(1 − e^(−2ρt)).
As explained in [59], Part 3, equation (3.15), for a sample time ∆t, the exact updating formula of the Ornstein-Uhlenbeck process is given by

$$X(t + \Delta t) = X(t)\mu + \gamma z_n \qquad (C.2)$$

where:
− µ = e^(−ρ∆t),
− γ² = (1 − e^(−2ρ∆t)) α²/(2ρ),
− z_n is the realization of a standard gaussian distribution (i.e. N(0, 1)).
Let us now rewrite equation (C.1) as in [100], with the help of the positive constant parameters β and σ:

$$dX_t = -\beta X_t\, dt + \sigma\sqrt{2\beta}\, dB_t.$$

Therefore, with X(0) = 0, the mean value of X_t is 0 and the term α²/(2ρ) of the variance becomes σ².
In order to simulate the Ornstein-Uhlenbeck process, we consider equation (C.2) and replace:
− α by σ√(2β),
− ρ by β.
The update equation is (with µ = e^(−β∆t))

$$X(t + \Delta t) = X(t)\mu + \sigma\sqrt{1 - \mu^2}\, z_n.$$
The actual simulation is done according to the diagram of Figure C.1, with X(0) = 0, µ ∈ ]0; 1[ and σ > 0. Remark that σ is the asymptotic standard deviation of the variables X(t), t > 0.

³ An important theorem due to J. L. Doob [48] ensures that such a process necessarily satisfies a linear stochastic differential equation identical to the one used in the definition above.
References<br />
[1] http://algo.inria.fr/bsolve/, 2010. 162
[2] http://maxima.sourceforge.net/, 2010. 87
[3] http://www.lar.deis.unibo.it/people/gpalli, 2010. 78
[4] www.comedi.org, 2010. 77
[5] www.rtai.org, 2010. 77
[6] www.scicoslab.org, 2010. 77
[7] www.scicos.org, 2010.
[8] www.scilab.org, 2010. 77
[9] R. Abraham and J. Robbin. Transversal Mappings and Flows. W. A. Benjamin, Inc., 1967. 13
[10] I. Abuhadrous. Onboard real time system for 3D localization and modelling by multisensor data fusion. PhD thesis, Mines ParisTech (ENSMP), 2005. 103
[11] J. H. Ahrens and H. K. Khalil. High-gain observers in the presence of measurement noise: A switched-gain approach. Automatica, 45:936–943, 2009. 27, 29, 116
[12] M. Alamir. Nonlinear Observers and Applications, volume 363 of LNCIS, chapter Nonlinear Moving Horizon Observers: Theory and Real-Time Implementation. Springer, 2007. 3, 40
[13] H. Souley Ali, M. Zasadzinski, H. Rafaralahy, and M. Darouach. Robust H∞ reduced order filtering for uncertain bilinear systems. Automatica, 42:405–415, 2005. 32
[14] S. Ammar. Observability and observers under sampling. International Journal of Control, 9:1039–1045, 2006. 104
[15] S. Ammar and J-C. Vivalda. On the preservation of observability under sampling. Systems and Control Letters, 52:7–15, 2003. 103, 104
[16] V. Andrieu, L. Praly, and A. Astolfi. High-gain observers with updated gain and homogeneous correction terms. Automatica, 45(2), 2009. 25, 26
[17] J. S. Baras, A. Bensoussan, and M. R. James. Dynamic observers as asymptotic limits of recursive filters: Special cases. SIAM Journal of Applied Mathematics, 48(5):1147–1158, 1988. 38
[18] Y. Becis-Aubry, M. Boutayeb, and M. Darouach. State estimation in the presence of bounded disturbances. Automatica, 44:1867–1873, 2008. 33
[19] G. Besançon. Nonlinear Observers and Applications, volume 363 of LNCIS, chapter An Overview on Observer Tools for Nonlinear Systems. Springer, 2007. 13, 93, 116
[20] G. Besançon, G. Bornard, and H. Hammouri. Observer synthesis for a class of nonlinear control systems. European Journal of Control, 2:176–192, 1996. 93
[21] N. Boizot and E. Busvelle. Nonlinear Observers and Applications, volume 363 of LNCIS, chapter Adaptive-gain Observers and Applications. Springer, 2007. xx, 2, 4, 7, 15, 71
[22] N. Boizot, E. Busvelle, and J-P. Gauthier. An adaptive high-gain observer for nonlinear systems. Automatica (accepted for publication). xx, 44, 71
[23] N. Boizot, E. Busvelle, and J-P. Gauthier. Adaptive-gain in extended Kalman filtering (oral presentation). In International Conference on Differential Equations and Topology, MSU, Moscow, Fed. of Russia, 2008. xx, 44
[24] N. Boizot, E. Busvelle, and J-P. Gauthier. Adaptive-gain extended Kalman filter: Extension to the continuous-discrete case. In Proceedings of the 10th European Control Conference (ECC09), 2009. 106
[25] N. Boizot, E. Busvelle, J-P. Gauthier, and J. Sachau. Adaptive-gain extended Kalman filter: Application to a series connected DC motor. In Conference on Systems and Control, Marrakech, Morocco, May 16-18 2007. xx, 76
[26] N. Boizot, E. Busvelle, and J. Sachau. High-gain observers and Kalman filtering in hard realtime. In 9th Real-Time Linux Workshop, Institute for Measurement Technology, Johannes Kepler University of Linz, November 2-4 2007. xx
[27] G. Bornard. Observability and high gain observer for multi-output systems. In 28th Summer School in Automatics, Gipsa Lab-Grenoble, France (www.gipsalab.inpg.fr/summerschool/session/s 28 uk.php), Lecture 3, 2007. 93
[28] N. Bouleau. Processus Stochastiques et Applications (Nouvelle Edition). Hermann, 2000. 162, 163
[29] N. Bouleau. Probabilités de l'ingénieur, Variables Aléatoires et Simulation (Nouvelle Edition). Hermann, 2002. 162
[30] M. Boutayeb and D. Aubry. A strong tracking extended Kalman observer for nonlinear discrete-time systems. IEEE Transactions on Automatic Control, 44(8):1550–1556, 1999. 32, 33, 34
[31] M. Boutayeb, H. Rafaralahy, and M. Darouach. Convergence analysis of the extended Kalman filter as an observer for nonlinear discrete-time systems. In Proceedings of the 34th Conference on Decision and Control, 1995. 32, 33
[32] K. J. Bradshaw, I. D. Reid, and D. W. Murray. The active recovery of 3D motion trajectories and their use in prediction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(3):219–233, 1997. 32
[33] H. Brezis. Analyse Fonctionnelle: Théorie et Applications. Dunod, 1999. 125, 126
[34] R. Bucher, S. Mannori, and T. Netter. RTAI-Lab tutorial: Scilab, Comedi, and real-time control, 2008. 77, 78, 79, 84
[35] E. Bullinger and F. Allgöwer. An adaptive high-gain observer for nonlinear systems. In Proceedings of the 36th Conference on Decision and Control, 1997. 24, 26
[36] E. Bullinger and F. Allgöwer. Adaptive λ-tracking for nonlinear higher relative degree systems. Automatica, 41:1191–1200, 2005. 24
[37] F. Busse and J. How. Demonstration of adaptive extended Kalman filter for low earth orbit formation estimation using CDGPS. In Institute of Navigation GPS Meeting, 2002. 22, 32
[38] E. Busvelle and J-P. Gauthier. High-Gain and Non High-Gain Observers for Nonlinear Systems. World Scientific, 2002. 11, 17, 18, 22, 35, 36, 38, 40, 49, 50, 52, 63, 65, 67, 102, 151
[39] E. Busvelle and J-P. Gauthier. On determining unknown functions in differential systems, with an application to biological reactors. COCV, 9:509, 2003.
[40] E. Busvelle and J-P. Gauthier. Observation and identification tools for nonlinear systems. Application to a fluid catalytic cracker. International Journal of Control, 78(3), 2005. xi, 11, 14, 15, 36
[41] S. L. Campbell, J-P. Chancelier, and R. Nikoukhah. Modeling and Simulation in Scilab/Scicos. Springer, 2006. 77, 86, 155
[42] H. Carvalho, P. Del Moral, A. Monin, and G. Salut. Optimal nonlinear filtering in GPS/INS integration. IEEE Transactions on Aerospace and Electronic Systems, 33(3), 1997. 7
[43] C. K. Chui and G. Chen. Kalman Filtering with Real-Time Applications, 3rd Edition. Springer, 1998. x, 5, 7, 30, 61
[44] P. Connolly. Hyperion prop talk. Technical report, http://aircraftworld.com/prod datasheets/hp/emeter/hp-proptalk.htm. 79, 81
[45] B. d'Andrea Novel and M. Cohen de Lara. Cours d'automatique : commande linéaire des systèmes dynamiques. Masson, 2000. 5, 7
[46] F. Deza, E. Busvelle, and J-P. Gauthier. Exponentially converging observers for distillation columns and internal stability of the dynamic output feedback. Chemical Engineering Science, 47(15/16), 1992. 20
[47] F. Deza, E. Busvelle, J-P. Gauthier, and D. Rakotopara. High-gain estimation for nonlinear systems. Systems and Control Letters, 18:295–299, 1992. x, 11, 20, 21, 22, 38, 40
[48] J. L. Doob. The Brownian movement and stochastic equations. Annals of Math., 43:351–369, 1942. 164
[49] R. Durrett. Stochastic Calculus, A Practical Introduction. CRC Press, 1996. 162
[50] F. Esfandiari and H. K. Khalil. Output feedback stabilization of fully linearizable systems. International Journal of Control, 56:1007–1037, 1992. 20, 28
[51] M. Farza, M. M'Saad, and L. Rossignol. Observer design for a class of MIMO nonlinear systems. Automatica, 40:135–143, 2004. 93
[52] J-P. Gauthier and G. Bornard. Observability for any u(t) of a class of nonlinear systems. IEEE Transactions on Automatic Control, 26(4):922–926, 1981. 11
[53] J-P. Gauthier, H. Hammouri, and I. Kupka. Observers for nonlinear systems. In IEEE CDC Conference, pages 1483–1489, 1991. 11
[54] J-P. Gauthier, H. Hammouri, and S. Othman. A simple observer for nonlinear systems. Applications to bioreactors. IEEE Transactions on Automatic Control, 37(6):875–880, 1992. x, 11, 20
[55] J-P. Gauthier and I. Kupka. Observability and observers for nonlinear systems. SIAM Journal on Control, 32(4):975–994, 1994.
[56] J-P. Gauthier and I. Kupka. Observability with more outputs than inputs. Mathematische Zeitschrift, 223:47–78, 1996. 11
[57] J-P. Gauthier and I. Kupka. Deterministic Observation Theory and Applications. Cambridge University Press, 2001. x, xi, 4, 11, 14, 15, 17, 21, 40, 42, 47, 93, 96, 98, 115, 116, 125, 128, 131, 144
[58] A. Gelb, editor. Applied Optimal Estimation. The MIT Press, 1974. x, xvii, 3, 5, 6, 20, 30
[59] D. T. Gillespie. Exact numerical simulation of the Ornstein-Uhlenbeck process and its integral. Physical Review E, 54(2):2084–2091, 1996. 162, 163
[60] M. S. Grewal, L. Weill, and A. P. Andrews. Global Positioning Systems, Inertial Navigation and Integration. John Wiley and Sons, 2007. 20, 22, 30
[61] L. Z. Guo and Q. M. Zhu. A fast convergent extended Kalman observer for nonlinear discrete-time systems. International Journal of Systems Science, 33(13):1051–1058, 2002. 34
[62] T. Hagglund. New Estimation Techniques for Adaptive Control. PhD thesis, Lund Institute of Technology, 1985. 31
[63] H. Hammouri. Nonlinear Observers and Applications, volume 363 of LNCIS, chapter Uniform Observability and Observer Synthesis. Springer, 2007. 93
[64] S. Haugwitz, P. Hagander, and T. Norén. Modeling and control of a novel heat exchange reactor, the open plate reactor. Control Engineering Practice, 15:779–792, 2007. xi, 22
[65] R. Hermann and A. J. Krener. Nonlinear controllability and observability. IEEE Transactions on Automatic Control, AC-22:728–740, 1977. 11
[66] M. W. Hirsch. Differential Topology. Springer-Verlag, 1972. 12
[67] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, 1985. 108, 132
[68] A. H. Jazwinski. Stochastic Processes and Filtering Theory. Dover Publications, Inc., 1970. x
[69] L. Jetto and S. Longhi. Development and experimental validation of an adaptive extended Kalman filter for the localization of mobile robots. IEEE Transactions on Robotics and Automation, 15(2):219–229, 1999. 32
[70] P. Jouan and J-P. Gauthier. Finite singularities of nonlinear systems. Output stabilization, observability and observers. Journal of Dynamical and Control Systems, 2(2):255–288, 1996. 11
[71] S. Julier and J. K. Uhlmann. A new extension of the Kalman filter to nonlinear systems. In Signal Processing, Sensor Fusion, and Target Recognition VI; Proceedings of the Conference, Orlando, pages 182–193, 1997. 116
[72] D. J. Jwo and F. Chang. Advanced Intelligent Computing Theories and Applications. With Aspects of Theoretical and Methodological Issues, chapter A Fuzzy Adaptive Fading Kalman Filter for GPS Navigation, pages 820–831. LNCIS. Springer, 2007. 23
[73] K. Radhakrishnan and A. C. Hindmarsh. Description and use of LSODE, the Livermore solver for ordinary differential equations. Technical Report UCRL-ID-113855, LLNL, 1993. 87
[74] T. Kailath. Linear Systems, volume xxi. Prentice-Hall, New Jersey, 1980. 3
[75] R. E. Kalman. A new approach to linear filtering and prediction problems. Transactions of the ASME - Journal of Basic Engineering, 82 (Series D):35–45, 1960. 5
[76] R. E. Kalman and B. S. Bucy. New results in linear filtering and prediction theory. Journal of Basic Engineering, 83:95–108, 1961. 5
[77] P. C. Krause, O. Wasynczuk, and S. D. Sudhoff. Analysis of Electric Machinery and Drive Systems, 2nd Edition. Wiley-Interscience, 2002. 57
[78] P. Krishnamurthy and F. Khorrami. Dynamic high-gain scaling: State and output feedback with application to systems with ISS appended dynamics driven by all states. IEEE Transactions on Automatic Control, 49(12):2219–2239, 2004. 25
[79] P. Krishnamurthy, F. Khorrami, and R. S. Chandra. Global high-gain based observer and backstepping controller for generalized output-feedback canonical form. IEEE Transactions on Automatic Control, 48(12), 2003. 25
[80] H. J. Kushner. Approximations to optimal nonlinear filters. IEEE Transactions on Automatic Control AC, 12(5), 1967. 7
[81] G. F. Lawler. Introduction to Stochastic Processes, Second Edition. Chapman and Hall/CRC, 2006. 162
[82] D. S. Lemons and A. Gythiel. Paul Langevin's 1908 paper "On the theory of Brownian motion". American Journal of Physics, 65(11), 1997. 162
[83] V. Lippiello, B. Siciliano, and L. Villani. Adaptive extended Kalman filtering for visual motion estimation of 3D objects. Control Engineering Practice, 15:123–134, 2007. 32
[84] F. L. Liu, M. Farza, M. M'Saad, and H. Hammouri. High gain observer based on coupled structures. In Conference on Systems and Control, Marrakech, Morocco, 2007. 93
[85] L. Ljung and T. Söderström. Theory and Practice of Recursive Identification. The MIT Press, 1983. 83
[86] D. G. Luenberger. Observers for multivariable systems. IEEE Transactions on Automatic Control, 11:190–197, 1966. 5
[87] P. Martin and P. Rouchon. Two remarks on induction motors. In Symposium on Control, Optimization and Supervision, 1996. 57
[88] The Mathworks, www.mathworks.com. Optimization Toolbox for Use with Matlab, User's Guide. 83
[89] P. S. Maybeck. Stochastic Models, Estimation, and Control, volume 1. Academic Press, 1979. 30
[90] P. S. Maybeck. Stochastic Models, Estimation, and Control, volume 2. Academic Press, 1982. 30, 32
[91] F. Mazenc and O. Bernard. Interval observers for planar systems with complex poles. In European Control Conference, 2009. 4
[92] R. K. Mehra. Approaches to adaptive filtering. IEEE Transactions on Automatic Control, 17(5):693–698, 1972. 30, 32
[93] S. Mehta and J. Chiasson. Nonlinear control of a series DC motor: Theory and experiment. IEEE Transactions on Industrial Electronics, 45(1), 1998. 57, 58
[94] C. Melchiori <strong>and</strong> G. Palli. A realtime simulation environment for rapid prototyping of<br />
digital control systems <strong>and</strong> education. Automazione e Strumentazione, Febbraio 2007.<br />
78<br />
[95] H. Michalska <strong>and</strong> D. Q. Mayne. Moving horizon observers <strong>and</strong> observer based control.<br />
IEEE Transactions on Automatic Control, 40:995–1006, 1995. 40<br />
[96] A. H. Mohamed <strong>and</strong> K. P. Schwarz. Adaptive kalman <strong>filter</strong>ing for INS/GPS. Journal<br />
of Geodesy, 73:193–203, 1999. 22, 30<br />
[97] L. La Moyne, L. L. Porter, <strong>and</strong> K. M. Passino. Genetic adaptive observers. Engineering<br />
Applications of Artificial intelligence, 8(3):261–269, 1995. 23<br />
[98] K. S. Narendra <strong>and</strong> A. M. Annaswamy. Stable Adaptive Systems. Prentice-Hall, New<br />
Jersey, 1989. 3<br />
[99] M. G. Pappas <strong>and</strong> J. E. Doss. Design of a parallel kalman <strong>filter</strong> with variable forgetting<br />
factors. In American Control Conference, pages 2368–2372, 1988. 30<br />
[100] E. Pardoux. Ecole d’été de Probabilités de Saint-Flour XIX, volume 1464 of LNM, chapter<br />
Filtrage non linéaire et Equations aux Dérivées Partielles Stochastiques Associées.<br />
Springer-Verlag, 1989. 29, 162, 164<br />
[101] J. Picard. Efficiency of the <strong>extended</strong> kalman <strong>filter</strong> for nonlinear systems with small<br />
noise. SIAM Journal of Applied Mathematics, 51(3):843–885, 1991. x, 7, 29, 38<br />
[102] S. H. Pourtakdoust <strong>and</strong> H. Ghanbarpour. An adaptive unscented kalman <strong>filter</strong> for<br />
quaternion-based orientation estimation in low-cost ahrs. International Journal of Aircraft<br />
Engineering <strong>and</strong> Aerospace Technology, 79(5):485–493, 2007. 32<br />
[103] L. Praly. Asymptotic stabilization via output feedback for lower triangular systems<br />
with output dependent incermental rate. IEEE Transactions on Automatic Control,<br />
48(6), 2003. 25<br />
[104] W. H. Press, S. A. Teukolsky, W. T. Vetterling, <strong>and</strong> B. P. Flannery. Numerical Recipes:<br />
The Art of Scientific Computing. Cambridge University Press, 2007. 83, 87<br />
[105] A. Rapaport <strong>and</strong> D.Dochain. Interval observers for biochemical processes with uncertain<br />
kinetics <strong>and</strong> inputs. Mathematical Biosciences, 193(2):235–253, 2005. 4<br />
[106] H. Reinhard. Equations Différentielles. Fondements et Applications. Gauthier-Villars, 1982. 122
[107] W. Rudin. Functional Analysis (2nd edition). McGraw-Hill Science Engineering, 1991. 125, 126
[108] S. Sarkka. On unscented Kalman filtering for state estimation of continuous-time nonlinear systems. IEEE Transactions on Automatic Control, 52(9):1631–1641, 2007. 7, 117
[109] S. Stubberud, R. Lobbia, and M. Owen. An adaptive extended Kalman filter using artificial neural networks. The International Journal on Smart Systems Design, 1:207–221, 1998. 23
[110] H. J. Sussmann. Single-input observability of continuous-time systems. Mathematical Systems Theory, 12:371–393, 1979. 11
[111] A. Tornambè. High-gain observers for nonlinear systems. International Journal of Systems Science, 13(4):1475–1489, 1992. 20, 23, 24
[112] J. K. Uhlmann. Simultaneous map building and localization for real-time applications. Technical report, University of Oxford, 1994. 117
[113] F. Viel. Stabilité des systèmes non linéaires contrôlés par retour d'état estimé. Application aux réacteurs de polymérisation et aux colonnes à distiller. PhD thesis, Université de Rouen, Mont-Saint-Aignan, France, 1994. 4, 22, 93
[114] B. Vik, A. Shiriaev, and T. I. Fossen. New Directions in Nonlinear Observer Design, volume 244 of LNICS, chapter Nonlinear Observer Design for Integration of DGPS and INS. Springer, 1999. 103
[115] K. C. Yu, N. R. Watson, and J. Arrillaga. An adaptive Kalman filter for dynamic harmonic state estimation and harmonic injection tracking. Transactions on Power Delivery, 20(2), 2005. 22, 23, 30
[116] A. Zemouche and M. Boutayeb. Observer synthesis method for a class of nonlinear discrete-time systems with extension to observer-based control. In Proceedings of the 17th World Congress of the International Federation of Automatic Control, 2008. 32
[117] A. Zemouche, M. Boutayeb, and G. Iulia Bara. Observer design for nonlinear systems: An approach based on the differential mean value theorem. In Proceedings of the 44th Conference on Decision and Control and the European Control Conference, 2005. 32