
PhD thesis - IAS

Order no.: 9910

THESIS

Submitted to obtain

THE DEGREE OF DOCTOR OF SCIENCE

OF UNIVERSITÉ PARIS-SUD XI

Speciality: Mathematics

by

Abed Bounemoura

Stabilité et instabilité des systèmes hamiltoniens presque-intégrables.

Defended on 22 September 2010 before the examination committee:

Mr. Patrick BERNARD

Mr. Håkan ELIASSON

Mr. Pierre LOCHAK

Mr. Jean-Pierre MARCO (thesis advisor)

Mr. Laurent NIEDERMAN (thesis advisor)

Mr. Jean-Christophe YOCCOZ (president of the jury)

Referees:

Mr. Patrick BERNARD

Mr. Vadim KALOSHIN


Thesis prepared at the

Département de Mathématiques d’Orsay

Laboratoire de Mathématiques (UMR 8628), Bât. 425

Université Paris-Sud 11

91 405 Orsay CEDEX




Stability and instability of near-integrable Hamiltonian systems.

Abstract

In this thesis, we study various questions concerning the stability and instability of near-integrable Hamiltonian systems. It is divided into four parts and eight chapters.

In the first part, we give an informal introduction to Hamiltonian systems and to the perturbation theory of integrable Hamiltonian systems in the first chapter, and then, in the second chapter, we state our results.

The second part is devoted to stability results. In the third chapter, we give a new proof of the exponential stability theorem of Nekhoroshev in the generic case for an analytic system. Our method uses only compositions of periodic averagings, and therefore it avoids the small divisors problem. Then, in the fourth chapter, we take advantage of this approach to obtain new results of exponential and super-exponential stability in the neighbourhood of elliptic fixed points, invariant Lagrangian quasi-periodic tori and more generally invariant linearly stable quasi-periodic tori which are isotropic and reducible. In the fifth chapter, for a quasi-convex integrable Hamiltonian system, we also prove a result of polynomial stability in the case where the system is only finitely differentiable.

The third part lies between stability and instability. In the sixth chapter, for a quasi-convex system which is analytic or Gevrey, we improve the stability exponent by studying the geometry of simple resonances. We thus obtain a time of stability which is closer to the known instability times, and which should certainly be optimal.

In the fourth part, we construct examples of unstable Hamiltonian systems. First, in the seventh chapter, we give a new example of an a priori unstable system which has a drifting orbit with an optimal time of instability. Our method uses the symbolic dynamics created by the transverse intersection between the stable and unstable manifolds of a normally hyperbolic invariant manifold. In the eighth and last chapter, we also construct an example of a near-integrable Hamiltonian system, for which the size of the perturbation goes to zero only when the number of degrees of freedom goes to infinity, and which has an orbit drifting in polynomial time. In particular, this gives a new constraint on the threshold of validity for exponential stability results.

The first part is written in French, and the others in English.

Keywords: Dynamical Systems, Hamiltonian Systems, Perturbation Theory, Nekhoroshev Theory, Arnold Diffusion.


Béton style ...<br />

Quelque soit l’âge, le sexe, le niveau d’étude<br />

La vie fait de nous ce qu’on est, il y’a pas de quoi garder rancune,<br />

Être déboussolé ou plein d’amertume,<br />

Tant qu’on a la volonté, la liberté de choisir, pas d’inquiétude,<br />

Nous apprenons jour et nuit la vie entre ciel et terre,<br />

On peut répondre non et oui,<br />

Aimer mère et père,<br />

Évaluer le pour et le contre,<br />

Se retrouver haut et bas,<br />

Perdre ou retrouver son compte lorsqu’on est entre le bien et le mal,<br />

Ce que la vie offre à la vie<br />

Ne nourrit pas tout le temps l’espoir,<br />

On croit aux amours, tolérances, douceurs et joies,<br />

Y’a aussi le côté morose,<br />

Haine, humeur noire, angoisse, fureur, crainte, tristesse, et désespoir,<br />

Chacun choisit ses valeurs, sa manière d’être et de voir,<br />

Son code moral, son mode de vie, son clan, son devoir,<br />

J’dis ces trucs, histoire de faire savoir<br />

Ce que c’est de ne savoir que faire à part se servir de son savoir.<br />

Le Rat Luciano, Fonky Family. Entre deux feux, Art de rue, 2001.


To my mother, to all my family


Acknowledgements

To my two excellent thesis advisors, Laurent Niederman and Jean-Pierre Marco, without whom this thesis would not exist. This page would be far too short for all the professional and personal thanks I owe them. It was an immense pleasure to work with them over the past three years, and I very much hope that this will continue.

To Patrick Bernard and Vadim Kaloshin, for the interest they took in this work as referees. To Pierre Lochak, for being at the source of most of the mathematics in this thesis. To Håkan Eliasson and Jean-Christophe Yoccoz, for the very great honour they do me by sitting on this jury.

To Frédéric Le Roux, for having introduced me, with great kindness and efficiency, to the world of mathematical research; I owe him entirely the book [Bou08], which grew out of a Master's internship carried out under his supervision and which I wrote during the first year of my thesis. Thanks also to Étienne Ghys for having proposed that this work be published.

To the Laboratoire de Mathématiques d’Orsay, to Frédéric Le Roux, François Béguin, Sylvain Crovisier, Pierre Pansu, David Harari and Frédéric Paulin, and also to the best secretaries in the world, Valérie Blandin-Lavigne and Martine Justin.

To the Institut de Mathématiques de Jussieu, to Jacques Féjoz, Laurent Lazzarini, Maylis Irigoyen and to all the participants of the seminar "Géométrie Hamiltonienne", to Marcelline Prosper-Cojande for her competence and her kindness.

To the Observatoire de Paris, to all the members of the ASD team of the IMCCE, with very special thanks to the "two Alains", Alain Albouy and Alain Chenciner, and also to Philippe Robutel, Jacques Laskar and David Sauzin.

To Laurent Niederman, Jean-Pierre Marco, Alain Chenciner, Alain Albouy, Frédéric Le Roux, Tere Seara, Ernest Fontich, Enrique Pujals and also Pierre Berger and Alfonso Sorrentino, for various kinds of support. To El Houcein El Abdalaoui, Jean-François Quint, Sylvain Crovisier and Andrea Venturelli for their invitations to "outside" seminars and their excellent welcome. To Bassam Fayad, whose Master's course on KAM theory steered me towards Hamiltonian systems. To Claudio Murolo, for having been the first to believe in my mathematical abilities and for having encouraged me along this path.

To all my friends, doctoral students and young doctors, whether they come from Orsay, from Chevaleret-Jussieu, from the Observatoire de Paris or from the four corners of the world. This is the difficult and unfair moment where I will not cite names even though I owe them a great deal; I was lucky to be surrounded by brilliant (apprentice) mathematicians to whom I could put my questions, as well as by a very great LaTeX expert. I will thank each of them in person.

To Mathematics for what it has brought me, namely a pleasant life over these past three years with the possibility of indulging my passion for travel. In return, I hope one day to be able to bring something to Mathematics.

To all my Parisian friends.

To all my mates from the neighbourhood, I do not forget where I come from.

To the one whose name I do not know; with a touch of optimism, she will know it one day.


Contents

I Introduction, results and questions

1 Introduction
1.1 Hamiltonian systems
1.2 Integrable Hamiltonian systems
1.3 Classical perturbation theory
1.4 Stability theorems
1.5 Examples of instability
1.6 Near a linearly stable invariant torus

2 Results and questions
2.1 Stability results
2.2 From stability to instability
2.3 Instability results

II Results of stability

3 Generic exponential stability without small divisors
3.1 Introduction
3.2 Statement of results
3.3 Proof of Theorem 3.6
3.A Proof of the normal form
3.B SDM functions

4 Generic super-exponential stability for invariant tori
4.1 Introduction and main results
4.2 Proof of Theorem 4.1 and Theorem 4.2
4.3 Further results and comments
4.A Generic assumptions

5 Polynomial stability for $C^k$ quasi-convex Hamiltonian systems
5.1 Introduction
5.2 Main result
5.3 Analytical part
5.4 Proof of Theorem 5.2

III From stability to instability

6 Improved exponential stability for quasi-convex Hamiltonian systems
6.1 Introduction
6.2 Main results
6.3 The analytic case
6.4 The Gevrey case

IV Results of instability

7 Optimal time of instability for a priori unstable Hamiltonian systems
7.1 Introduction
7.2 Main results
7.3 Construction of the perturbation
7.4 Construction of a symbolic dynamic
7.5 Construction of a pseudo-orbit
7.6 Proof of Theorem 7.2
7.A Time-energy coordinates for the pendulum

8 Time of instability for high-dimensional Hamiltonian systems
8.1 Introduction
8.2 Main result
8.3 Proof of Theorem 8.1
8.A Gevrey functions

References


Part One

Introduction, results and questions

Contents

1 Introduction
1.1 Hamiltonian systems
    1.1.1 Equations of classical mechanics
    1.1.2 Hamiltonian systems on a manifold
    1.1.3 Some general properties
1.2 Integrable Hamiltonian systems
    1.2.1 Structure of integrable systems
    1.2.2 Dynamics of integrable systems
1.3 Classical perturbation theory
    1.3.1 Averaging principle
    1.3.2 Normal form theory
1.4 Stability theorems
    1.4.1 KAM theory
    1.4.2 Nekhoroshev theory
1.5 Examples of instability
    1.5.1 Arnold's mechanism
    1.5.2 Simple resonances and multiple resonances
    1.5.3 Time of instability
1.6 Near a linearly stable invariant torus
    1.6.1 Near a quasi-periodic Lagrangian torus
    1.6.2 Near an elliptic fixed point

2 Results and questions
2.1 Stability results
    2.1.1 Generic case
    2.1.2 Quasi-convex case
2.2 From stability to instability
    2.2.1 Analytic case
    2.2.2 Gevrey case
2.3 Instability results
    2.3.1 A priori unstable case
    2.3.2 Non-perturbative case


1 Introduction

In this first part we give an informal introduction to Hamiltonian systems and to the perturbation theory of integrable Hamiltonian systems. The informed reader may skip directly to the next chapter, which presents the results of this thesis. More condensed and specialised reminders are also given later, in the introduction of each chapter. A general reference for this introduction is [AKN06].

1.1 Hamiltonian systems

We begin with some general reminders on Hamiltonian systems.

1.1.1 Equations of classical mechanics

1.1.1.1. The starting point of the theory of Hamiltonian systems, and more generally of the whole theory of dynamical systems, is the study of certain second-order differential equations of the form

\[ \ddot{q} = -\nabla V(q) \tag{1} \]

where $q \in \mathbb{R}^n$, $V : \mathbb{R}^n \to \mathbb{R}$ is a smooth function (of class at least $C^2$), and where the gradient $\nabla$ depends on a Euclidean structure on $\mathbb{R}^n$. These equations, which describe the evolution of a point mass (with position $q$ and unit mass) in a conservative force field (deriving from the potential $V$), are well known to model many problems from classical and celestial mechanics. Of course, except in exceptional cases, it is not possible to "integrate" these equations, and following Poincaré it has become customary to study the asymptotic behaviour of solutions.

This second-order equation on $\mathbb{R}^n$ (the configuration space) is easily transformed into a system of first-order equations on $\mathbb{R}^{2n}$ (the phase space) by introducing the velocity (or momentum) $p = \dot{q} \in \mathbb{R}^n$ as a new variable. Equation (1) is then equivalent to the system

\[ \begin{cases} \dot{q} = p \\ \dot{p} = -\nabla V(q). \end{cases} \tag{2} \]

A direct computation shows that the total energy of the system, that is, the function

\[ H(q, p) = \frac{1}{2} p^2 + V(q), \]

is constant along the evolution (indeed, by (2), $\frac{d}{dt} H = p \cdot \dot{p} + \nabla V(q) \cdot \dot{q} = 0$), and that the system can also be written

\[ \begin{cases} \dot{q} = \partial_p H(q, p) \\ \dot{p} = -\partial_q H(q, p). \end{cases} \tag{3} \]

This is the Hamiltonian formulation of equation (1), and the total energy $H$ is also called the Hamiltonian.
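As a quick sanity check of the passage from system (2) to system (3), the following minimal Python sketch approximates the partial derivatives of $H$ by central differences and compares them with the right-hand side of (2). The pendulum potential `V`, the step `h` and the sample point are assumptions chosen only for illustration; they do not come from the text.

```python
import math

def V(q):
    # Hypothetical potential chosen only for illustration (a pendulum).
    return -math.cos(q)

def H(q, p):
    # Total energy H(q, p) = p^2 / 2 + V(q) for one degree of freedom.
    return 0.5 * p * p + V(q)

def hamiltonian_field(H, q, p, h=1e-6):
    # Central finite differences approximating (dH/dp, -dH/dq),
    # i.e. the right-hand side of system (3).
    dH_dq = (H(q + h, p) - H(q - h, p)) / (2.0 * h)
    dH_dp = (H(q, p + h) - H(q, p - h)) / (2.0 * h)
    return dH_dp, -dH_dq

q, p = 0.7, -0.3
q_dot, p_dot = hamiltonian_field(H, q, p)

# System (2) predicts q_dot = p and p_dot = -V'(q) = -sin(q);
# both printed discrepancies should be tiny.
print(abs(q_dot - p), abs(p_dot + math.sin(q)))
```

Finite differences are used here only so that the check does not depend on a hand-computed derivative of $H$.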


In fact, these equations are of a more intrinsic nature. Indeed, if $J_0$ denotes the standard complex structure of $\mathbb{R}^{2n}$, that is,

\[ J_0 = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix} \in M_{2n}(\mathbb{R}) \]

in the canonical basis ($I_n$ being the identity matrix of size $n$), and if $\Omega_0$ is the alternating bilinear form defined by the matrix $J_0$, then the solutions of equations (3) are nothing but the integral curves of the vector field $X_H$ defined by

\[ d_{(q,p)}H.Y = \Omega_0(X_H(q, p), Y), \]

for every point $(q, p) \in \mathbb{R}^{2n}$ and every vector $Y \in \mathbb{R}^{2n}$. In terms of the interior product, this reads $i_{X_H}\Omega_0 = dH$. Note that this definition makes sense since $\Omega_0$ is non-degenerate and therefore induces an isomorphism between $\mathbb{R}^{2n}$ (or more precisely its tangent space) and its dual (its cotangent space). Thus, equations (3) can also be written

\[ (\dot{q}, \dot{p}) = X_H(q, p). \tag{4} \]

We say that $X_H$ is the Hamiltonian vector field, or symplectic gradient, associated with $H$. If $\nabla$ denotes the gradient with respect to the canonical Euclidean structure of $\mathbb{R}^{2n}$, we have the equality

\[ X_H = J_0 \nabla H \]

which can be interpreted geometrically as follows: a Hamiltonian vector field is a usual gradient field that has been "rotated by a quarter turn".
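The relations above can be checked on a small example. The sketch below builds $J_0$ for $n = 2$, verifies $J_0^2 = -I$, computes $X_H = J_0 \nabla H$ and tests the defining identity $dH.Y = \Omega_0(X_H, Y)$. It assumes the convention $\Omega_0(X, Y) = \langle X, J_0 Y \rangle$, which agrees with $\Omega_0 = dq \wedge dp$, and a quadratic Hamiltonian chosen only for illustration; the helper names are hypothetical.

```python
def make_J0(n):
    # Block matrix [[0, I_n], [-I_n, 0]] stored as a list of rows.
    size = 2 * n
    J = [[0.0] * size for _ in range(size)]
    for i in range(n):
        J[i][n + i] = 1.0
        J[n + i][i] = -1.0
    return J

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def omega0(X, Y, J):
    # Assumed convention: Omega_0(X, Y) = <X, J0 Y>, i.e. dq ^ dp on (X, Y).
    return dot(X, matvec(J, Y))

n = 2
J = make_J0(n)

# Illustrative Hamiltonian H(q, p) = (|q|^2 + |p|^2) / 2, whose gradient
# at z = (q1, q2, p1, p2) is z itself.
z = [1.0, 2.0, 3.0, 4.0]
grad_H = list(z)
X_H = matvec(J, grad_H)          # equals (p1, p2, -q1, -q2)

Y = [0.5, -1.0, 2.0, 0.25]       # arbitrary test vector
print(X_H, dot(grad_H, Y), omega0(X_H, Y, J))
```

The first half of `X_H` is $\partial_p H$ and the second half is $-\partial_q H$, which is exactly system (3).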

1.1.1.2. Let us now give the example of the famous N-body problem, which is the original motivation. We consider the motion of $N$ particles, with positions $q_1, \dots, q_N$ and masses $m_1, \dots, m_N$, evolving in a Euclidean space $E$ and subject to gravitational interaction: the attraction between two particles is proportional to the product of the masses and to the inverse of the square of their mutual distance. One may of course think of the solar system where, assuming that the planets are spherically symmetric, one can, following Newton, replace them by their centres of gravity and concentrate their masses there. Denoting by $G$ the gravitational constant, the equations read

\[ m_i \ddot{q}_i = -G \sum_{j \neq i} m_i m_j \frac{q_i - q_j}{\|q_i - q_j\|^3}, \]

and they are well defined on the "collision-free" open set

\[ \{ q = (q_1, \dots, q_N) \in E^N \mid q_i \neq q_j,\ i \neq j \}. \]

Introducing the potential

\[ U(q) = -G \sum_{1 \le i < j \le N} \frac{m_i m_j}{\|q_i - q_j\|}, \]

the equations take the form

\[ \ddot{q} = -\nabla U(q) \]

where the gradient is taken with respect to the Euclidean scalar product on $E^N$ weighted by the masses, that is, $q.q' = \sum_{i=1}^N m_i\, q_i.q'_i$. Finally, the Hamiltonian is given by

\[ H = \frac{1}{2} \sum_{i=1}^N m_i \|\dot{q}_i\|^2 - G \sum_{1 \le i < j \le N} \frac{m_i m_j}{\|q_i - q_j\|}. \]

It is well known, and we shall see it later, that these equations can be solved for $N = 2$, but that this is far from being the case for $N \geq 3$.
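To make the N-body equations concrete, here is a minimal Python sketch that evaluates the accelerations and the Hamiltonian for a given configuration. It assumes units in which $G = 1$ and sums the potential over unordered pairs $i < j$; the function names and the sample data are hypothetical and serve only as an illustration.

```python
import math

G = 1.0  # gravitational constant, in hypothetical units

def accelerations(masses, positions):
    # Acceleration of body i: -G * sum over j != i of m_j (q_i - q_j) / |q_i - q_j|^3.
    count = len(masses)
    dim = len(positions[0])
    acc = [[0.0] * dim for _ in range(count)]
    for i in range(count):
        for j in range(count):
            if i == j:
                continue
            r = math.dist(positions[i], positions[j])
            for k in range(dim):
                diff = positions[i][k] - positions[j][k]
                acc[i][k] -= G * masses[j] * diff / r ** 3
    return acc

def hamiltonian(masses, positions, velocities):
    # Kinetic energy plus potential energy, the latter summed over pairs i < j.
    kinetic = 0.5 * sum(m * sum(v * v for v in vel)
                        for m, vel in zip(masses, velocities))
    potential = 0.0
    for i in range(len(masses)):
        for j in range(i + 1, len(masses)):
            r = math.dist(positions[i], positions[j])
            potential -= G * masses[i] * masses[j] / r
    return kinetic + potential

masses = [1.0, 2.0, 3.0]
positions = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
acc = accelerations(masses, positions)

# Newton's third law: the mass-weighted accelerations sum to zero.
net_force = [sum(m * a[k] for m, a in zip(masses, acc)) for k in range(2)]
print(net_force)
```

The vanishing of `net_force` reflects the pairwise symmetry of the interaction, and the code is only defined on the collision-free set since it divides by the mutual distances.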

The study should now be extended in two directions. On the one hand, in concrete examples such as the previous one, the configuration space is never the whole of $\mathbb{R}^n$ but rather an open subset or a submanifold; one therefore needs to work on an abstract manifold. On the other hand, from a theoretical point of view, it is clearly preferable to consider an arbitrary Hamiltonian function.

1.1.2 Hamiltonian systems on a manifold

1.1.2.1. Hamiltonian systems on manifolds can be defined in two distinct settings: that of symplectic manifolds and the more general one of Poisson manifolds. We shall mostly be interested in the first, the most classical, and briefly mention the second.

Definition 1.1. Let $M$ be a smooth manifold. A symplectic structure on $M$ is a differential 2-form $\Omega$ which is:

- non-degenerate, that is, $\Omega_x$ is non-degenerate on $T_xM$ for every $x \in M$;

- closed, that is, $d\Omega = 0$.

The pair $(M, \Omega)$ is then called a symplectic manifold.

Since $\Omega$ is non-degenerate, one easily sees that $M$ is necessarily of even dimension $2n$, and that it is orientable ($\Omega^n = \Omega \wedge \cdots \wedge \Omega$ is nowhere vanishing). Under this same hypothesis, one can give the following definition.

Definition 1.2. Let $(M, \Omega)$ be a symplectic manifold and $H : M \to \mathbb{R}$ a smooth function. The Hamiltonian vector field $X_H$ is defined by $i_{X_H}\Omega = dH$, that is,

\[ d_xH.Y = \Omega_x(X_H(x), Y), \quad x \in M, \]

for every tangent vector $Y \in T_xM$.

Note that the definition of a Hamiltonian vector field uses only the non-degeneracy of $\Omega$; however, we shall see below why the closedness hypothesis on $\Omega$ is important.

On any symplectic manifold $(M, \Omega)$ one can find an almost complex structure $J$ (that is, an endomorphism of the tangent bundle whose square is minus the identity) and a Riemannian structure $g$ compatible with $\Omega$ in the sense that $g(X, Y) = \Omega(X, JY)$. Then the formula $X_H = J\nabla_g H$ still holds, where $\nabla_g$ is the gradient with respect to $g$.

1.1.2.2. Let us now give two examples of symplectic manifolds. The first, and the simplest, is that of a vector space $V$ equipped with a non-degenerate alternating bilinear form. It is not difficult to construct a basis of $V$ (called symplectic) in which the bilinear form is represented by the matrix $J_0$ defined above. In other words, every symplectic vector space is linearly isomorphic to $\mathbb{R}^{2n}$ equipped with its standard symplectic structure $\Omega_0$. For an arbitrary symplectic manifold, this result remains true locally by the classical theorem of Darboux: one can always find local coordinates $(q, p)$, called symplectic, in which the form $\Omega$ is given by

\[ \Omega = dq \wedge dp. \]

This is where the closedness hypothesis on $\Omega$ comes in (it is thus a "zero curvature" hypothesis), and without it explicit computations on an abstract symplectic manifold would not be possible.

The second, and the most important for our purposes, is that of the cotangent bundle of a manifold $N$: this is the phase space of the equations of classical mechanics. On $M = T^*N$ there is a canonical differential 1-form $\sigma$ which can be defined as follows: if $q = (q_1, \dots, q_n)$ are local coordinates on $N$ and $p = (p_1, \dots, p_n)$ the associated local coordinates on the fibre $T^*_qN$, then $\sigma = p\,dq$. One can check that this does not depend on the choice of coordinates (alternatively, $\sigma$ can be defined intrinsically, although this is not very instructive), and that $\sigma$ satisfies the identity $\alpha^*\sigma = \alpha$ for every 1-form $\alpha$ on $N$, viewed as a section $\alpha : N \to T^*N$. The canonical symplectic structure of $T^*N$ is then defined by $\Omega_0 = -d\sigma$, that is, $\Omega_0 = dq \wedge dp$ in coordinates. Note that cotangent bundles are never compact, and that they have the additional, immediate property of being exact symplectic manifolds, meaning that their symplectic form admits a primitive.

1.1.2.3. To conclude, let us mention the setting of Poisson manifolds, which generalises that of symplectic manifolds in the sense that the closed form $\Omega$ is allowed to "degenerate". Let us begin with a simpler definition.

Definition 1.3. A Poisson structure on $M$ is a map

\[ \{., .\} : C^\infty(M) \times C^\infty(M) \longrightarrow C^\infty(M) \]

which is bilinear, antisymmetric and satisfies, for all $f, g, h \in C^\infty(M)$, the relations

- $\{f, \{g, h\}\} + \{g, \{h, f\}\} + \{h, \{f, g\}\} = 0$;

- $\{fg, h\} = f\{g, h\} + \{f, h\}g$.

Such a map is called a Poisson bracket, and the two relations are often called the Jacobi identity and the Leibniz identity. One then says that $(C^\infty(M), \{., .\})$ is a Poisson algebra, that is, a Lie algebra such that for every function $H$ on $M$, the map $\{., H\}$ is a derivation of $C^\infty(M)$. In particular, if $M$ is finite-dimensional, there exists a unique vector field $X_H$ associated with this derivation, and this is by definition the Hamiltonian vector field generated by $H$.

A symplectic manifold $(M, \Omega)$ is naturally a Poisson manifold if we set

\[ \{f, g\} = X_g f = \Omega(X_f, X_g), \]

and the Jacobi identity is then equivalent to the closedness of $\Omega$, or again to the local existence of "Poisson" coordinates $(q, p)$ in which the bracket takes the classical form

\[ \{f, g\} = \sum_{i=1}^n \left( \partial_{q_i} f\, \partial_{p_i} g - \partial_{p_i} f\, \partial_{q_i} g \right). \]

Poisson manifolds are therefore more general, but to better grasp the difference it is necessary to define them in slightly more sophisticated terms of differential geometry. Indeed, the definition of the Poisson bracket implies that it can be written

\[ \{f, g\} = \tilde{\Omega}(df, dg), \]

for a certain bivector field $\tilde{\Omega} \in \Lambda^2 TM$ satisfying $[\tilde{\Omega}, \tilde{\Omega}] = 0$ (where $[., .]$ is the Schouten bracket, which generalises the Lie bracket of vector fields). The case where $\tilde{\Omega}$ is non-degenerate at every point gives, by duality, a symplectic structure $\Omega$. In the general case, the set of points where the rank is constant (necessarily even) forms a submanifold which naturally inherits a symplectic structure, and the collection of these submanifolds forms a symplectic "foliation" (in which the dimension of the leaves varies).
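The coordinate expression of the Poisson bracket given above lends itself to a direct numerical check. The following Python sketch approximates the partial derivatives by central differences and evaluates the bracket at a point. The test functions `f` and `g`, the step `h` and the evaluation point are assumptions made only for illustration.

```python
def partial(func, z, k, h=1e-5):
    # Central finite difference of func with respect to the k-th coordinate of z.
    z_plus = list(z)
    z_minus = list(z)
    z_plus[k] += h
    z_minus[k] -= h
    return (func(z_plus) - func(z_minus)) / (2.0 * h)

def poisson_bracket(f, g, z, n):
    # z = (q_1, ..., q_n, p_1, ..., p_n); classical coordinate formula
    # {f, g} = sum_i (d_{q_i} f d_{p_i} g - d_{p_i} f d_{q_i} g).
    return sum(
        partial(f, z, i) * partial(g, z, n + i)
        - partial(f, z, n + i) * partial(g, z, i)
        for i in range(n)
    )

# One degree of freedom, hypothetical test functions.
def coord_q(z):
    return z[0]

def coord_p(z):
    return z[1]

def f(z):
    return z[0] ** 2 * z[1]          # f = q^2 p

def g(z):
    return z[0] + z[1] ** 3          # g = q + p^3

z = [0.5, 2.0]
canonical = poisson_bracket(coord_q, coord_p, z, 1)   # {q, p}, expected close to 1
value = poisson_bracket(f, g, z, 1)                   # by hand: 6 q p^3 - q^2
swapped = poisson_bracket(g, f, z, 1)                 # antisymmetry: {g, f} = -{f, g}
print(canonical, value, swapped)
```

With $f = q^2 p$ and $g = q + p^3$, the formula gives $\{f, g\} = 6qp^3 - q^2$, which the numerical value should reproduce up to the finite-difference error.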

1.1.3 Some general properties

1.1.3.1. We have already seen that the definition of a Hamiltonian vector field requires only a function $H$, the Hamiltonian (the underlying symplectic structure being fixed), which distinguishes these fields sharply from arbitrary vector fields. When thinking in terms of differential equations, it is often convenient to perform changes of variables to simplify the study, and for Hamiltonian systems this operation is very easy. Let us first make precise the notion of change of variables in this context, where of course the Hamiltonian form of the equations must be preserved.

Definition 1.4. A map $\Phi : (M, \Omega) \to (M, \Omega)$ is said to be symplectic if

\[ \Phi^*\Omega = \Omega. \]

Equivalently, $\Phi$ preserves the Poisson bracket in the sense that

\[ \{f \circ \Phi, g \circ \Phi\} = \{f, g\} \circ \Phi, \]

for all functions $f, g \in C^\infty(M)$. When $\Phi$ is a symplectic diffeomorphism and $X_H$ a Hamiltonian vector field, performing a change of variables by means of $\Phi$ amounts to studying the vector field $\Phi^*X_H$.

Proposition 1.5 (Change of variables). If $\Phi$ is a symplectic diffeomorphism, then $\Phi^*X_H = X_{H \circ \Phi}$.

In other words, to transform Hamiltonian vector fields, it suffices to transform the Hamiltonian functions. This is certainly one of the most important properties of the Hamiltonian formalism. We shall see moreover that it is very easy to construct symplectic diffeomorphisms (in many respects, one can say that the group of symplectic diffeomorphisms is a "big" group).

1.1.3.2. If we now view Hamiltonian systems from a dynamical standpoint, it is natural to look for preservation properties, that is, objects (sets, functions, measures, ...) that are invariant under the flow. We denote by $L_{X_H}$ the Lie derivative with respect to the vector field $X_H$.<br />

The first preservation property, which lies at the very origin of Hamiltonian systems, is the preservation of energy.<br />

Proposition 1.6 (Preservation of energy). We have $L_{X_H}H = 0$.<br />

In other words, the function $H$ is constant along the flow of $X_H$. Each orbit is therefore contained in an energy level $H = c$, which is a submanifold of dimension $2n - 1$ if $c$ is a regular value of the Hamiltonian. Infinitesimally, this means that the vector field is always tangent to the energy level. This property thus allows one to reduce the dimension of the phase space; in particular, for a Hamiltonian with one degree of freedom (the phase space is two-dimensional), the orbits are implicitly defined by the equation $H(q(t), p(t)) = c$.<br />
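As a numerical illustration of Proposition 1.6, the following sketch integrates a pendulum, $H(q, p) = p^2/2 - \cos q$, and records the largest deviation of $H$ along the computed orbit. The choice of Hamiltonian, the classical Runge-Kutta scheme, the step size and all function names are assumptions made for this example; none of them comes from the text.

```python
import math

def hamiltonian(q, p):
    """Pendulum Hamiltonian H(q, p) = p^2/2 - cos(q) (illustrative choice)."""
    return 0.5 * p * p - math.cos(q)

def vector_field(q, p):
    """Hamilton's equations: dq/dt = dH/dp, dp/dt = -dH/dq."""
    return p, -math.sin(q)

def rk4_step(q, p, dt):
    """One step of the classical fourth-order Runge-Kutta scheme."""
    k1q, k1p = vector_field(q, p)
    k2q, k2p = vector_field(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = vector_field(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = vector_field(q + dt * k3q, p + dt * k3p)
    q_new = q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
    p_new = p + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return q_new, p_new

def energy_drift(q0, p0, dt=1e-3, steps=10_000):
    """Largest value of |H - H(q0, p0)| observed along the numerical orbit."""
    e0 = hamiltonian(q0, p0)
    q, p = q0, p0
    worst = 0.0
    for _ in range(steps):
        q, p = rk4_step(q, p, dt)
        worst = max(worst, abs(hamiltonian(q, p) - e0))
    return worst
```

Calling `energy_drift(1.0, 0.0)` returns a very small number: the computed orbit stays on the level curve $H = c$ up to discretization error. The scheme used here is not symplectic, so the residual drift comes from the numerical method and not from the dynamics.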

The second preservation property, the most important from the dynamical point of view, is the preservation of the symplectic form $\Omega$.<br />

Proposition 1.7 (Preservation of the symplectic form). We have $L_{X_H}\Omega = 0$.<br />

In other words, $\Omega$ is invariant under the flow of $X_H$. The same therefore holds for the volume form $\Omega^{n} = \Omega \wedge \cdots \wedge \Omega$, and for the volume it induces on an energy level. If the energy level has finite volume (for instance in the compact case), one can apply the Poincaré recurrence theorem and deduce that almost every orbit is recurrent. The situation thus differs radically from that of gradient vector fields associated with a Riemannian metric, where it is easy to see that the nontrivial dynamics is concentrated on the fixed points (equilibrium solutions). Hamiltonian systems are therefore particular conservative systems, and recurrence of orbits is a typical phenomenon.<br />

1.1.3.3. To conclude, note that little can be said about Hamiltonian systems in full generality. We shall therefore adopt the following strategy. We begin by studying a particular class of Hamiltonian systems, the integrable systems, which are perfectly understood, and we then turn to perturbations of integrable Hamiltonian systems, which are called near-integrable systems.<br />

This approach can be justified as follows. On the one hand, from a "mathematical" point of view, since the general problem is difficult, it is natural to start by exhibiting a class of simple systems and then to study what happens "around" it. On the other hand, although integrable systems are "exceptional", a large number of "physical" problems reduce to a perturbation of an integrable system; this is the case of the solar system in particular.<br />

1.2 Integrable Hamiltonian systems<br />

Literally, integrable Hamiltonian systems are those that one knows how to "integrate", that is, they possess "sufficiently many" first integrals for their solutions to be determined "explicitly".<br />

1.2.1 Structure of integrable systems<br />

1.2.1.1. Let $(M, \Omega)$ be a symplectic manifold of dimension $2n$ and $H : M \to \mathbb{R}$ a smooth function. We begin by recalling the following definition.<br />

Definition 1.8. A smooth function $f : M \to \mathbb{R}$ is said to be a first integral of the Hamiltonian vector field $X_H$ if $f$ is constant along the orbits of $X_H$.<br />

In other words, the derivative $L_{X_H}f$ vanishes, or equivalently $f$ Poisson-commutes with $H$, that is, $\{f, H\} = 0$. Conservation of energy means here that Hamiltonian systems always possess at least one first integral, namely the Hamiltonian $H$ itself.<br />

1.2.1.2. We can now define integrable systems (in the sense of Liouville) and make precise what is meant by possessing "sufficiently many" first integrals.<br />

Definition 1.9. The Hamiltonian $H$ is said to be integrable in the sense of Liouville if it possesses $n$ first integrals $F = (f_1, \dots, f_n)$ that are independent and in involution, that is:<br />

- $\{f_i, H\} = 0$, for $i \in \{1, \dots, n\}$;<br />

- $\{f_i, f_j\} = 0$, for $i, j \in \{1, \dots, n\}$;<br />

- $F : M \to \mathbb{R}^n$ is a submersion (i.e. it has rank $n$ everywhere).<br />

Let us comment on this definition. First, note that it requires only $n$ first integrals (one could have chosen $f_i = H$ for some $i$ in $\{1, \dots, n\}$): this is a distinctive feature of Hamiltonian systems, since an arbitrary vector field on a manifold of dimension $2n$ would a priori require $2n - 1$ first integrals in order to be integrable.<br />

Next, we have required the map $F$ to be a submersion on all of $M$, which is a very strong hypothesis that is rarely satisfied. Usually one only requires $F$ to be a submersion on a "large" open subset $O$ of $M$, where "large" means for instance dense, or with complement of positive codimension. We have therefore chosen a simplified definition here; the singular locus $M \setminus O$ can be the source of serious complications (including dynamical ones), which we shall ignore.<br />

Finally, it follows immediately from the hypotheses that $F$ defines a foliation of $M$ whose leaves (the connected components of the fibers $F^{-1}(c)$, $c \in \mathbb{R}^n$) are submanifolds invariant under the flow. They are moreover Lagrangian, that is, the restriction of the symplectic form to these submanifolds vanishes and they have maximal dimension for this property (namely $n$, half the dimension of the phase space). Conversely, if a map $F = (f_1, \dots, f_n) : M \to \mathbb{R}^n$ defines a foliation by invariant Lagrangian submanifolds, then the components of $F$ are independent and in involution; in other words, the system is integrable in the sense of Liouville.<br />

1.2.1.3. The main theorem on the structure of integrable Hamiltonian systems is the following.<br />

Theorem 1.10. Let $(H, F)$ be a system that is integrable in the sense of Liouville, and let $c \in \mathbb{R}^n$ be such that $N = F^{-1}(c)$ is compact and connected. Then:<br />

(i) $N$ is an embedded torus;<br />

(ii) there exist an open neighborhood $U$ of $N$ in $M$, an open neighborhood $B$ of $0$ in $\mathbb{R}^n$ and a diffeomorphism $\Psi : \mathbb{T}^n \times B \to U$ such that<br />

- $\Psi(\mathbb{T}^n \times \{0\}) = N$;<br />

- $\Psi^{*}\Omega_{|U} = \Omega_0$, where $\Omega_0 = d\theta \wedge dI$, $(\theta, I) \in \mathbb{T}^n \times B$;<br />

- $H \circ \Psi$ depends only on the variables $I \in B$; in other words, there exists $h : B \to \mathbb{R}$ such that $H \circ \Psi(\theta, I) = h(I)$ for all $(\theta, I) \in \mathbb{T}^n \times B$.<br />

The coordinates $(\theta, I) \in \mathbb{T}^n \times B$ are called "action-angle" coordinates (the angles are $\theta$ and the actions are $I$, so one ought to say "angle-action"). This result is usually attributed to Liouville, Arnold, Jost and sometimes Mineur, or to only some of these four authors (the choice varies a great deal with the country one is in). A proof can be found in [AA89], [HZ94] or [Dui80].<br />

Let us add a few remarks on the hypotheses of the preceding theorem.<br />

If the leaf $N = F^{-1}(c)$ is not assumed to be compact, it is not hard to see that it is then diffeomorphic to a product $\mathbb{T}^{n-k} \times \mathbb{R}^{k}$, $0 \leq k \leq n$.<br />

Moreover, there are generalizations of the theorem to properly degenerate systems (also called super-integrable, or non-commutatively integrable), which possess $m > n$ first integrals (which are therefore no longer all in involution); see for instance [Nek72] and [MF78].<br />

Finally, in the particular case of Tonelli Hamiltonians (defined on cotangent bundles $M = T^{*}X$, where $X$ is a compact manifold, and satisfying the usual convexity hypotheses of Mather theory), a recent result of Sorrentino ([Sor09]) shows that the involution hypothesis on the first integrals can be dropped entirely while retaining interesting information: more precisely, he shows that for every cohomology class $c \in H^{1}(X, \mathbb{R})$, the system possesses a unique invariant Lagrangian graph $\Lambda_c$ of class $c$ on which the flow is recurrent, and if $X = \mathbb{T}^n$, then each $\Lambda_c$ is a torus on which the flow is linear.<br />

1.2.1.4. From now on, we shall consider (almost exclusively) Hamiltonian systems defined on the product $\mathbb{T}^n \times B$, where $B$ is a ball of $\mathbb{R}^n$ centered at $0$ ($B$ may be all of $\mathbb{R}^n$), equipped with action-angle coordinates $(\theta, I)$ and the canonical symplectic form $\Omega_0 = d\theta \wedge dI$. In this context, we allow ourselves the following abuse of terminology.<br />

Definition 1.11. A Hamiltonian system $H$ is said to be integrable if it depends only on the action variables $I \in B$, that is, if there exists $h : B \to \mathbb{R}$ such that<br />

\[ H(\theta, I) = h(I), \]<br />

for all $(\theta, I) \in \mathbb{T}^n \times B$.<br />

Thus, by the preceding structure theorem, such an integrable system is the local model of systems that are integrable in the sense of Liouville on an arbitrary symplectic manifold. Of course, a better terminology would be "trivially integrable" or "integrable in action-angle coordinates".<br />

The solutions of the equations associated with an integrable system are global and explicit: the equations read<br />

\[ \begin{cases} \dot{\theta} = \nabla h(I) \\ \dot{I} = 0 \end{cases} \]<br />

and, for the initial condition $(\theta_0, I_0)$, they integrate to<br />

\[ \begin{cases} \theta(t) = \theta_0 + t\nabla h(I_0) \quad [\mathbb{Z}^n] \\ I(t) = I_0. \end{cases} \]<br />

Thus the action variables are constant along the motion, which translates geometrically into the fact that the phase space $\mathbb{T}^n \times B$ decomposes trivially into invariant Lagrangian tori $T_0 = \mathbb{T}^n \times \{I_0\}$, parametrized by the action variables $I_0 \in B$, on which the vector field is constant and equal to $\omega_0 = \nabla h(I_0)$.<br />
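The explicit solution above can be written down directly in code. The sketch below takes the illustrative integrable Hamiltonian $h(I) = |I|^2/2$, so that $\nabla h(I) = I$, and represents the torus as $\mathbb{R}^n/\mathbb{Z}^n$ by reducing angles modulo 1. Both the choice of $h$ and the function names are assumptions made for this example.

```python
def grad_h(I):
    """Frequency map for the illustrative choice h(I) = |I|^2 / 2."""
    return list(I)

def integrable_flow(theta0, I0, t):
    """Time-t map of the integrable system: the angles rotate linearly
    with frequency grad_h(I0), the actions do not move."""
    omega = grad_h(I0)
    theta = [(th + t * w) % 1.0 for th, w in zip(theta0, omega)]
    return theta, list(I0)
```

For instance, `integrable_flow([0.1, 0.2], [0.5, 0.25], 2.0)` leaves the actions `[0.5, 0.25]` unchanged and moves the angles to approximately `[0.1, 0.7]`: the first angle has made one full turn, the second half a turn.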

1.2.1.5. Before explaining the relatively simple dynamics of integrable systems, let us give the example of the two-body problem. Consider two particles at positions $q_0$ and $q_1$, with masses $m_0$ and $m_1$, whose equations of motion are<br />

\[ \ddot{q}_0 = Gm_1\frac{q_1 - q_0}{\|q_1 - q_0\|^{3}}, \qquad \ddot{q}_1 = Gm_0\frac{q_0 - q_1}{\|q_0 - q_1\|^{3}}. \]<br />

One reduces to a Kepler problem, that is, to the case where one of the two bodies is fixed, as follows. It is not hard to see that, writing $r = q_1 - q_0$ for the relative position of the two bodies and $s = m_0q_0 + m_1q_1$ for the center of mass (up to the normalizing factor $1/(m_0 + m_1)$), the preceding equations reduce to<br />

\[ \ddot{s} = 0, \qquad \ddot{r} = -\mu\frac{r}{\|r\|^{3}}, \tag{5} \]<br />

where $\mu = G(m_0 + m_1)$. The first equation of (5) says that the center of mass has zero acceleration, which corresponds to the principle of inertia. Placing the center of mass at the origin, we obtain<br />

\[ q_0 = -\frac{m_1}{m_0 + m_1}\,r, \qquad q_1 = \frac{m_0}{m_0 + m_1}\,r. \]<br />

It therefore suffices to study the evolution of $r$, that is, to solve the second equation of (5), which describes the motion of a particle at position $r$ attracted by a fixed body. The Hamiltonian reads<br />

\[ H = \frac{1}{2}\|\dot{r}\|^{2} - \frac{\mu}{\|r\|}. \]<br />

The latter is integrable (in the sense of Liouville): besides the Hamiltonian, the angular momentum $r \wedge \dot{r}$ is preserved, as is another integral (called the Laplace integral). One can moreover check that these first integrals are independent. If the configuration space has dimension 3, one has 5 independent first integrals (whereas three would suffice). One can show that the orbits are conics, and that at negative energies they are ellipses, which are completely determined by their semi-major axis, their eccentricity and the three Euler angles that locate the semi-major axis in space.<br />
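The conservation laws just listed can be checked numerically on the planar Kepler problem. The sketch below integrates $\ddot{r} = -\mu r/\|r\|^3$ and monitors the energy, the angular momentum and the Laplace vector. The text does not give a formula for the Laplace integral; the code uses the standard expression $\dot{r} \times (r \times \dot{r}) - \mu r/\|r\|$, which is an assumption of this example, as are the value $\mu = 1$, the Runge-Kutta scheme and all names.

```python
import math

MU = 1.0  # illustrative value of mu = G (m0 + m1)

def deriv(state):
    """Right-hand side of the planar Kepler problem, state = (x, y, vx, vy)."""
    x, y, vx, vy = state
    r3 = (x * x + y * y) ** 1.5
    return (vx, vy, -MU * x / r3, -MU * y / r3)

def rk4_step(state, dt):
    """One step of the classical fourth-order Runge-Kutta scheme."""
    def shift(s, d, h):
        return tuple(si + h * di for si, di in zip(s, d))
    k1 = deriv(state)
    k2 = deriv(shift(state, k1, dt / 2))
    k3 = deriv(shift(state, k2, dt / 2))
    k4 = deriv(shift(state, k3, dt))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def invariants(state):
    """Energy, angular momentum and the two components of the Laplace vector."""
    x, y, vx, vy = state
    r = math.hypot(x, y)
    energy = 0.5 * (vx * vx + vy * vy) - MU / r
    ang = x * vy - y * vx  # planar angular momentum
    laplace_x = vy * ang - MU * x / r
    laplace_y = -vx * ang - MU * y / r
    return (energy, ang, laplace_x, laplace_y)

def max_drift(state, dt=1e-3, steps=5000):
    """Largest change of any invariant along the numerical orbit."""
    ref = invariants(state)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt)
        cur = invariants(state)
        worst = max(worst, max(abs(c - r0) for c, r0 in zip(cur, ref)))
    return worst
```

With the initial condition `(1.0, 0.0, 0.0, 0.8)` the energy is negative, so the orbit is an ellipse, and `max_drift` stays at the level of the discretization error over more than one revolution.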

We shall not explain the construction of action-angle variables in this context, which is due to Delaunay (see [Che89] or [CC07]). We merely mention that the Hamiltonian written in action-angle variables is properly degenerate: it is of the form $h(I) = h(I_1)$, for $I = (I_1, I_2, I_3)$, and the variable $I_1$ depends only on the value of the semi-major axis.<br />


1.2.2 Dynamics of integrable systems<br />

1.2.2.1. To understand the dynamics of an integrable system, one is led in particular to study the dynamics of a constant vector field equal to $\omega \in \mathbb{R}^n$ on the torus, which of course generates a linear flow (or a translation flow with vector $\omega$). Such flows (or tori) are sometimes called Kronecker flows (or tori) with frequency $\omega$, and the solutions are said to be quasi-periodic. Their dynamics is completely understood: it depends on an algebraic property of the frequency vector $\omega$. More precisely, to such a vector one associates a resonance module<br />

\[ M(\omega) = \{k \in \mathbb{Z}^n \mid k.\omega = 0\} = \omega^{\perp} \cap \mathbb{Z}^n, \]<br />

where $.$ denotes the Euclidean inner product of $\mathbb{R}^n$, and $\perp$ the orthogonal complement with respect to this product. It is a submodule of $\mathbb{Z}^n$, and its rank dictates the dynamics.<br />

If the rank is zero, in other words if<br />

\[ k \in \mathbb{Z}^n, \quad k.\omega = 0 \implies k = 0, \]<br />

the frequency is said to be non-resonant (and by extension the torus is said to be non-resonant). In this case, it is not hard to show that the flow is minimal (all orbits are dense) and uniquely ergodic (there is a unique probability measure invariant under the flow, which here is nothing but the Haar measure), and it is the prototype of a dynamical system with discrete spectrum.<br />

If the rank of $M(\omega)$ equals $m \geq 1$, the frequency is said to be resonant of multiplicity $m$, and in this case the torus decomposes into a continuous $m$-parameter family of invariant subtori, each of dimension $n - m$, on which the flow is again minimal and uniquely ergodic. For example, if $m = n$, one trivially obtains an invariant torus consisting of fixed points, and if $m = n - 1$, the torus is foliated by periodic orbits with the same period.<br />

1.2.2.2. To conclude, let us mention the particular case of standard resonances of multiplicity $m$, which by definition are of the form $(\hat{\omega}, 0) \in \mathbb{R}^{n-m} \times \mathbb{R}^{m}$, with $\hat{\omega} \in \mathbb{R}^{n-m}$ non-resonant. The resonance module of $(\hat{\omega}, 0)$ is then generated by the last $m$ vectors of the canonical basis of $\mathbb{Z}^n$, in other words<br />

\[ M(\hat{\omega}, 0) = \{k \in \mathbb{Z}^n \mid k_1 = \cdots = k_{n-m} = 0\}. \]<br />

The term "standard" is justified by the following proposition.<br />

Proposition 1.12. Let $\omega$ be a resonant frequency of multiplicity $m$. Then there exists a matrix $A \in GL_n(\mathbb{Z})$ such that<br />

\[ \omega = A(\hat{\omega}, 0), \]<br />

where $\hat{\omega} \in \mathbb{R}^{n-m}$ is non-resonant.<br />

This result is very useful when studying a fixed resonance: indeed, $A$ preserves $\mathbb{Z}^n$, so it induces a linear automorphism of the torus $\mathbb{T}^n$ which lifts to a linear symplectic diffeomorphism $\Phi_A(\theta, I) = (A\theta, {}^{t}A^{-1}I)$, and one thus reduces to a standard resonance, at the cost of considering the Hamiltonian $H \circ \Phi_A$.<br />
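A small worked instance may help here. For $n = 2$, the frequency $\omega = (1, 2)$ is resonant (take $k = (2, -1)$), and $\omega = A(\hat{\omega}, 0)$ with $\hat{\omega} = 1$ and $A = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}$. The sketch below builds the matrix of $\Phi_A$ in the coordinates $(\theta, I)$ and checks that it satisfies ${}^{t}M J M = J$, where $J$ is the matrix of $d\theta \wedge dI$. The example matrix and the helper names are assumptions made for illustration, and the code handles $2 \times 2$ matrices only.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def inverse_2x2_unimodular(A):
    """Inverse of a 2x2 integer matrix with determinant +1 or -1."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det not in (1, -1):
        raise ValueError("A must have determinant +1 or -1")
    # 1/det equals det when det is +1 or -1, so the inverse stays integral.
    return [[d * det, -b * det], [-c * det, a * det]]

def lifted_map(A):
    """Matrix of Phi_A(theta, I) = (A theta, (A^T)^{-1} I) in coordinates (theta, I)."""
    A_inv_t = transpose(inverse_2x2_unimodular(A))
    top = [list(row) + [0, 0] for row in A]
    bottom = [[0, 0] + list(row) for row in A_inv_t]
    return top + bottom

# Matrix of the canonical form d(theta) ^ dI in coordinates (theta_1, theta_2, I_1, I_2).
J = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]]

def is_symplectic(M):
    return matmul(matmul(transpose(M), J), M) == J

A = [[1, 0], [2, 1]]
omega_hat = 1
omega = [row[0] * omega_hat + row[1] * 0 for row in A]  # A applied to (omega_hat, 0)
```

Here `omega` evaluates to `[1, 2]` and `is_symplectic(lifted_map(A))` holds, in agreement with the claim that $\Phi_A$ is symplectic.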


The proof of the preceding proposition is a simple consequence of the structure theorem for finitely generated modules over principal ideal domains, more precisely of the following theorem (known as the "adapted basis" theorem, see [Lan02]): one can find a basis $(f_1, \dots, f_n)$ of $\mathbb{Z}^n$ and elements $d_1, \dots, d_m$ of $\mathbb{Z}$, with the divisibility relations $d_1 \mid \cdots \mid d_m$, such that $(d_1f_1, \dots, d_mf_m)$ is a basis of $M(\omega)$. It then suffices to take $A$ to be the transpose of the change-of-basis matrix from the canonical basis $(e_1, \dots, e_n)$ to the basis $(f_1, \dots, f_n)$. One can also argue in matrix terms: choosing a basis of integer vectors $(k_1, \dots, k_m)$ of $M(\omega)$, one associates with it the matrix<br />

\[ K = \begin{pmatrix} k_1^{1} & \cdots & k_1^{n} \\ \vdots & & \vdots \\ k_m^{1} & \cdots & k_m^{n} \end{pmatrix} \in M_{m,n}(\mathbb{Z}), \]<br />

where $k_i = (k_i^{1}, \dots, k_i^{n})$, $1 \leq i \leq m$. By the adapted basis theorem, this matrix is then equivalent to the diagonal matrix<br />

\[ \Delta = \begin{pmatrix} d_1 & & 0 & 0 & \cdots & 0 \\ & \ddots & & \vdots & & \vdots \\ 0 & & d_m & 0 & \cdots & 0 \end{pmatrix} \in M_{m,n}(\mathbb{Z}), \]<br />

more precisely, $K = B\Delta A$ for some $B \in GL(m, \mathbb{Z})$, and the matrix $A \in GL(n, \mathbb{Z})$ is suitable.<br />

1.3 Classical perturbation theory<br />

1.3.0.1. From now on we are interested in near-integrable Hamiltonian systems, that is, in perturbations of integrable Hamiltonian systems. They are defined by Hamiltonians of the form<br />

\[ \begin{cases} H(\theta, I) = h(I) + f(\theta, I) \\ |f|_{D} < \varepsilon \end{cases} \tag{$*$} \]<br />

[...]<br />

From another point of view, the action variables $I(t)$ of integrable systems are constant for all time.<br />

Problem 3. Study the evolution of the action variables $I(t)$ of the solutions of the system $(*)$.<br />

1.3.0.3. This last problem is motivated by questions of celestial mechanics, in particular by the planetary problem. Here one considers a problem with $1 + N$ bodies, typically the Sun and planets of the solar system, with positions $q_0, q_1, \dots, q_N$ and masses $m_0, m_1, \dots, m_N$. One assumes that the mass of the first body is much larger than the masses of the other bodies, and one sets<br />

\[ \varepsilon = \max_{1 \leq i \leq N} \frac{m_i}{m_0}. \]<br />

In the case of the solar system, $\varepsilon$ is of order $10^{-3}$. To first approximation, one can therefore neglect the interaction of the planets with each other and consider only the attraction of the massive body. This amounts to taking $\varepsilon = 0$, and the system then decouples into a product of $N$ Kepler problems, which one knows how to integrate. In particular, the planetary orbits are ellipses, and the Hamiltonian written in action-angle coordinates depends only on the semi-major axes. When the interaction between the planets is restored, the planetary system becomes a perturbation, of size about $10^{-3}$, of an integrable system, and studying the stability of the action variables amounts to studying the possible "deformation" of the elliptic trajectories.<br />

In the next section we shall present the two fundamental theorems that give partial answers to the preceding questions, the invariant tori theorem (KAM theorem) and Nekhoroshev's theorem, but before that we shall explain the common principle on which they rest, namely the construction of normal forms. Excellent references on this subject are [LM88], [Ben05], [Ber06] and of course [AKN06].<br />

1.3.1 The averaging principle<br />

The averaging principle is a physical principle that goes back to the work of Gauss, Lagrange and Laplace in celestial mechanics.<br />

From a modern point of view, it can be stated as follows: one introduces the space average of the perturbation<br />

\[ \langle f \rangle(I) = \int_{\mathbb{T}^n} f(\theta, I)\,d\theta, \]<br />

and one defines the averaged system $\langle H \rangle = h + \langle f \rangle$, which is obviously integrable. The averaging principle then consists in assuming that the solutions of the averaged system are "good approximations" of those of the full system $H = h + f$.<br />

The (very naive) idea underlying the principle is the following: writing the Fourier series expansion<br />

\[ f(\theta, I) = \langle f \rangle(I) + \sum_{k \in \mathbb{Z}^n \setminus \{0\}} \hat{f}_k(I)\,e^{2\pi i k.\theta}, \]<br />

the terms $\hat{f}_k(I)e^{2\pi i k.\theta}$, for $k \in \mathbb{Z}^n \setminus \{0\}$, represent oscillations that superpose and can therefore be neglected.<br />

Being a little less naive, one sees that this kind of reasoning is valid only if the dynamics on each invariant torus of the integrable system is sufficiently "equidistributed", which is of course not always the case. In the next section, we shall explain the problems that arise when one tries to justify this principle rigorously.<br />
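The role of equidistribution can be seen on a single Fourier mode. Along the linear flow $\theta(t) = \theta_0 + t\omega$, the mode $e^{2\pi i k.\theta}$ oscillates at frequency $k.\omega$, so its time average tends to zero when $k.\omega \neq 0$ and equals one in modulus when $k.\omega = 0$. The sketch below approximates this time average by a Riemann sum, taking $\theta_0 = 0$; the sample frequencies, the averaging window and the function name are assumptions made for this example.

```python
import cmath
import math

def time_average(k, omega, T=1000.0, steps=100_000):
    """Riemann-sum approximation of (1/T) * integral over [0, T] of
    exp(2 pi i k.omega t) dt, the mode e^{2 pi i k.theta} along theta(t) = t omega."""
    k_dot_omega = sum(ki * wi for ki, wi in zip(k, omega))
    dt = T / steps
    total = 0j
    for j in range(steps):
        total += cmath.exp(2j * math.pi * k_dot_omega * j * dt)
    return total / steps

# Non-resonant frequency: the oscillation averages out.
nonresonant = time_average((1, 1), (1.0, math.sqrt(2)))

# Resonant frequency, k.omega = 2*1 - 1*2 = 0: the mode is constant along the flow.
resonant = time_average((2, -1), (1.0, 2.0))
```

Here `abs(nonresonant)` is small while `resonant` equals 1: in the resonant case the corresponding Fourier term cannot be neglected, which is the difficulty the next section addresses.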

1.3.2 Normal form theory<br />

1.3.2.1. In general, but somewhat vague, terms, normal form theory in dynamical systems consists in conjugating (locally or globally) the system to a "simpler" system; in other words, one seeks to construct coordinates that simplify the problem.<br />

For a near-integrable Hamiltonian system, the goal is of course to conjugate it to a system that is as integrable as possible, and by the previous section the ideal candidate would be the averaged system $\langle H \rangle$, which is integrable. One must also require the conjugacy to be symplectic, in order to preserve the structure of the equations, and $\varepsilon$-close to the identity, so that the solutions of the transformed system remain $\varepsilon$-close to those of the original system.<br />

1.3.2.2. One must therefore begin by explaining how such transformations can be constructed. Two procedures are commonly used. The first is the method of generating functions, which we shall not explain since we shall not use it. We shall instead use a second procedure, called the Lie method, which has the advantage of leading to more pleasant computations (it also has the merit of generalizing immediately to infinite dimension). In its simplest form, the Lie method consists in taking as the transformation $\Phi$ the time-1 map of the flow of a Hamiltonian vector field associated with an auxiliary function $\chi$, that is, $\Phi = \Phi^{\chi}_{1}$. Such a diffeomorphism is obviously symplectic (and even exact-symplectic if the manifold is exact), and it is $\varepsilon$-close to the identity if the function $\chi$ has size $\varepsilon$ (equivalently, one can take the time-$\varepsilon$ map of the flow if the function has size 1).<br />

1.3.2.3. Let us now return to our normal form problem. In practice, to try to solve the equation $H \circ \Phi = \langle H \rangle$, one uses an iterative scheme that eliminates the perturbation "order by order" in $\varepsilon$. At step 1, one seeks $\Phi^{\chi_1}$ such that<br />

\[ H_1 = H \circ \Phi^{\chi_1} = \langle H \rangle + O(\varepsilon^{2}), \]<br />

that is, one seeks to eliminate the angles at order $\varepsilon$. Then, at step $n$, for $n \geq 2$, one has $H_{n-1} = \langle H \rangle + O(\varepsilon^{n})$ and one seeks $\Phi^{\chi_n}$ such that<br />

\[ H_n = H_{n-1} \circ \Phi^{\chi_n} = \langle H \rangle + O(\varepsilon^{n+1}), \]<br />

that is, one seeks to eliminate the angles at order $\varepsilon^{n}$. If the infinite product<br />

\[ \Phi = \Phi^{\chi_1} \circ \Phi^{\chi_2} \circ \cdots \circ \Phi^{\chi_n} \circ \cdots \]<br />

converges, then one indeed obtains $H \circ \Phi = \langle H \rangle$.<br />


1.3.2.4. Let us now detail one step of this scheme. We are thus given a function $\chi : \mathbb{T}^n \times B \to \mathbb{R}$ of size $\varepsilon$, and we compute $H \circ \Phi^{\chi}_{1}$ in order to see how to choose $\chi$ so as to get closer to the averaged system. To do so, using the general formula<br />

\[ \frac{d}{dt}\left(K \circ \Phi^{\chi}_{t}\right) = \{K, \chi\} \circ \Phi^{\chi}_{t} \]<br />

and Taylor's theorem with integral remainder, we obtain<br />

\[ \begin{aligned} H \circ \Phi^{\chi}_{1} &= h \circ \Phi^{\chi}_{1} + f \circ \Phi^{\chi}_{1} \\ &= h + \{h, \chi\} + \int_{0}^{1}(1 - t)\{\{h, \chi\}, \chi\} \circ \Phi^{\chi}_{t}\,dt + f + \int_{0}^{1}\{f, \chi\} \circ \Phi^{\chi}_{t}\,dt \\ &= h + f + \{h, \chi\} + \int_{0}^{1}\{(1 - t)\{h, \chi\} + f, \chi\} \circ \Phi^{\chi}_{t}\,dt \\ &= \langle H \rangle + f - \langle f \rangle + \{h, \chi\} + \int_{0}^{1}\{(1 - t)\{h, \chi\} + f, \chi\} \circ \Phi^{\chi}_{t}\,dt \end{aligned} \]<br />

where the term of order $\varepsilon$ to be eliminated is $f - \langle f \rangle + \{h, \chi\}$, and the integral remainder is of order $\varepsilon^{2}$.<br />

One should therefore choose $\chi$ such that<br />

\[ \{h, \chi\} + f - \langle f \rangle = 0. \]<br />

This is the central equation of the perturbation theory of Hamiltonian systems; it is often called the homological equation or the linearized equation (since it is obtained by linearizing the equation conjugating $H$ to $\langle H \rangle$). Note that the method of generating functions would have led to the same equation (up to sign), so it does not give a better result. The homological equation can also be written<br />

\[ \nabla h(I).\partial_{\theta}\chi(\theta, I) = f(\theta, I) - \langle f \rangle(I), \]<br />

and at a fixed action $I$, hence at a fixed frequency $\omega = \nabla h(I)$, one simply obtains a first-order linear partial differential equation with constant coefficients on the torus $\mathbb{T}^n$:<br />

\[ \omega.\partial_{\theta}\chi = g, \]<br />

where we have set $g = f - \langle f \rangle$. This type of equation lends itself well to Fourier analysis. In our case, writing the series expansions<br />

\[ g(\theta) = \sum_{k \in \mathbb{Z}^n} \hat{g}_k\,e^{2i\pi k.\theta}, \qquad \chi(\theta) = \sum_{k \in \mathbb{Z}^n} \hat{\chi}_k\,e^{2i\pi k.\theta}, \]<br />

we have $\hat{g}_0 = 0$ since $\langle f \rangle = \hat{f}_0$ (recall that the variable $I$ has been fixed to simplify the exposition). One then sees very easily that the solution $\chi$ is formally given by<br />

\[ \begin{cases} \hat{\chi}_0 = 0 \\ \hat{\chi}_k = \dfrac{\hat{f}_k}{2i\pi k.\omega}, \quad k \in \mathbb{Z}^n \setminus \{0\}. \end{cases} \]<br />
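For a perturbation with finitely many Fourier modes, the formal solution above is an honest solution and can be computed coefficient by coefficient. The sketch below stores a trigonometric polynomial as a dictionary mapping each multi-integer $k$ to its coefficient, divides by $2i\pi k.\omega$, and then checks the identity $\omega.\partial_{\theta}\chi = g$ at a sample point. The data structure, the sample function $g$ and all names are assumptions made for this example.

```python
import cmath
import math

def dot(k, v):
    return sum(ki * vi for ki, vi in zip(k, v))

def solve_homological(g_hat, omega):
    """Fourier coefficients of chi solving omega . d_theta chi = g,
    for a zero-average trigonometric polynomial g given by g_hat."""
    chi_hat = {}
    for k, coeff in g_hat.items():
        if not any(k):  # the mode k = 0
            if coeff != 0:
                raise ValueError("g must have zero average")
            continue
        divisor = dot(k, omega)
        if divisor == 0:
            raise ZeroDivisionError(f"resonance: k.omega = 0 for k = {k}")
        chi_hat[k] = coeff / (2j * math.pi * divisor)
    return chi_hat

def evaluate(hat, theta):
    """Value at theta of the trigonometric polynomial with coefficients hat."""
    return sum(c * cmath.exp(2j * math.pi * dot(k, theta)) for k, c in hat.items())

def omega_dot_grad(chi_hat, omega, theta):
    """Value at theta of omega . d_theta chi, differentiated term by term."""
    return sum(c * 2j * math.pi * dot(k, omega) * cmath.exp(2j * math.pi * dot(k, theta))
               for k, c in chi_hat.items())

# g(theta) = cos(2 pi theta_1) + 0.5 cos(2 pi (theta_1 - 2 theta_2))
g_hat = {(1, 0): 0.5, (-1, 0): 0.5, (1, -2): 0.25, (-1, 2): 0.25}
omega = (1.0, math.sqrt(2))
chi_hat = solve_homological(g_hat, omega)
```

With this non-resonant $\omega$ the call succeeds, and `omega_dot_grad(chi_hat, omega, theta)` agrees with `evaluate(g_hat, theta)` at any point `theta`. With $\omega = (2, 1)$ the mode $k = (1, -2)$ satisfies $k.\omega = 0$ and the function raises an error instead of dividing by zero.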

1.3.2.5. Let $\Omega = \nabla h(B)$ denote the space of frequencies. For $\omega \in \Omega$, because of the inner product $k.\omega$ in the denominator, fundamental problems, brought to light by Poincaré, arise immediately.<br />

The problem of resonances<br />

If $\omega$ is resonant, the product $k.\omega$ vanishes for some non-zero multi-integer $k$, and so in general the homological equation simply has no formal solution. This was to be expected: in this case the linear flow with frequency $\omega$ is not ergodic, and approximating the perturbation by its space average simply makes no sense.<br />

Moreover, under general conditions on $h$, the set of resonant frequencies is dense in $\Omega$, so for a generic perturbation the homological equation has no solution on any open subset of $\mathbb{T}^n \times B$: one can say that it is because of resonances that a perturbation of an integrable system is "generally" not integrable (see [Koz96] for more details on non-integrability).<br />

The problem of small divisors<br />

Suppose now that the frequency is non-resonant. A formal solution then exists, but nothing guarantees that it converges. Indeed, even if $k.\omega$ is non-zero for every $k \in \mathbb{Z}^n \setminus \{0\}$, the inner product can (and will) become arbitrarily small for multi-integers of arbitrarily large length, which may cause the series to diverge. This is the famous phenomenon of small divisors.<br />
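The decay of the divisors is easy to observe numerically. The sketch below computes, for the non-resonant frequency $\omega = (1, \sqrt{2})$, the smallest value of $|k.\omega|$ over all non-zero integer vectors with $|k|_1 \leq K$, by brute-force enumeration. The sample frequency, the cutoffs and the function name are assumptions made for this example, and floating-point arithmetic limits how far $K$ can meaningfully be pushed.

```python
import math
from itertools import product

def smallest_divisor(omega, K):
    """Minimum of |k.omega| over integer vectors k with 0 < |k|_1 <= K."""
    best = math.inf
    for k in product(range(-K, K + 1), repeat=len(omega)):
        norm1 = sum(abs(ki) for ki in k)
        if 0 < norm1 <= K:
            best = min(best, abs(sum(ki * wi for ki, wi in zip(k, omega))))
    return best

omega = (1.0, math.sqrt(2))
d5 = smallest_divisor(omega, 5)    # attained at k = (3, -2), up to sign
d50 = smallest_divisor(omega, 50)
```

The value `d5` equals $3 - 2\sqrt{2} \approx 0.17$, while `d50` is already several times smaller: the divisors never vanish, but they shrink as longer multi-integers are allowed.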

Finally, one can also mention a third problem, which is not related to resonances or small divisors but which is unavoidable.<br />

The problem of large multipliers<br />

This is simply the problem of the convergence of the iterative scheme. Assuming that one knows how to deal with the problems of resonances and small divisors, one can then find a change of variables $\Phi^{\chi_1}$ that eliminates the perturbation at order $\varepsilon$, then $\Phi^{\chi_2}$ that eliminates the perturbation at order $\varepsilon^{2}$, and so on, but it remains to prove the convergence of the infinite product<br />

\[ \Phi = \Phi^{\chi_1} \circ \Phi^{\chi_2} \circ \cdots \circ \Phi^{\chi_n} \circ \cdots \]<br />

and this is a delicate question. The term "large multipliers" is due to Poincaré. In fact, the preceding problem can be reduced to the convergence of a formal series whose general term $a_n$ turns out to be of order $A^{n}(n!)^{\alpha}$, with $A, \alpha > 0$, and which generally diverges. According to Poincaré, these are the series that converge in the sense of astronomers but diverge in the sense of geometers; in modern terms they are asymptotic series (see [Sau95]).<br />

In the next section, we shall therefore explain how to deal with this kind of problem.<br />


1.4 Stability theorems<br />

We shall now present, rather briefly, the two fundamental stability results, the KAM theorem and Nekhoroshev's theorem.<br />

The first result asserts that, under certain hypotheses, there exist many quasi-periodic solutions in near-integrable systems. It can be seen as a probabilistic stability result: if one picks a solution at random, then with large probability, which moreover tends to 1 as $\varepsilon$ tends to zero, it is quasi-periodic, and one easily deduces that the action variables vary by an amount of order $\varepsilon$ for all time.<br />

The second result concerns all solutions, but of course provides less precise information. Under certain conditions, the action variables of every solution are almost constant over a time interval that is exponentially long with respect to the inverse of the size of the perturbation. It can be seen as an "effective" stability result, more "physical" than the previous one.<br />

1.4.1 KAM theory<br />

1.4.1.1. Originally, KAM theory, named after Kolmogorov, Arnold and Moser, studies the persistence of certain quasi-periodic solutions (in other words, of quasi-periodic invariant tori) for perturbations of integrable Hamiltonian systems. We refer to [Bos86], [Pös01] or [dlL01] for very good introductions. In a broader sense, one speaks of KAM theory (or of KAM methods) whenever a small-divisor problem arises in dynamics (linearization of circle diffeomorphisms, of germs of holomorphic functions near an indifferent fixed point, reducibility of quasi-periodic cocycles close to a constant cocycle, ...), or in other areas of mathematics (geometry, partial differential equations).<br />

1.4.1.2. Let us therefore consider a near-integrable system of the form<br />

\[ \begin{cases} H(\theta, I) = h(I) + f(\theta, I) \\ |f|_{D} < \varepsilon \end{cases} \]<br />

[...] $\gamma > 0$ and $\tau \geq 0$, a vector $\omega \in \mathbb{R}^n$ is said to be $(\gamma, \tau)$-Diophantine if it satisfies the inequality<br />

\[ |k.\omega| \geq \gamma|k|_{1}^{-\tau}, \]<br />

for all $k \in \mathbb{Z}^n \setminus \{0\}$, where $|\,.\,|_{1}$ denotes the $\ell^{1}$ norm.<br />
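The Diophantine inequality can be tested on a finite range of multi-integers. The sketch below checks $|k.\omega| \geq \gamma|k|_1^{-\tau}$ for all non-zero $k$ with $|k|_1 \leq K$. Such a finite check can refute the condition but cannot prove it, since the definition quantifies over all of $\mathbb{Z}^n \setminus \{0\}$; the sample frequencies, the constants and the function name are assumptions made for this example.

```python
import math
from itertools import product

def is_diophantine_up_to(omega, gamma, tau, K):
    """True if |k.omega| >= gamma * |k|_1 ** (-tau) for every integer
    vector k with 0 < |k|_1 <= K (a necessary condition only)."""
    for k in product(range(-K, K + 1), repeat=len(omega)):
        norm1 = sum(abs(ki) for ki in k)
        if norm1 == 0 or norm1 > K:
            continue
        divisor = abs(sum(ki * wi for ki, wi in zip(k, omega)))
        if divisor < gamma * norm1 ** (-tau):
            return False
    return True

golden = (1.0, (math.sqrt(5) - 1) / 2)  # frequency built on the golden mean
resonant = (1.0, 2.0)                   # k = (2, -1) gives k.omega = 0
```

With $\tau = 1$, the golden-mean frequency passes the test for $\gamma = 0.5$ up to $K = 60$ and fails it for $\gamma = 0.9$ (already at $k = (0, 1)$), while the resonant frequency fails for every $\gamma > 0$.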


1.4 - Théorèmes de stabilité 33<br />

et<br />

On note D τ γ<br />

l’ensemble de ces vecteurs, et on pose<br />

Dγ = <br />

D τ γ , Dτ = <br />

τ>0<br />

D = <br />

τ>0,γ>0<br />

est l’ensemble des vecteurs diophantiens. Un tel vecteur ω ∈ D est donc "loin" des<br />

résonances. Lorsque |k|1 devient grand on peut alors contrôler la petitesse des diviseurs<br />

|k.ω|. Pour que D τ soit non vide, on montre facilement qu’il est nécessaire (et suffisant)<br />

que τ ≥ n −1. Si τ > n −1, il n’est alors pas difficile de prouver que D τ un ensemble de<br />

mesure de Lebesgue totale. En revanche, si τ = n −1, il est plus difficile de montrer que<br />

la mesure de Lebesgue est nulle et que la dimension de Hausdorff est maximale. Dans<br />

tous les cas, c’est un ensemble maigre au sens de la catégorie de Baire.<br />

1.4.1.3. The first result on the persistence of invariant tori, which contains the fundamental principles of the theory, is due to Kolmogorov ([Kol54]). The integrable part $h : B \to \mathbb{R}$ is said to be non-degenerate in the sense of Kolmogorov if the frequency map $\nabla h$ is a local diffeomorphism at every point. We denote by $\Omega = \nabla h(B)$ the space of frequencies, and by

\[ \Omega_\gamma = \{\omega \in \Omega \cap D_\gamma,\ d(\omega, \partial\Omega) \geq \gamma\} \]

a set of Diophantine frequencies sufficiently far from the boundary of $\Omega$. The following result can then be stated informally.

Theorem 1.14 (Kolmogorov). Consider a Hamiltonian system (∗), and assume that:

(i) the system is analytic;

(ii) $h$ is non-degenerate in the sense of Kolmogorov.

Then there exists a constant $c$, depending only on $h$, such that for $\varepsilon < c\gamma^2$, in the neighbourhood of every quasi-periodic solution of the unperturbed system with frequency $\omega \in \Omega_\gamma$, the system possesses a quasi-periodic solution with the same frequency.

More precisely, the last sentence means that for every $\omega \in \Omega_\gamma$ there exists an embedding

\[ \Psi^\varepsilon_\omega : \mathbb{T}^n \to \mathbb{T}^n \times B, \]

whose image $T$ is invariant under the Hamiltonian vector field, and which conjugates the restriction of the vector field to $T$ to a linear flow on $\mathbb{T}^n$ with frequency $\omega$. Moreover, if one assumes (without loss of generality) that the unperturbed torus with frequency $\omega$ is located at $I = 0$, then $\Psi^\varepsilon_\omega$ is $\sqrt{\varepsilon}$-close to the standard embedding $\Psi^0_\omega$, which maps $\mathbb{T}^n$ onto $\mathbb{T}^n \times \{0\}$.

For a detailed exposition of Kolmogorov's proof, see [BGGS84].


1.4.1.4. The two essential ideas of the proof are the following.

The first idea, already mentioned, is to prescribe the frequency $\omega$ of the torus one is looking for and to choose it Diophantine. In Kolmogorov's approach, this amounts to constructing a conjugacy $\Phi$ of the system to the normal form

\[ K(\theta, I) = \omega\cdot I + O(|I|^2), \]

called the "Kolmogorov normal form", which possesses an invariant torus of frequency $\omega$ at $I = 0$. The embedding $\Psi^\varepsilon_\omega$ is then given by $\Psi^\varepsilon_\omega = \Phi \circ \Psi^0_\omega$. To construct this conjugacy, one can proceed iteratively as explained earlier. At each step, after performing small translations in frequency space if necessary (which is possible thanks to the non-degeneracy of $h$), one obtains a homological equation that can be solved thanks to the arithmetic assumption on the frequency $\omega$.

There remains the problem of the convergence of the iterative scheme (the large multipliers), and this is where Kolmogorov's second idea comes in. Instead of a scheme that one might call linear, as in classical perturbation theory, he uses a Newton method. This is a quadratic scheme: after $n$ steps, instead of an error of order $\varepsilon^{n+1}$, the error is of order $\varepsilon^{2^n}$. This compensates for the poor estimates caused by the small divisors and guarantees convergence.
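To see why quadratic convergence can absorb the losses caused by small divisors, here is a toy computation. It assumes, purely for illustration, that step $j$ of the scheme satisfies $\varepsilon_{j+1} \leq C\kappa^{j}\varepsilon_j^2$; the constants $C \geq 1$ and $\kappa > 1$ are not taken from the text and merely stand for estimates that deteriorate at each step.

```latex
\[
  \delta_j := C\kappa^{j+1}\varepsilon_j
  \quad\Longrightarrow\quad
  \delta_{j+1} = C\kappa^{j+2}\varepsilon_{j+1}
  \le C\kappa^{j+2}\cdot C\kappa^{j}\varepsilon_j^{2}
  = \bigl(C\kappa^{j+1}\varepsilon_j\bigr)^{2} = \delta_j^{2},
\]
\[
  \text{hence}\quad \delta_j \le \delta_0^{2^{j}},
  \qquad
  \varepsilon_j \le \frac{(C\kappa\,\varepsilon_0)^{2^{j}}}{C\kappa^{j+1}} .
\]
```

In this model the errors tend to zero doubly exponentially as soon as $C\kappa\varepsilon_0 < 1$, in spite of the growing factor $\kappa^j$. By contrast, for a linear scheme $\varepsilon_{j+1} \leq C\kappa^{j}\varepsilon_0\varepsilon_j$ the analogous bound $\varepsilon_j \leq C^{j}\kappa^{j(j-1)/2}\varepsilon_0^{j+1}$ blows up as $j \to \infty$ for any fixed $\varepsilon_0 > 0$.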

1.4.1.5. A technically different proof based on the same ideas was then given by Arnold ([Arn63a]), and a theorem on the preservation of invariant curves for perturbations of certain integrable diffeomorphisms of the annulus $A$ (in finite differentiability) was proved by Moser ([Mos62]); hence the acronym KAM, for Kolmogorov, Arnold and Moser.

For a more modern proof, we refer to [Pös01] for what is commonly called KAM theory "with parameters": the idea is to treat the frequencies $\omega = \nabla h(I)$ as parameters independent of the action variables $I$, in order to obtain a single transformation

\[ \Phi : \mathbb{T}^n \times \Omega \longrightarrow \mathbb{T}^n \times B \]

which "straightens" all the invariant tori with frequencies $\omega \in \Omega_\gamma$. This makes it easier to study the regularity of the tori as a function of the frequency, and to obtain more quantitative information. In particular, one shows more easily that the relative measure of the complement of the set of invariant tori is of order $\sqrt{\varepsilon}$. This approach is implicit in Arnold ([Arn63a]) and Moser ([Mos67]), and fully explicit in Pöschel ([Pös82], see also [CG82]).

There are also many other methods of proof, which differ to a greater or lesser extent from the classical scheme proposed by Kolmogorov and completed by Arnold. Let us mention a few of them.

A first method, due essentially to Zehnder ([Zeh75], [Zeh76]) and to Herman ([Bos86], [Féj04]), reduces the problem to the application of a local inversion theorem (of Nash-Moser type) in a certain scale of function spaces. In this direction, one can also consult [Féj10] for a recent and simpler proof.

Another method of proof, due to Eliasson ([Eli96]), consists in analysing directly the convergence of the classical series of perturbation theory (Lindstedt series). More precisely, one adds terms to the series and then regroups them in a suitable way so as to exhibit certain sign cancellations which imply the convergence of the series.

Yet another approach, due to Levi-Moser in the case of invariant curves ([LM01]) and to Zehnder-Salamon in the general case ([SZ89]), is based on the Lagrangian formalism.


Let us also mention a proof due to Khanin, Lopes Dias and Marklof ([KLDM06] and [KLDM07]), in which the analysis of small divisors is replaced by simultaneous Diophantine approximation, more precisely by a renormalization procedure based on a multidimensional continued fraction algorithm.

Finally, a method was proposed recently by Rüssmann ([Rüs09]; according to the author, this work goes back to the 1980s), and was then taken up by Pöschel ([Pös09]) in the simpler setting of a perturbation of a constant vector field on the torus. The novelty lies in the disappearance of any quadratic (or rather super-linear) convergence, which had nevertheless seemed essential. The main idea is to split the perturbation $f$ into an "infrared" part $f_{\leq}$ and an "ultraviolet" part $f_{>}$; but, unlike a classical decomposition according to the order of the Fourier coefficients, Rüssmann puts into the ultraviolet part a certain fraction of the low-order Fourier coefficients. After averaging, he thereby obtains better estimates on the infrared part, valid on larger domains. Iterating the scheme is then very simple and, surprisingly, the convergence is very slow (at the $i$-th step, the new error term $f_i$ satisfies an inequality of the type $|f_i| \leq q^i |f|$, with $0 < q < 1$ close to $1$).

1.4.1.6. Note also that all the assumptions of the theorem can be weakened.

As regards regularity, it has been known since Moser ([Mos62]) that the theory holds in finite differentiability: the system needs to be of class $C^k$ with $k > 2n$ (see [Pös82], [Sal04], [SZ89] and [Alb07]), and this assumption is known to be essentially optimal for $n = 2$ ([Her86]; see also [KT09] for an optimal result in the related setting of the linearization of circle diffeomorphisms). The case of systems of Gevrey class was treated by Popov ([Pop04]).

As for the non-degeneracy assumption, Kolmogorov's condition can be replaced by Arnold's iso-energetic non-degeneracy condition ([Arn63a]), which requires that the frequency map does not vanish and that the associated "projective frequency" map (that is, the $n-1$ frequency ratios) is a local diffeomorphism in restriction to an energy level. The conclusion is then slightly modified: the KAM tori have the same energy and the same "projective" frequency as the unperturbed tori, but no longer necessarily the same frequency. Note that Kolmogorov's and Arnold's non-degeneracy conditions are completely independent. In [Arn63a], Arnold also proves a KAM theorem for properly degenerate integrable Hamiltonians, with a view to an application to the planetary problem (see also [Féj04]). The weakest assumption, which is minimal in the analytic case, is due to Rüssmann ([Rüs01]): the image of the frequency map must not be contained in a hyperplane.

Finally, the arithmetic assumption on the frequency can also be weakened: it suffices to use an arithmetic condition of Brjuno type ([Rüs01]).

1.4.1.7. To conclude, note that KAM theory does not apply to resonant tori. These are generally destroyed by the perturbation and give rise to "chaotic" dynamics. Nevertheless, these tori do not disappear entirely. Indeed, consider an $m$-resonant torus, which decomposes into an $m$-parameter family of tori of dimension $n-m$, and write its frequency as $\omega = (\hat\omega, 0) \in \mathbb{R}^{n-m} \times \mathbb{R}^m$.

For $m = n-1$, a resonant torus is foliated by periodic orbits with the same period. Bernstein and Katok ([BK87]) showed that if the integrable part is convex, at least $n$ periodic orbits persist. For $m = 1$, an analogous result is due to Cheng ([Che99]), who proves the existence of at least two tori of dimension $n-1$, under the assumption that the integrable part is non-degenerate and that the restricted frequency $\hat\omega$ is Diophantine.

On the other hand, in the intermediate cases $1 < m < n-1$, only partial results are available ([Tre91], [CW99]), which require additional assumptions on the integrable part or on the perturbation. The conjecture is the following: in a non-degenerate system, for an $m$-resonant torus with Diophantine restricted frequency $\hat\omega$, at least $m+1$, and generically $2^m$, tori of dimension $n-m$ survive. In other words, their number should equal the number of critical points of a smooth function on the torus $\mathbb{T}^m$.

1.4.2 Nekhoroshev theory

1.4.2.1. One can deduce from KAM theory that, for a sufficiently regular nearly integrable system whose integrable part is not too degenerate, there exists a constant $c'$ such that, for $\varepsilon$ small enough,

\[ |I(t) - I_0| \leq c'\sqrt{\varepsilon}, \quad t \in \mathbb{R}, \]

for "many" initial actions $I_0 \in B$, namely those whose evolution is quasi-periodic. Indeed, on the one hand the set of KAM tori is large in the sense of measure, since its complement has relative measure of order $\sqrt{\varepsilon}$ (it does not have full measure because of the condition $\varepsilon < c\gamma^2$); on the other hand each torus is $\sqrt{\varepsilon}$-close to the unperturbed torus, on which the action variables of the unperturbed system are constant, which accounts for the variation of the actions of the perturbed system.
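The order $\sqrt{\varepsilon}$ of the excluded set can be traced back to the threshold in Theorem 1.14. The sketch below assumes the standard estimate that, for fixed $\tau > n-1$, the frequencies of a bounded set that fail to be $(\gamma,\tau)$-Diophantine or lie within $\gamma$ of the boundary have measure $O(\gamma)$; the constant $C$ is a placeholder, not a value from the text.

```latex
\[
  \varepsilon < c\gamma^{2}
  \iff \gamma > \sqrt{\varepsilon/c},
  \qquad\text{so one may take } \gamma \sim \sqrt{\varepsilon},
\]
\[
  \operatorname{Leb}(\Omega\setminus\Omega_\gamma) \le C\gamma
  \quad\Longrightarrow\quad
  \operatorname{Leb}(\Omega\setminus\Omega_\gamma) = O(\sqrt{\varepsilon}).
\]
```

Since $\nabla h$ is a local diffeomorphism, an estimate of this kind in frequency space transfers to the action space, and hence to the set of tori.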

For $n = 2$, this stability property even holds for every solution, in the setting of Arnold's iso-energetic KAM theorem: on each energy level, which has dimension 3, a family of 2-dimensional invariant tori persists such that each connected component of the complement is bounded. Then either the solution is quasi-periodic, and its variation is of order $\sqrt{\varepsilon}$, or it is "trapped" between two quasi-periodic solutions, and an argument using the measure of the preserved tori shows that its variation is also of order $\sqrt{\varepsilon}$.

In 1964, Arnold ([Arn64]) proved that such a property no longer holds for $n \geq 3$. He constructed an example of a Hamiltonian system with three degrees of freedom which possesses a solution $(\theta(t), I(t))$ such that

\[ |I(\tau) - I_0| \geq 1, \]

with $\tau = \tau(\varepsilon)$, and this for every $\varepsilon > 0$ (more details will be given in the next section). Hence for $n \geq 3$, KAM theory does not provide a stability result valid for all solutions.

A system is then said to be effectively stable if there exist positive constants $b, c$ such that, for every initial action $I_0$,

\[ |I(t) - I_0| \leq c\varepsilon^{b}, \quad |t| \leq T(\varepsilon), \]

with

\[ \lim_{\varepsilon \to 0} T(\varepsilon) = +\infty. \]

For $n = 2$, KAM theory thus yields results of perpetual stability (or stability for infinite time), that is, $T(\varepsilon) = +\infty$ (and $b = 1/2$). We speak of polynomial (resp. exponential, resp. super-exponential) stability if $T(\varepsilon)$ is of order $\varepsilon^{-k}$, $k \in \mathbb{N}^*$ (resp. $\exp(\varepsilon^{-1})$, resp. $\exp\exp(\varepsilon^{-1})$).

1.4.2.2. In the 1970s, Nekhoroshev proved that, if the system is analytic and if the integrable part satisfies a "generic" condition, then it is exponentially stable. Here is an informal statement.

Theorem 1.15 (Nekhoroshev). Consider a Hamiltonian system (∗), and assume that:

(i) the system is analytic;

(ii) $h$ satisfies a "generic" condition.

Then there exist constants $c_1, c_2, c_3, a, b, \varepsilon_0$, depending only on $h$, such that for $\varepsilon \leq \varepsilon_0$,

\[ |I(t) - I_0| \leq c_1\varepsilon^{b}, \quad |t| \leq c_2\exp\bigl(c_3\varepsilon^{-a}\bigr), \]

for every initial action $I_0 \in B_{R/2}$.

For the original proof, the only references are [Nek77] and [Nek79].

The constants $a$ and $b$ are called stability exponents. The value of $a$ is the more important one, since it gives the precise stability time, which is of order $\exp(\varepsilon^{-a})$. The value given by Nekhoroshev is $a \sim n^{-2}$; naturally, it tends to zero as $n$ tends to infinity.

The assumption made on $h$ by Nekhoroshev is a certain quantitative transversality condition, called steepness. It seems tailored to the proof and is therefore not very conceptual. However, there is a simple geometric characterization due to Ilyashenko and Niederman ([Ily86], [Nie06]): the restriction of $h$ to any proper affine subspace has only isolated critical points. Note that Nekhoroshev also assumed that $h$ has no critical points and is non-degenerate in the sense of Kolmogorov, but these assumptions are in fact superfluous ([Nie07b]).

We have called the condition on $h$ generic; this is somewhat imprecise. To the best of my knowledge, the set of non-steep functions has infinite codimension in the space of functions of class $C^\infty$. A generic condition of Diophantine steepness was given by Niederman in [Nie07b]: it is a prevalent condition in the space of functions of class $C^k$, $k > 2n+2$, prevalence being one possible generalization of the notion of full Lebesgue measure to infinite-dimensional spaces.

1.4.2.3. Let us now briefly explain the main ideas, still from the viewpoint of the three difficulties of classical perturbation theory.

First of all, the problem of resonances cannot be ignored here. The strategy is then to change the averaged system: the space average $\langle f \rangle$ is replaced by the time average $[f]$ along the unperturbed flow, that is

\[ [f] = \lim_{t\to\infty} \frac{1}{t} \int_0^t f \circ \Phi^h_s \, ds, \]

and one considers the new averaged system $[H] = h + [f]$. The latter is no longer necessarily integrable: for an action $I \in B$ with frequency $\omega = \nabla h(I)$, we have

\[ [f](\theta, I) = \lim_{t\to\infty} \frac{1}{t} \int_0^t f(\theta + s\omega, I)\, ds. \]

The average thus depends on $\omega$, and more precisely on its resonance module $M(\omega)$. When $\omega$ is non-resonant, the linear flow with frequency $\omega$ is ergodic and, by Birkhoff's ergodic theorem, one recovers $[f] = \langle f \rangle$, but this is not true in general. It is useful to keep in mind the example of standard resonances: if $\omega = (\hat\omega, 0) \in \mathbb{R}^{n-m} \times \mathbb{R}^m$ with $\hat\omega$ non-resonant, then

\[ [f](\theta, I) = f(\theta_{n-m+1}, \dots, \theta_n, I), \]

where the right-hand side stands for $f$ averaged over the first $n-m$ angles, and one deduces immediately that the action variables $I_1, \dots, I_{n-m}$ are constant (but nothing more is known about the variables $I_{n-m+1}, \dots, I_n$).
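In Fourier variables the time average has a transparent description. The following computation assumes that $f$ is regular enough for its Fourier series to converge absolutely, and uses the usual definition $M(\omega) = \{k \in \mathbb{Z}^n : k\cdot\omega = 0\}$ of the resonance module; it shows that $[f]$ keeps exactly the harmonics lying in $M(\omega)$.

```latex
\[
  f(\theta,I)=\sum_{k\in\mathbb{Z}^n} f_k(I)\,e^{2\pi i k\cdot\theta},
  \qquad
  \frac1t\int_0^t e^{2\pi i k\cdot(\theta+s\omega)}\,ds
  \;\xrightarrow[t\to\infty]{}\;
  \begin{cases}
    e^{2\pi i k\cdot\theta}, & k\cdot\omega=0,\\
    0, & k\cdot\omega\neq 0,
  \end{cases}
\]
\[
  \text{so that}\qquad
  [f](\theta,I)=\sum_{k\in M(\omega)} f_k(I)\,e^{2\pi i k\cdot\theta}.
\]
```

For $\omega = (\hat\omega, 0)$ with $\hat\omega$ non-resonant, $M(\omega) = \{0\} \times \mathbb{Z}^m$, so only the last $m$ angles survive; for a non-resonant $\omega$, only $k = 0$ survives and $[f] = f_0 = \langle f \rangle$.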

When comparing the system $H$ with the averaged system $[H]$, one meets the small divisor problem again; the idea is then to consider "resonant-Diophantine" frequencies. In terms of the standard resonances $(\hat\omega, 0)$, this amounts to requiring that $\hat\omega$ be Diophantine. More precisely, given a fixed submodule $M$ of $\mathbb{Z}^n$ (to be thought of as a resonance module), one considers in action space the domain $B_M$ that is "non-resonant modulo $M$", defined as follows: if $I \in B_M$ and $\omega = \nabla h(I)$, then either $k\cdot\omega = 0$ for $k \in M$, or one has a control on $|k\cdot\omega|$ for the multi-integers $k \notin M$. It is on such domains $B_M$ that $H$ can be compared with $[H]$.

Finally, there remains the problem of large multipliers, namely the convergence of the iterative scheme. Unfortunately there is no remedy here: in general the scheme diverges. One trick is then to perform a finite but "asymptotically infinite" number of steps: for fixed $\varepsilon$, one performs a number of steps of order $\varepsilon^{-a}$, $0 < a < 1$, so that this number tends to infinity as $\varepsilon$ tends to zero. The effect is the following: on a non-resonant domain, $H$ can be conjugated to $[H] + \tilde f$, where $\tilde f$ is a general term that is exponentially small with respect to $\varepsilon^{-a}$. One then deduces an exponential stability result, but only a "local" and "partial" one: local in the sense that it holds only for solutions that remain in a non-resonant domain where the normal form is valid, and partial because, since the averaged system is no longer necessarily integrable, the evolution of the action variables can only be controlled in certain directions (think of the case of a standard resonance).
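The way a number of steps of order $\varepsilon^{-a}$ produces an exponentially small remainder can be illustrated by a toy estimate. It rests on a simplifying assumption that is not made in the text: each normalization step divides the size of the non-averaged part by a fixed factor $e$, and $N = \varepsilon^{-a}$ steps can be carried out.

```latex
\[
  |\tilde f_{j+1}| \le e^{-1}\,|\tilde f_j|, \quad |\tilde f_0| \le \varepsilon
  \quad\Longrightarrow\quad
  |\tilde f_N| \le \varepsilon\,e^{-N} = \varepsilon\exp\!\bigl(-\varepsilon^{-a}\bigr)
  \quad\text{for } N=\varepsilon^{-a}.
\]
```

Heuristically, a remainder of this size can move the actions only at a speed of order $\varepsilon\exp(-\varepsilon^{-a})$ in the directions where the averaged system forces no motion, which is the source of the exponentially long stability times.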

Lastly, a very intricate final argument of "geometry of resonances" makes it possible to conclude (this is certainly the most difficult step). It consists in covering phase space by non-resonant domains in a suitable way and then, using a transversality property of the integrable system, in gluing together the local and partial stability results valid on each domain into a global stability result.

1.4.2.4. The preceding proof is very complicated, but it simplifies in the special case where $h$ is quasi-convex (that is, its energy sublevels are convex): on non-resonant domains one obtains a stability result that is local but no longer partial, which subsequently makes the geometry of resonances easier.

However, in the quasi-convex case, there is a "revolutionary" method due to Lochak, which is at the same time simpler, more elegant, and yields important quantitative improvements ([Loc92]; see also [Loc93] for a non-technical exposition of the ideas). The essential point in Lochak's approach is to consider only periodic frequencies, that is, vectors $\omega \in \mathbb{R}^n$ for which there exists a real number $T > 0$ such that

\[ T\omega \in \mathbb{Z}^n. \]

The simplest example is given by a vector with rational components. These are the maximal resonances, in the sense that the module $M(\omega)$ has rank $n-1$ (and conversely every submodule of $\mathbb{Z}^n$ of rank $n-1$ is of this form). The term periodic comes of course from the fact that all the orbits of a linear flow with frequency $\omega$ are $T$-periodic.

As for the construction of normal forms, it is greatly simplified in this special case because there are no small divisors. Indeed, if $k \in \mathbb{Z}^n \setminus \{0\}$ is not in resonance with $\omega$, that is if $k\cdot\omega \neq 0$, then

\[ |k\cdot\omega| \geq T^{-1}, \]

and moreover this lower bound is uniform in $k$. From another point of view, the homological equation $\omega\cdot\partial_\theta\chi = g$ is easily solved by an integral formula

\[ \chi(\theta) = T^{-1} \int_0^T g(\theta + s\omega)\, s\, ds, \]

and small divisors do not appear.
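Both claims can be checked in a few lines. The computation below uses only $T\omega \in \mathbb{Z}^n$ and the periodicity of $g$ on $\mathbb{T}^n = \mathbb{R}^n/\mathbb{Z}^n$; for the integral formula it assumes in addition that $g$ has zero average along the flow, $\int_0^T g(\theta + s\omega)\,ds = 0$, which is the natural solvability condition.

```latex
\[
  k\in\mathbb{Z}^n,\ T\omega\in\mathbb{Z}^n
  \ \Longrightarrow\ T\,k\cdot\omega = k\cdot(T\omega)\in\mathbb{Z},
  \qquad\text{so}\quad k\cdot\omega\neq 0 \ \Longrightarrow\ |k\cdot\omega|\ge T^{-1}.
\]
\[
  \omega\cdot\partial_\theta\chi(\theta)
  = \frac1T\int_0^T s\,\frac{d}{ds}\,g(\theta+s\omega)\,ds
  = \Bigl[\tfrac{s}{T}\,g(\theta+s\omega)\Bigr]_{0}^{T}
    -\frac1T\int_0^T g(\theta+s\omega)\,ds
  = g(\theta+T\omega) - 0 = g(\theta).
\]
```

Without the zero-average assumption, the same integration by parts gives $\omega\cdot\partial_\theta\chi = g - [g]$, where $[g]$ is the average of $g$ over one period of the flow.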

Using quasi-convexity, one easily deduces a local stability result; then, with the help of a theorem of Dirichlet on the approximation of vectors by vectors with rational coordinates, one obtains the global stability result in a remarkably simple way.

As for the quantitative improvements, Lochak and Neishtadt ([LN92]) showed that the stability exponents can be chosen as

\[ a = b = \frac{1}{2n}. \]

We shall see later that the value of the exponent $a$ is almost optimal. Moreover, Lochak's approach brings out a particular and extremely surprising phenomenon of stabilization by resonances: if a solution passes sufficiently close to a resonance of fixed multiplicity $m$, $0 \leq m \leq n-1$, then one has local stability exponents

\[ a_m = b_m = \frac{1}{2(n-m)}. \]

Although resonances are the main cause of asymptotic instability, they favour stability over finite times. Let us add that all these results can also be recovered by Nekhoroshev's classical approach, as shown by Pöschel ([Pös93]).

Finally, note that, using Lochak's approach, Marco and Sauzin ([MS02]) generalized the result to Hamiltonians of class $\alpha$-Gevrey, $\alpha \geq 1$, obtaining the stability exponents

\[ a = \frac{1}{2\alpha n}, \qquad b = \frac{1}{2n}, \]

and the local stability exponents

\[ a_m = \frac{1}{2\alpha(n-m)}, \qquad b_m = \frac{1}{2(n-m)}. \]

Since the 1-Gevrey functions are exactly the analytic functions, this is indeed a generalization of the previous results.
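To get a feeling for these exponents, the following sketch tabulates them and the resulting order of magnitude of the stability time. The function names are hypothetical, and the constants $c_2, c_3$ of Theorem 1.15 are simply ignored, so the numbers only indicate how the time scale $\exp(\varepsilon^{-a})$ depends on $n$, $m$ and $\alpha$.

```python
import math


def stability_exponents(n, m=0, alpha=1.0):
    """Exponents quoted above: a = 1/(2*alpha*(n-m)), b = 1/(2*(n-m))."""
    if not 0 <= m <= n - 1:
        raise ValueError("need 0 <= m <= n - 1")
    if alpha < 1:
        raise ValueError("the Gevrey index alpha must be >= 1")
    a = 1.0 / (2.0 * alpha * (n - m))
    b = 1.0 / (2.0 * (n - m))
    return a, b


def log_stability_time(eps, n, m=0, alpha=1.0):
    """Natural logarithm of exp(eps**(-a)), i.e. eps**(-a), constants ignored."""
    a, _ = stability_exponents(n, m, alpha)
    return eps ** (-a)


# Analytic case, three degrees of freedom: a = b = 1/6.
a, b = stability_exponents(n=3)

# Near a resonance of multiplicity m = 1 the exponents improve to 1/4.
a1, b1 = stability_exponents(n=3, m=1)
```

With `eps = 1e-6` and `n = 3`, `log_stability_time` is about 10 away from resonances and about 31.6 near a simple resonance, which illustrates the stabilization by resonances described above.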

1.5 Examples of instability

So far, we have seen that the solutions of a nearly integrable Hamiltonian system are stable on a large part of phase space by the KAM theorem and that, in the complement, they are stable over a long time interval by Nekhoroshev's theorem. We now turn to the limits of validity of these stability results; in other words, we want to study the behaviour of solutions in the region of space and time not covered by KAM theory and Nekhoroshev theory.

The aim of this section is to describe the mechanism of "global" instability introduced by Arnold, together with the various generalizations that followed. We stress that a large part of the recent work on instability uses variational methods introduced by Mather but, owing to the author's lack of expertise in this area, we shall confine ourselves to describing certain "geometric" aspects.

For a nice presentation of Arnold's mechanism, one can consult [Mar96] or [Ber06]; for more information, the essential reference is [Loc99].

1.5.1 Arnold's mechanism

1.5.1.1. The starting point is the following example. Consider the Hamiltonian defined on $\mathbb{T}^3 \times \mathbb{R}^3$, depending on two positive parameters $\varepsilon$ and $\mu$:

\[ H_{\varepsilon,\mu}(\theta, I) = \tfrac{1}{2}(I_1^2 + I_2^2) + I_3 + \varepsilon(\cos 2\pi\theta_1 - 1) + \varepsilon\mu(\cos 2\pi\theta_1 - 1)(\cos 2\pi\theta_2 + \sin 2\pi\theta_3). \]

In 1964, Arnold proved the following theorem.

Theorem 1.16. For every $\varepsilon > 0$, there exists $\mu_0(\varepsilon)$ such that for $0 < \mu < \mu_0(\varepsilon)$, the Hamiltonian system defined by $H_{\varepsilon,\mu}$ possesses an orbit $(\theta(t), I(t))$ satisfying

\[ |I(\tau) - I(0)| \geq 1, \]

for some time $\tau = \tau(\varepsilon)$.

This instability phenomenon, namely a variation of order 1 of the action variables for every $\varepsilon > 0$, however small, is usually called Arnold "diffusion". The time $\tau(\varepsilon)$ is called the time of instability (or of drift, or of diffusion). The mechanism introduced by Arnold to prove this result is contained in [Arn64] (see also [AA89]), and we shall explain it briefly.


1.5.1.2. Let us begin with the case $\varepsilon = \mu = 0$. Then

\[ H_{0,0}(\theta, I) = \tfrac{1}{2}(I_1^2 + I_2^2) + I_3 \]

is a quasi-convex integrable system (the simplest possible one), and the action variables are therefore constant.

Now, for $\varepsilon > 0$, $\mu = 0$, the system

\[ H_{\varepsilon,0}(\theta, I) = \tfrac{1}{2}(I_1^2 + I_2^2) + I_3 + \varepsilon(\cos 2\pi\theta_1 - 1) \]

remains integrable in the sense of Liouville, and it is easy to see that the action variables $I_2, I_3$ are constant while $I_1$ has a maximal variation of $2\sqrt{\varepsilon}$.

The most important feature of the system $H_{\varepsilon,0}$ is that it possesses hyperbolic invariant objects. Indeed, one easily sees that it is the direct product of a pendulum (in the coordinates $(\theta_1, I_1)$) and of an integrable system in action-angle form. The pendulum has a hyperbolic fixed point at $\{\theta_1 = 0, I_1 = 0\}$, whose stable and unstable manifolds (the separatrices) coincide. Thus, in each energy level, the system $H_{\varepsilon,0}$ possesses a continuous one-parameter family of partially hyperbolic invariant tori of dimension 2: for instance, in the zero energy level one finds

\[ T_s = \{\theta_1 = 0,\ I_1 = 0,\ I_2 = s,\ I_3 = -s^2/2\}, \quad s \in \mathbb{R}. \]

Each of these tori has stable and unstable manifolds of dimension 3 which coincide (they are given by the direct product of the torus with the stable and unstable manifold of the hyperbolic fixed point). For dimensional reasons, one easily sees that the hyperbolicity of the tori is only partial (they are not normally hyperbolic).
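These statements about the pendulum factor $P(\theta_1, I_1) = \tfrac12 I_1^2 + \varepsilon(\cos 2\pi\theta_1 - 1)$ follow from a short computation; the name $P$ is introduced here only for convenience.

```latex
\[
  \dot\theta_1=\partial_{I_1}P=I_1,\qquad
  \dot I_1=-\partial_{\theta_1}P=2\pi\varepsilon\sin 2\pi\theta_1
  \approx 4\pi^{2}\varepsilon\,\theta_1 \ \text{ near } \theta_1=0,
\]
\[
  \text{so the linearization at } (0,0) \text{ has eigenvalues } \pm 2\pi\sqrt{\varepsilon}
  \text{ (a hyperbolic fixed point).}
\]
\[
  \text{On the separatrix } P=0:\quad
  I_1^{2}=2\varepsilon\,(1-\cos 2\pi\theta_1)\le 4\varepsilon,
  \qquad |I_1|\le 2\sqrt{\varepsilon}.
\]
\[
  \text{On } T_s:\quad H_{\varepsilon,0}=\tfrac12 s^{2}+\bigl(-\tfrac12 s^{2}\bigr)=0 .
\]
```

The expansion rate $2\pi\sqrt{\varepsilon}$ degenerates as $\varepsilon \to 0$, and the bound $|I_1| \leq 2\sqrt{\varepsilon}$ on the separatrix gives the size of the variation of $I_1$ mentioned above.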

Let us now see what can happen for $\mu > 0$. On the one hand, note that in Arnold's example the Hamiltonian vector fields generated by $H_{\varepsilon,0}$ and $H_{\varepsilon,\mu}$ coincide on $\{\theta_1 = 0\}$, so that in each energy level there is still a continuous family of partially hyperbolic tori for $H_{\varepsilon,\mu}$.
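The coincidence of the two vector fields on $\{\theta_1 = 0\}$ comes from the factor $(\cos 2\pi\theta_1 - 1)$, which vanishes there together with its derivative. Writing $F$ (a name introduced here) for the $\mu$-dependent term of $H_{\varepsilon,\mu}$:

```latex
\[
  F(\theta)=\varepsilon\mu\,(\cos 2\pi\theta_1-1)\,(\cos 2\pi\theta_2+\sin 2\pi\theta_3),
\]
\[
  \partial_{\theta_1}F=-2\pi\varepsilon\mu\,\sin 2\pi\theta_1\,(\cos 2\pi\theta_2+\sin 2\pi\theta_3),
  \qquad
  \partial_{\theta_j}F=\varepsilon\mu\,(\cos 2\pi\theta_1-1)\,
  \partial_{\theta_j}(\cos 2\pi\theta_2+\sin 2\pi\theta_3),\quad j=2,3,
\]
\[
  \text{and all three vanish at } \theta_1=0;\ \text{since } \partial_I F=0,\
  \text{the Hamiltonian vector field of } F \text{ is zero on } \{\theta_1=0\}.
\]
```

In particular, the tori contained in $\{\theta_1 = 0, I_1 = 0\}$ are left untouched by the $\mu$-perturbation, which is a very special feature of this example.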

On the other hand, one can prove the following property: there exists $\mu_0(\varepsilon) > 0$ such that for $0 < \mu < \mu_0(\varepsilon)$, the stable and unstable manifolds of the tori $T_s$ intersect transversally (within an energy level) along at least one homoclinic orbit. In this example, this can be shown by a simple computation of an integral (known as the Poincaré-Melnikov integral), but it requires choosing $\mu_0(\varepsilon)$ exponentially small with respect to $\varepsilon$.

One thus obtains a one-parameter family of partially hyperbolic tori $(T_s)_{s\in\mathbb{R}}$ which possess transverse homoclinic orbits, and one can deduce that two consecutive tori that are sufficiently close are connected by a transverse heteroclinic orbit. One can then extract from this family a finite sequence of tori

\[ T_i = T_{s_i}, \quad 1 \leq i \leq N, \]

with the following properties: for each $1 \leq i \leq N$, the dynamics on the torus $T_i$ is minimal and (for $i < N$) its unstable manifold $W^+_i$ intersects transversally the stable manifold $W^-_{i+1}$ of the torus $T_{i+1}$ along a heteroclinic orbit. This is what is called a "transition chain", and by choosing $s_1 < 0$ and $s_N > 1$, one easily deduces the existence of a pseudo-orbit with a drift of order 1.


It finally remains to shadow this pseudo-orbit, that is, to construct an orbit that follows the transition chain. To this end, Arnold introduces the notion of "obstruction property", which can be stated as follows: $T_i$ has this property when every submanifold $M$ that is invariant under the flow and intersects the unstable manifold $W^+_i$ transversally (in restriction to an energy level) satisfies $W^-_i \subseteq \overline{M}$. Using the fact that the tori making up the transition chain have this obstruction property, a simple topological argument yields the existence of a drifting orbit.

1.5.1.3. This, then, is the mechanism proposed by Arnold, as he illustrated it on a simple but concrete example. He also conjectured that this instability phenomenon is "generic". However, some of the arguments he uses in the construction of his example are not fully justified, and moreover difficulties arise when one tries to use this mechanism in more general situations.

First, the obstruction property introduced by Arnold is not proved, and because the hyperbolicity of the tori is only partial, it does not follow immediately from the classical inclination lemma ($\lambda$-lemma) valid for hyperbolic fixed points. Likewise, the existence of transverse heteroclinic connections between two sufficiently close tori does not follow trivially from the existence of transverse homoclinic orbits. The justifications came later, first in [Dou88] and [CG94] in special cases, then in [Mar96] (see also [Cre97]) for a more general case which applies to Arnold's example. A sufficiently general inclination lemma is now available in this partially hyperbolic context (see [FM00] and [FM01]). However, we shall see below that the most suitable framework for this kind of problem is that of normally hyperbolic manifolds. One can also consult [Bes96] for a variational construction of a drifting orbit in Arnold's example.

Next, a more serious difficulty arises if one tries to construct more general examples. Recall that in Arnold's example, the vector field generated by the perturbation in $\mu$ vanishes along a submanifold, which makes it possible to continue to values $\mu > 0$ the continuous family of partially hyperbolic tori present at $\mu = 0$. For a "generic" perturbation, this property cannot be preserved. However, a version of the KAM theorem shows that many partially hyperbolic tori persist (see [Gra74] for instance), but they no longer form a continuous family: only the tori with a sufficiently non-resonant frequency persist. "Gaps" then appear between two consecutive invariant tori, so that the stable manifold of one torus no longer necessarily intersects the unstable manifold of the next one: this is the "large gap problem".

Finally, with a view to the genericity of this mechanism, the most serious difficulty is the presence of two perturbation parameters $\varepsilon$ and $\mu$ that cannot be chosen in the same way. The example clearly shows that the parameter $\varepsilon$ introduces hyperbolicity into the system while preserving integrability, whereas the parameter $\mu$ destroys integrability and makes instability possible. It is important to note, however, that the existence of drifting orbits can only be proved when $\mu$ is chosen exponentially small with respect to $\varepsilon$. In other words, it is more accurate to regard the example as a perturbation of size $\mu$ of the Hamiltonian
\[
H_{\varepsilon,0}(\theta, I) = H_1(\theta_1, I_1) + H_2(I_2, I_3) = \frac{1}{2} I_1^2 + \varepsilon(\cos 2\pi\theta_1 - 1) + \frac{1}{2} I_2^2 + I_3.
\]

Thus Arnold's mechanism really applies only to perturbations of a partially hyperbolic integrable system (integrable in the sense of Liouville), which is called "a priori unstable" (following the terminology introduced in [CG94]; one also speaks of an initially hyperbolic system), as opposed to systems that are integrable in action-angle coordinates, in which hyperbolicity is absent and which are "a priori stable" (or initially elliptic).

1.5.2 Simple resonances and multiple resonances

We now explain how Arnold's example makes it possible to study instability in near-integrable Hamiltonian systems more generally.

1.5.2.1. Consider then a sufficiently regular near-integrable Hamiltonian system of the form
\[
H(\theta, I) = h(I) + \varepsilon f(\theta, I), \qquad (\theta, I) \in \mathbb{T}^n \times \mathbb{R}^n,
\]
where the small parameter $\varepsilon$ has been factored out in order to specify the size of the perturbation more simply. For $m \ge 1$, consider an action $I_*$ that is resonant of multiplicity $m$, which means that its frequency $\omega_* = \nabla h(I_*)$ is $m$-resonant. By a linear symplectic change of variables, one can reduce to the case $\omega_* = (0, \hat\omega_*) \in \mathbb{R}^m \times \mathbb{R}^{n-m}$ with $\hat\omega_*$ non-resonant, and we assume that the latter is Diophantine. One can then apply the theory of resonant normal forms: in a neighbourhood of $I_*$, writing $I = (I_1, I_2) \in \mathbb{R}^m \times \mathbb{R}^{n-m}$ and $(\theta_1, \theta_2) \in \mathbb{T}^m \times \mathbb{T}^{n-m}$, the system is conjugate to
\[
H(\theta, I) = h(I) + \varepsilon g(\theta_1, I) + \mu(\varepsilon) \tilde f(\theta, I),
\]
where $\mu(\varepsilon)$ can be made small compared with $\varepsilon$ (exponentially small if the system is analytic). Neglecting this last term, one obtains the averaged system:
\[
[H](\theta_1, I) = h(I) + \varepsilon g(\theta_1, I).
\]

Setting $V(\theta_1) = g(\theta_1, 0)$, after a translation bringing $I_*$ to $0$ and a Taylor expansion at $0$, the principal part of the averaged Hamiltonian is described by the sum of two Hamiltonians
\[
H_*(\theta, I) = H_1(\theta_1, I_1) + H_2(I_2) = A I_1 \cdot I_1 + \varepsilon V(\theta_1) + \hat\omega_* \cdot I_2 + B I_2 \cdot I_2,
\]
where $A$ and $B$ are symmetric matrices of size $m$ and $n - m$ respectively (see [LMS03] for more details on these transformations). The Hamiltonian $H_1$ describes a generalized pendulum (non-integrable if $m \ge 2$), and the Hamiltonian $H_2$ is a rotator.

1.5.2.2. For simple resonances ($m = 1$), one recovers in particular Arnold's example ($A = 1$, $V(\theta_1) = 1 - \cos 2\pi\theta_1$). More generally, if the potential $V : \mathbb{T} \to \mathbb{R}$ has a non-degenerate maximum, then the system $H_*$ possesses, in every energy level, a continuous family of partially hyperbolic tori of dimension $n - 1$. These are the a priori unstable systems, which therefore model the dynamics of a near-integrable system in the neighbourhood of a simple resonance. However, the size of this neighbourhood is only of order $\sqrt{\varepsilon}$, so instability results for a priori stable systems cannot be obtained directly by studying a priori unstable systems (see however [Ber09] for a new approach).
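The $\sqrt{\varepsilon}$ scale can be made explicit on the pendulum part of Arnold's example, $H_1(\theta_1, I_1) = \frac{1}{2} I_1^2 + \varepsilon(\cos 2\pi\theta_1 - 1)$. The following is only an illustrative sketch: the general potential $V$ is replaced by this explicit one, and the sign convention $\dot\theta_1 = \partial_{I_1} H_1$, $\dot I_1 = -\partial_{\theta_1} H_1$ is an assumption made here.

```latex
% Hyperbolic equilibrium of H_1 at (\theta_1, I_1) = (0, 0), on the level H_1 = 0.
\dot\theta_1 = I_1, \qquad \dot I_1 = 2\pi\varepsilon \sin 2\pi\theta_1
\quad\Longrightarrow\quad
\ddot{\delta\theta}_1 = 4\pi^2 \varepsilon\, \delta\theta_1, \qquad \lambda = \pm 2\pi\sqrt{\varepsilon}.
% Separatrices: the rest of the level set H_1 = 0.
\tfrac12 I_1^2 = \varepsilon (1 - \cos 2\pi\theta_1) = 2\varepsilon \sin^2 \pi\theta_1
\quad\Longrightarrow\quad
I_1 = \pm 2\sqrt{\varepsilon}\, |\sin \pi\theta_1|, \qquad \max |I_1| = 2\sqrt{\varepsilon}.
```

In this model the region enclosed by the separatrices has width of order $\sqrt{\varepsilon}$ in the action variable, and the hyperbolicity (the exponent $\lambda$) is of the same order.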



If one chooses a perturbation of $H_*$ that vanishes along the hyperbolic tori, Arnold's mechanism can be applied to show the existence of drifting orbits, but for a "generic" perturbation one is faced with the "large gap problem". The situation is now well understood, however, and owes much to the following remark: instead of considering each partially hyperbolic invariant torus individually, it is more judicious to observe that their union forms a normally hyperbolic invariant manifold in the sense of [HPS77], which therefore continues much more easily to perturbations of the system $H_*$.

The use of normally hyperbolic manifolds is an essential ingredient of the solution to the "large gap problem" proposed by Delshams, de la Llave and Seara ([DdlLS06], see also [DH09] for a more general result), in the case $n = 3$. Using tools from classical perturbation theory (resonant averaging, KAM theory), they obtain a detailed description of the dynamics restricted to the perturbed normally hyperbolic manifold: to deal with the "gaps" created by the resonances, they show that these resonances give rise to other invariant tori ("secondary" tori) that can be incorporated into a generalized transition chain. The transverse dynamics, generated by the homoclinic and heteroclinic intersections of the normally hyperbolic manifold, is then studied more globally by means of the "scattering map" (which is in fact not a map but a correspondence). Finally, drifting orbits are constructed using a partially hyperbolic inclination lemma, although in this setting it is more convenient to use a normally hyperbolic inclination lemma ([ESM10]).

Note also that other geometric methods have made it possible to solve the "large gap problem". On the one hand, we mention the work of Treschev ([Tre04], [PT07]), and on the other hand, the work of Gidea and Robinson ([GR07], [GR09]), in which Birkhoff instability orbits are used to cross the "gaps" (in a very simplified setting). Finally, following the work of Mather (in particular [Mat91] and [Mat93]), the best results have been obtained by Cheng and Yan ([CY04], [CY09]) and Bernard ([Ber08]).

1.5.2.3. For multiple resonances ($m \ge 2$), the averaged system is no longer integrable and very little is known (to my knowledge), whether on examples or for general classes of Hamiltonian systems. For the moment, one is essentially restricted to double resonances in systems with three degrees of freedom, for which the only general results are those announced by Mather ([Mat04]) and those announced very recently by Marco.

A very particular and relatively easy case of double resonances (also known as "three time-scale systems"), which corresponds to the intersection of two simple resonances, one "strong" and the other "weak", was studied in [Hal95], [Hal97]. We also mention a nice example of switching between simple resonances, due to Bessi ([Bes97a]). He constructs, again by variational methods, an orbit that drifts along a simple resonance, approaches a double resonance and then drifts along another simple resonance. A similar example of resonance switching can also be obtained by geometric methods ([Bou09a]).

Finally, to conclude, it is of course the dynamics in the neighbourhood of double (and multiple) resonances that must be studied and understood, in particular in view of Arnold's genericity conjecture. The same is true if one wishes to exhibit more striking instability phenomena, such as the following question raised by Herman ([Her98]): construct a Hamiltonian of class $C^\infty$, $C^r$-close for $r \ge 2$ to the integrable Hamiltonian $h(I) = \frac{1}{2}|I|^2$, that has a dense orbit on the energy level $H^{-1}(1)$ (this is a version of the quasi-ergodic hypothesis). Recent progress in this direction was made first by Kaloshin, Levi and Saprykina in [KLS10], where they construct an orbit that is dense in a subset (of an energy level) of maximal Hausdorff dimension, and then by Kaloshin, Zhang and Zheng ([KZZ09]), where they announce the construction of an orbit that is dense in a subset whose measure is greater than $\frac{1}{2}$.

1.5.3 Instability time

We now turn to more quantitative aspects of instability phenomena, more precisely to estimates of the instability time for analytic (or Gevrey) systems.

1.5.3.1. Let us begin with the a priori stable case, that is, near-integrable Hamiltonian systems with a perturbation of size $\varepsilon$.

Recall that, by Nekhoroshev's theorem, solutions are stable for an exponentially long time $T(\varepsilon)$. This yields a lower bound on the instability time
\[
\tau(\varepsilon) > T(\varepsilon) \sim \exp\left(\varepsilon^{-a}\right), \qquad a \sim \frac{1}{n^2}.
\]

As already explained, in the quasi-convex case this result was later improved by Lochak, and then by Marco and Sauzin in the Gevrey case. One obtains
\[
\tau(\varepsilon) > T(\varepsilon) \sim \exp\left(\varepsilon^{-a}\right), \qquad a = \frac{1}{2n},
\]
and more generally in the $\alpha$-Gevrey case, $\alpha \ge 1$,
\[
\tau(\varepsilon) > T(\varepsilon) \sim \exp\left(\varepsilon^{-a}\right), \qquad a = \frac{1}{2\alpha n}.
\]

Long before, after studying Arnold's example, Chirikov ([Chi79], to whom the term Arnold diffusion is due) had predicted that the value $a = (2n)^{-1}$ should be optimal in the analytic category: he formulated several conjectures on the links between the instability time and the splitting of the invariant manifolds of the partially hyperbolic tori, which suggest an instability exponent equal to $(2n)^{-1}$. We will see in this thesis that this is not quite correct.

As for upper bounds on the instability time, it is now known that
\[
\tau(\varepsilon) \sim \exp\left(\varepsilon^{-a'}\right), \qquad a' = \frac{1}{2(n-2)}.
\]
This result is due to several authors: Bessi ([Bes96], [Bes97b]) for the cases $n = 3$ and $n = 4$, Herman, Marco and Sauzin ([MS02]) in the non-analytic Gevrey case, Lochak and Marco ([LM05]), who obtained $(2(n-3))^{-1}$ in the analytic case, and finally Ke Zhang ([Zha09]), who obtained the general result.

In conclusion, the optimal stability exponent $a$ and the optimal instability exponent $a'$ satisfy
\[
\frac{1}{2n} \le a < a' \le \frac{1}{2(n-2)},
\]
and in the Gevrey case
\[
\frac{1}{2\alpha n} \le a < a' \le \frac{1}{2\alpha(n-2)}.
\]
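As a quick numerical illustration of how much room these bounds leave in the analytic case, the inequalities can be evaluated for small $n$. The values below follow directly from the two bounds above; nothing else is assumed.

```latex
% Width of the window left open between the two bounds (analytic case).
\frac{1}{2(n-2)} - \frac{1}{2n} = \frac{1}{n(n-2)}.
% n = 3:
\frac16 \le a < a' \le \frac12, \qquad \text{window } \frac13 .
% n = 4:
\frac18 \le a < a' \le \frac14, \qquad \text{window } \frac18 .
```

The window shrinks like $n^{-2}$ as $n$ grows, but it is never empty.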

1.5.3.2. In the a priori unstable case, Nekhoroshev's theorem does not apply (directly), so no lower bound on the instability time $\tau(\mu)$ is known in this context. Nevertheless, Lochak conjectured that the instability time should be polynomial in $\mu$ (because the splitting is polynomial in $\mu$), in spite of the first, super-exponential estimates obtained in [CG94].

The first realistic estimate was obtained by Marco ([Mar96]), using Easton's method of windows ([Eas78], [Eas81]). By adapting Bessi's example, Bernard obtained an upper bound of order $\mu^{-2}$ ([Ber96], see also [Cre01] for the same estimate by geometric methods).

In [Loc99], Lochak conjectured that the optimal time should be $\mu^{-1} \ln \mu^{-1}$, and this was fully proved by Berti, Bolle and Biasco ([BBB03]) using variational methods introduced by Bessi. Geometric mechanisms with this instability time were also proposed by Cresson and Guillet ([CG03]) and by Treschev ([Tre04]).

1.6 In the neighbourhood of a linearly stable invariant torus

To conclude, we briefly mention a slightly different setting, which may be described as the perturbation of a linear integrable Hamiltonian system: the aim is to study the dynamics in the neighbourhood of a linearly stable, isotropic and reducible quasi-periodic invariant torus. We restrict ourselves to the two extreme special cases, namely that of a Lagrangian quasi-periodic invariant torus (a torus of dimension $n$) and that of an elliptic fixed point (a torus of dimension $0$), where the isotropy and reducibility assumptions are automatically satisfied.

1.6.1 In the neighbourhood of a quasi-periodic Lagrangian torus

1.6.1.1. Let us begin with the case of a Lagrangian quasi-periodic invariant torus. Let $(M, \Omega)$ be an arbitrary symplectic manifold of dimension $2n$ and $H : M \to \mathbb{R}$ a Hamiltonian possessing a Lagrangian invariant torus on which the dynamics is quasi-periodic with frequency $\omega \in \mathbb{R}^n$. There thus exists an embedding $\Phi : \mathbb{T}^n \to M$ whose image $T$ is invariant under the vector field $X_H$, and which moreover conjugates the restriction of $X_H$ to $T$ to a linear flow of frequency $\omega$ on $\mathbb{T}^n$. Since the torus is assumed to be Lagrangian, hence isotropic, a classical theorem of Weinstein allows one to identify (symplectically) a neighbourhood of the torus with $\mathbb{T}^n \times \mathbb{R}^n$ and to place the torus at $\mathbb{T}^n \times \{0\}$. Expanding the Hamiltonian in these coordinates $(\theta, I) \in \mathbb{T}^n \times \mathbb{R}^n$, the invariance of the torus and of the frequency gives
\[
H(\theta, I) = \omega \cdot I + F(\theta, I),
\]
where $F(\theta, I) = O(|I|^2)$. Consider the principal part
\[
h(I) = \omega \cdot I.
\]
It is an integrable system in action-angle coordinates, whose frequency is now constant and equal to $\omega$. The full system can be viewed as a perturbation of this linear integrable system, the size of the perturbation being essentially the distance to the invariant torus.

1.6.1.2. If $\omega$ is assumed to be non-resonant, classical perturbation theory applies: for every integer $m \in \mathbb{N}^*$, there exists a symplectic diffeomorphism $\Phi_m$ of $\mathbb{T}^n \times \mathbb{R}^n$ that fixes $\mathbb{T}^n \times \{0\}$ and such that
\[
H \circ \Phi_m(\theta, I) = H_m(I) + O(|I|^{m+1}),
\]
where $H_m$ is a polynomial of degree $m$. These polynomials $H_m$, $m \in \mathbb{N}^*$, are the Birkhoff invariants. In fact, one can construct a formal series $H_\infty(I)$ and a formal symplectic diffeomorphism $\Phi_\infty$ of $\mathbb{T}^n \times \mathbb{R}^n$ fixing the torus $\mathbb{T}^n \times \{0\}$ and conjugating $H$ to $H_\infty$. However, the transformation $\Phi_\infty$ and the normal form $H_\infty$ diverge in general (these are results of Siegel, see [AKN06], and Pérez-Marco, [PM03]). There are nevertheless two ways to obtain information on the dynamics in the neighbourhood of this torus using this normal form.

1.6.1.3. One can assume that the system is analytic (or Gevrey) and that the frequency $\omega$ is Diophantine. In that case, if $\rho$ denotes the distance to the invariant torus, choosing the number of normalizations $m = m(\rho)$ suitably (tending to infinity as $\rho$ tends to $0$, as in Nekhoroshev theory) yields a normal form with a remainder that is exponentially small in $\rho$. One easily deduces a result of stability over an exponentially long time: every solution starting at distance at most $\rho$ from the invariant torus remains at a distance of order $\rho$ for a time $T(\rho)$ of order $\exp(\rho^{-1})$ (see [GG85], [BGG85] and [Fas90]).
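The choice of $m(\rho)$ can be illustrated by a standard optimal-truncation computation. The bound $(C m \rho)^m$ used below for the remainder after $m$ normalizations is a simplifying assumption made here for illustration only ($C > 0$ is a hypothetical constant, and the actual growth depends on the Diophantine exponent of $\omega$); it is not taken from the references.

```latex
% Assumed size of the remainder after m normalizations at distance \rho:
R_m(\rho) \le (C m \rho)^m, \qquad \log R_m(\rho) = m \log (C m \rho).
% Minimizing over m:
\frac{d}{dm}\bigl[m \log(C m \rho)\bigr] = \log(C m \rho) + 1 = 0
\quad\Longrightarrow\quad m(\rho) = \frac{1}{e C \rho},
% which gives an exponentially small remainder:
R_{m(\rho)}(\rho) \le \exp\Bigl(-\frac{1}{e C \rho}\Bigr).
```

Under this assumption $m(\rho)$ tends to infinity as $\rho \to 0$, and heuristically a remainder of size $\exp(-c/\rho)$ cannot move the actions appreciably before a time of order $\exp(c/\rho)$.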

1.6.1.4. Another possibility is to perform a finite number $m \ge 2$ of normalizations and to consider
\[
H \circ \Phi_m(\theta, I) = H_m(I) + O(|I|^{m+1}).
\]
One is thus reduced to a perturbation of a nonlinear integrable system, the integrable part being given by the polynomial $H_m$. Assuming that $H_m$ is non-degenerate or steep, KAM theory or Nekhoroshev theory applies. The former gives the existence of many invariant tori in a neighbourhood, while the latter gives stability over an exponentially long time.

1.6.1.5. It is a rather remarkable fact, observed by Morbidelli and Giorgilli ([MG95]), that the "linear" theory can be combined with the "nonlinear" theories, in the sense that one can first construct a Birkhoff normal form with an exponentially small remainder and then apply KAM theory or Nekhoroshev theory to obtain more precise information. One then shows (rather easily) that in a small neighbourhood of the invariant torus, the measure of the complement of the invariant tori is exponentially small (one speaks of "exponential condensation") and that stability holds over a super-exponentially long time (the tori are said to be "super-exponentially sticky").

1.6.2 In the neighbourhood of an elliptic fixed point

1.6.2.1. Consider now the case of an elliptic fixed point. The problem being local, one can work on $\mathbb{R}^{2n}$ endowed with its standard symplectic structure and place the fixed point at the origin. Moreover, since the Hamiltonian is defined up to an additive constant, one may assume that $H(0) = 0$, and since a fixed point of the Hamiltonian flow is nothing but a critical point of the Hamiltonian, one has $dH(0) = 0$. A Taylor expansion of the Hamiltonian at $0$ gives
\[
H(z) = H_2(z) + O(|z|^3)
\]

where $z$ is close to $0 \in \mathbb{R}^{2n}$ and $H_2$ is the quadratic part of $H$ at $0$. We assume that the fixed point is elliptic, in other words linearly (or rather spectrally) stable, which amounts to requiring that the spectrum of the matrix $J_0 H_2$ be purely imaginary ($J_0$ being the canonical complex structure of $\mathbb{R}^{2n}$). If this spectrum is denoted $\{\pm i\alpha_1, \dots, \pm i\alpha_n\}$ and is assumed to be simple (that is, the $\alpha_i$ are all distinct), a linear symplectic change of variables reduces the Hamiltonian to
\[
H(z) = \sum_{i=1}^{n} \frac{\alpha_i}{2}\left(z_i^2 + z_{n+i}^2\right) + O(|z|^3) = \alpha \cdot \tilde I + O(|z|^3),
\]
where we have set
\[
\tilde I = \tilde I(z) = \frac{1}{2}\left(z_1^2 + z_{n+1}^2, \dots, z_n^2 + z_{2n}^2\right) \in \mathbb{R}^n_+,
\]
which we shall call "formal actions". The vector $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{R}^n$ is called the characteristic frequency (or normal frequency). Consider the principal term of $H$, given by
\[
h(z) = h(\tilde I) = \alpha \cdot \tilde I;
\]
it can also be described as a linear integrable system, in the sense that its solutions are all quasi-periodic with the same frequency.

1.6.2.2. Note, however, that the dimension of the invariant tori is not constant. Indeed, along each solution $z(t)$ the formal actions are constant, so given an initial condition $z_0 \in \mathbb{R}^{2n}$ and setting $\tilde I^0 = \tilde I(z_0)$, the sets
\[
T_{\tilde I^0} = \{ z \in \mathbb{R}^{2n} \mid \tilde I(z) = \tilde I^0 \}
\]
are invariant tori on which the dynamics is quasi-periodic. As for integrable systems in action-angle coordinates, phase space decomposes into invariant tori
\[
\mathbb{R}^{2n} = \bigcup_{\tilde I^0 \in \mathbb{R}^n_+} T_{\tilde I^0},
\]
but the dimension of the tori is no longer fixed. Writing $\tilde I^0_i$ for the $i$-th component of $\tilde I^0$, for $1 \le i \le n$, one has
\[
T_{\tilde I^0} = \{ z \in \mathbb{R}^{2n} \mid z_i^2 + z_{n+i}^2 = 2\tilde I^0_i, \ 1 \le i \le n \},
\]
so that
\[
\dim T_{\tilde I^0} = \operatorname{card}\{ \tilde I^0_i \in \mathbb{R}_+ \mid \tilde I^0_i > 0 \}
\]
ranges from $0$ (for the fixed point) to $n$ (Lagrangian torus). The tori of intermediate dimension are called elliptic or linearly stable.

Consider the dense open set
\[
O = \{ z \in \mathbb{R}^{2n} \mid \tilde I_i(z) > 0, \ 1 \le i \le n \}.
\]
One can use the symplectic polar coordinates $(\theta, I) = (\theta, \tilde I)$, where $\theta \in \mathbb{T}^n$ is defined by
\[
z_i = \sqrt{2 I_i} \cos\theta_i, \qquad z_{i+n} = \sqrt{2 I_i} \sin\theta_i,
\]
for $1 \le i \le n$. This symplectic change of variables is well defined and analytic on $O$, and one is reduced to an integrable system in action-angle coordinates as in the previous section (foliation by Lagrangian tori). On this open set, the formal actions are "true" actions. On the complement of $O$, however, the preceding change of variables is singular, which explains why the dimension of the tori varies.
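As a sanity check on the polar coordinates $z_i = \sqrt{2I_i}\cos\theta_i$, $z_{i+n} = \sqrt{2I_i}\sin\theta_i$, here is a short verification for a single pair of variables. The shorthand $(x, y) = (z_i, z_{n+i})$ and the convention that $\theta$ is taken modulo $2\pi$ are assumptions introduced for this sketch.

```latex
% One pair (z_i, z_{n+i}) = (x, y), with x = \sqrt{2I}\cos\theta, \; y = \sqrt{2I}\sin\theta.
\tfrac12 (x^2 + y^2) = I (\cos^2\theta + \sin^2\theta) = I,
% so the formal action \tilde I_i coincides with I_i. For the differentials,
dx = \frac{\cos\theta}{\sqrt{2I}}\, dI - \sqrt{2I}\,\sin\theta\, d\theta, \qquad
dy = \frac{\sin\theta}{\sqrt{2I}}\, dI + \sqrt{2I}\,\cos\theta\, d\theta,
% hence
dx \wedge dy = (\cos^2\theta + \sin^2\theta)\, dI \wedge d\theta = dI \wedge d\theta .
```

The area form is preserved, so the change of variables is symplectic up to the orientation convention chosen for the standard form. The factor $1/\sqrt{2I}$ in $dx$ and $dy$ blows up as $I \to 0$, which is the singularity on the complement of $O$.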

Let us now return to our full system
\[
H(z) = \sum_{i=1}^{n} \frac{\alpha_i}{2}\left(z_i^2 + z_{n+i}^2\right) + O(|z|^3) = \alpha \cdot \tilde I + O(|z|^3),
\]
which is viewed as a perturbation of the integrable system, the size of the perturbation being given by the distance to the fixed point.

1.6.2.3. Birkhoff theory works in exactly the same way, even though one has Cartesian variables $z \in \mathbb{R}^{2n}$ rather than action-angle variables $(\theta, I) \in \mathbb{T}^n \times \mathbb{R}^n$; it is in fact in this context that Birkhoff normal forms first appeared. There is even an additional advantage: to construct a Birkhoff normal form
\[
H \circ \Phi_m(z) = H_m(\tilde I) + O(|\tilde I|^{m+1}),
\]
it suffices to require a non-resonance condition up to order $2m$ on $\alpha$, that is,
\[
k \cdot \alpha \neq 0, \qquad k \in \mathbb{Z}^n, \quad 0 < |k| \le 2m.
\]
With a non-resonance condition to all orders, one can construct the polynomials $h_m$ for every $m \in \mathbb{N}^*$ as well as the (generally divergent) formal series $h_\infty$. With a Diophantine condition on the frequency, one constructs a normal form with an exponentially small remainder and deduces stability over an exponentially long time (see [GDF+89]).
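The non-resonance condition up to a finite order can be illustrated on a few frequency vectors with $n = 2$. The specific vectors below are examples chosen here, and $|k|$ is assumed to denote $|k_1| + |k_2|$.

```latex
% n = 2, k = (k_1, k_2) \in \mathbb{Z}^2, \; |k| = |k_1| + |k_2|.
% \alpha = (1, 2): k\cdot\alpha = k_1 + 2k_2 = 0 \iff k = j(2, -1), \; |k| = 3|j| \ge 3.
\alpha = (1, 2): \quad \text{non-resonant up to order } 2, \text{ resonant at order } 3 \ (k = (2, -1)).
% \alpha = (1, 4): k\cdot\alpha = 0 \iff k = j(4, -1), \; |k| = 5|j| \ge 5.
\alpha = (1, 4): \quad \text{non-resonant up to order } 4, \text{ resonant at order } 5 \ (k = (4, -1)).
% \alpha = (1, \sqrt 2): k_1 + \sqrt 2\, k_2 = 0 \text{ forces } k = 0.
\alpha = (1, \sqrt 2): \quad \text{non-resonant to all orders}.
```

With the condition as stated, $\alpha = (1, 4)$ is enough for $m = 2$ (order $2m = 4$), whereas $\alpha = (1, 2)$ only covers $m = 1$.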

1.6.2.4. As far as KAM theory is concerned, one can use the action-angle coordinates on the open set $O$ to find Lagrangian tori arbitrarily close to the elliptic fixed point. Moreover, the arguments on the measure of the preserved tori carry over (see [Pös82]), and Birkhoff theory and KAM theory can also be combined to obtain improved results ([DG96b]).

For $n = 2$, under the iso-energetic non-degeneracy condition, this implies the stability of the fixed point in the sense of Lyapunov ([Arn63b], see also [Arn61] for other results), but for $n \ge 3$, Arnold's instability mechanism provides examples of topologically unstable elliptic fixed points ([DLC83], [Dou88], see also [KMV04] and [KZZ09]).

Recall also that in the complement of the open set $O$, the integrable system possesses linearly stable invariant tori whose dimension is not maximal. Indeed, for $1 \le m < n$, if one takes an initial condition $z_0 \in \mathbb{R}^{2n}$ such that
\[
\tilde I^0_1, \dots, \tilde I^0_m > 0, \qquad \tilde I^0_{m+1} = \dots = \tilde I^0_n = 0,
\]
then it evolves on the torus $T_{\tilde I^0}$ of dimension $m$. One can then introduce action-angle variables $(\theta_1, \dots, \theta_m, I_1, \dots, I_m)$ for the first $m$ components, and write the principal (integrable) part of the Hamiltonian in the form
\[
H(\theta, I, z) = \sum_{i=1}^{m} \alpha_i I_i + \sum_{i=m+1}^{n} \alpha_i \tilde I_i, \qquad (\theta, I, z) \in \mathbb{T}^m \times \mathbb{R}^m \times \mathbb{R}^{2(n-m)}.
\]
In general, if a Hamiltonian system on an arbitrary symplectic manifold possesses a linearly stable (isotropic, reducible) quasi-periodic invariant torus, then its principal part can be written in this form. One may also ask whether these invariant tori persist.

Let us begin with the special case $m = 1$, where one has a family of periodic orbits contained in an invariant disc containing the elliptic fixed point. If
\[
\frac{\alpha_i}{\alpha_1} \notin \mathbb{Z}, \qquad i = 2, \dots, n,
\]
then a theorem of Lyapunov ensures that the family of periodic orbits persists (see [SM71]). In this special case there are no small divisors, but they appear for $m > 1$.

For $m = n - 1$, the persistence result is due to Moser ([Mos67]), and in the more difficult case where $m$ is arbitrary, under suitable Diophantine assumptions, the result is due to Eliasson ([Eli88], see also [Pös89]). Let us add that all these results had previously been announced by Melnikov ([Mel65]).

1.6.2.5. As for Nekhoroshev theory, it is much harder to apply in the neighbourhood of elliptic points, because of the singularity of the action-angle variables. The corresponding stability result was only conjectured by Nekhoroshev, then partially proved by Lochak ([Loc92], [Loc95]) in the quasi-convex case. A complete proof, still in the quasi-convex case, is due independently to Niederman ([Nie98]) and to Fasso, Guzzo and Benettin ([FGB98], [GFB98]). Both proofs use Cartesian coordinates; the first is an adaptation of Lochak's method to this context (see also [Pös99b]), while the second uses Nekhoroshev's classical arguments.



2 Results and questions

In this section, we present the results of this thesis in a rather informal way (with references to the various chapters for precise statements), and we also discuss some questions arising from this work.

Most of the statements concern near-integrable Hamiltonian systems of the form
\[
\begin{cases}
H(\theta, I) = h(I) + f(\theta, I), \\
|f|_D < \varepsilon
\end{cases}
\]



Question 1. Give a minimal generic condition ensuring exponential stability.

The minimal generic condition should be the following: the restriction of $h$ to any affine subspace whose direction is spanned by vectors with integer coefficients has only isolated critical points. This condition is necessary, by the work of Nekhoroshev and Niederman ([Nek79], [Nie06]); it therefore remains to show that it is sufficient, which seems possible to me, although some "uniformity" issues still arise.

Besides being simpler, the proof proposed in the third chapter has many advantages, such as carrying over easily to other settings and yielding more general results. In the fourth chapter, we study stability results in the neighbourhood of elliptic fixed points and, more generally, of linearly stable invariant tori (with the usual assumptions).

Consider an elliptic fixed point in $\mathbb{R}^{2n}$, with a normal frequency $\alpha \in \mathbb{R}^n$ that is non-resonant up to order 4, and write its Birkhoff normal form:
\[
H(z) = \alpha \cdot \tilde I + \beta \tilde I \cdot \tilde I + f(z),
\]
where $z \in \mathbb{R}^{2n}$ and $\beta \in S_n(\mathbb{R})$ is a symmetric matrix. One then has the following result (see Theorem 4.2).

Result 3. For almost every symmetric matrix $\beta \in S_n(\mathbb{R})$, the fixed point is exponentially stable.

Note that this result is more general than the previous ones ([Nie98], [FGB98], [GFB98], [Pös99b]), which are valid only if $\beta$ is positive definite.

Moreover, using a technique of Morbidelli and Giorgilli, valid in the quasi-convex case and for Lagrangian tori, one obtains generic super-exponential stability results.

Assume that $\alpha$ is non-resonant and consider the formal Birkhoff series $h_\infty$; one then obtains the following result (see Theorem 4.1).

Result 4. For a prevalent set of formal series $h_\infty$, the fixed point is super-exponentially stable.

Consider now a quasi-periodic Lagrangian invariant torus with Diophantine frequency $\omega \in \mathbb{R}^n$, and let $h_\infty$ be the formal Birkhoff series. Similarly, one has the following result (see Theorem 4.4).

Result 5. For a prevalent set of formal series $h_\infty$, the Lagrangian torus is super-exponentially stable.

Finally, one can also state a theorem generalizing the two previous ones. Consider a linearly stable (normally elliptic), isotropic and reducible invariant torus. Let $\omega \in \mathbb{R}^k$ be its "internal" frequency and $\alpha \in \mathbb{R}^l$ its normal frequency, and assume that the vector $(\omega, \alpha) \in \mathbb{R}^{k+l}$ is Diophantine. In this context, one can also prove the existence of Birkhoff normal forms and define the formal series $h_\infty$. One then obtains the following result (see Theorem 4.5).

Result 6. For a prevalent set of formal series $h_\infty$, the linearly stable torus is super-exponentially stable.

Note that in the previous results, the condition is formulated in the space of Birkhoff normal forms (formal series). One may then ask the following question.

Question 2. Formulate a generic condition in the space of Hamiltonians that guarantees super-exponential stability.

Owing to the uniqueness of the Birkhoff invariants once the quadratic part is fixed, the map that associates with a germ of Hamiltonian the formal series of its Birkhoff invariants is well defined. One should now study in what sense this map can be "regular".

2.1.2 Quasi-convex case

Here we are interested in the case where the integrable part $h$ is quasi-convex. If the system is analytic or Gevrey, exponential stability results hold, but they are no longer valid if the system is only of class $C^k$, $k \in \mathbb{N}$ (the situation should be the same in class $C^\infty$, although this has not been proved).

On the other hand, for systems of finite regularity, we prove a polynomial stability result (Theorem 5.1) in the fifth chapter.

Result 7. Assume that the system is of class $C^k$, for $k \ge 3$. If $h$ is quasi-convex and $\varepsilon$ is sufficiently small, then the system $H = h + f$ is polynomially stable with exponents
\[
a = \frac{k-2}{2n}, \qquad b = \frac{1}{2n}.
\]
In other words, there exist positive constants $c_1$, $c_2$ depending only on $h$ such that, for $\varepsilon$ sufficiently small and for every initial action $I_0$,
\[
|I(t) - I_0| \le c_1 \varepsilon^{\frac{1}{2n}}, \qquad |t| \le c_2 \varepsilon^{-\frac{k-2}{2n}}.
\]
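To get a feeling for the exponents $a = (k-2)/(2n)$ and $b = 1/(2n)$ of Result 7, here is a worked instance. The values $n = 3$ and $k = 5, 8$ are chosen here purely for illustration.

```latex
% Exponents of Result 7: a = (k-2)/(2n), \; b = 1/(2n).
% Example: n = 3 degrees of freedom, regularity k = 5.
a = \frac{5 - 2}{2 \cdot 3} = \frac12, \qquad b = \frac{1}{2\cdot 3} = \frac16,
\qquad\text{so}\qquad
|I(t) - I_0| \le c_1\, \varepsilon^{1/6} \quad\text{for}\quad |t| \le c_2\, \varepsilon^{-1/2}.
% Raising the regularity to k = 8 keeps b = 1/6 and improves a to 1:
|I(t) - I_0| \le c_1\, \varepsilon^{1/6} \quad\text{for}\quad |t| \le c_2\, \varepsilon^{-1}.
```

The confinement radius depends only on $n$, while the stability time grows with the regularity $k$.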

Recall that in the quasi-convex case, improved results hold in the neighbourhood of resonances. This is also true in finite differentiability (see Theorem 5.2).

Result 8. For $m \in \{0, \dots, n-1\}$, if the initial action is close to a resonance of multiplicity $m$, one can choose the exponents
\[
a = \frac{k-2}{2(n-m)}, \qquad b = \frac{1}{2(n-m)}.
\]

As in Gevrey regularity, the previous results concern only quasi-convex integrable Hamiltonians.

Question 3. Prove an exponential (resp. polynomial) stability result for a generic integrable Hamiltonian of Gevrey class (resp. of class $C^k$).

There is no doubt that such results hold; however, certain technical difficulties arise when compositions of averagings (periodic or not) are used in non-analytic regularity. The Gevrey case seems easier.



2.2 From stability to instability

We now turn to the transition between stability and instability, that is, we look for a maximal stability time that would coincide with the minimal instability time. We restrict ourselves to the case where the integrable Hamiltonian $h$ is quasi-convex.

2.2.1 Analytic case

We assume here that the system is analytic. Exponential stability then holds with the exponents
\[ a = \frac{1}{2n}, \qquad b = \frac{1}{2n}. \]
In the sixth chapter, we will improve the value of these exponents (see Theorem 6.1).

Result 9. For
\[ 0 < \delta \leq \frac{1}{2n(n-1)}, \]
exponential stability holds with the exponents
\[ a = \frac{1}{2(n-1)} - \delta, \qquad b = \delta(n-1). \]

For $\delta = (2n(n-1))^{-1}$, we recover the previous result, and as $\delta$ decreases to zero while remaining positive, the stability exponent $a$ becomes arbitrarily close to $(2(n-1))^{-1}$. In particular, this contradicts a conjecture of Chirikov ([Chi79]) and Lochak ([Loc92]), and shows that instability in the sense of Arnold cannot occur far from resonances.
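The endpoint computation behind the first claim can be written out explicitly. This is only a verification using the exponents $a = (2(n-1))^{-1} - \delta$ and $b = \delta(n-1)$ of Result 9:

```latex
\[
\delta = \frac{1}{2n(n-1)}
\;\Longrightarrow\;
a = \frac{1}{2(n-1)} - \frac{1}{2n(n-1)} = \frac{n-1}{2n(n-1)} = \frac{1}{2n},
\qquad
b = \delta(n-1) = \frac{1}{2n}.
\]
```

At the other end, letting $\delta \to 0^+$ gives $a \to (2(n-1))^{-1}$ while $b \to 0$, so the gain in the stability time is paid for by a weaker confinement.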

Recall that the known examples of instability give $a < (2(n-2))^{-1}$; a finer example should certainly give $a < (2(n-1))^{-1}$, and thus our stability result should be optimal.

Question 4. Construct an example of an unstable solution whose time of instability is given by the exponent $(2(n-1))^{-1}$.

2.2.2 Gevrey case

We now assume that the system is of class $\alpha$-Gevrey, for $\alpha \geq 1$. Generalized stability then holds with the exponents
\[ a = \frac{1}{2\alpha n}, \qquad b = \frac{1}{2n}, \]
which can also be improved (Theorem 6.2).

Result 10. For
\[ 0 < \delta \leq \frac{1}{2\alpha n(n-1)}, \]
exponential stability holds with the exponents
\[ a = \frac{1}{2\alpha(n-1)} - \delta, \qquad b = \frac{2\delta}{5(n-1)}. \]

The same remarks apply as far as the stability exponent $a$ is concerned; however, the radius of confinement obtained (exponent $b$) is not as good. One can also ask the following question, which should be much easier in the non-analytic case ($\alpha > 1$).

Question 5. Construct an example of an unstable solution whose time of instability is given by the exponent $(2\alpha(n-1))^{-1}$.

2.3 Instability results

Finally, we end with the instability results. Our goal here is to construct examples, in various contexts, where one can show the existence of a drifting orbit and compute the time of instability. To simplify the constructions, we restrict ourselves to non-analytic perturbations of Gevrey class.

2.3.1 A priori unstable case

The first context is the well-understood one of a priori unstable systems. In the seventh chapter, we will construct an orbit which drifts in an optimal time (Theorem 7.1).

Result 11. For $\alpha > 1$, there exists an $\alpha$-Gevrey perturbation of size $\mu$ of an a priori unstable system whose drifting time is
\[ \tau(\mu) \sim \mu^{-1} \ln \mu^{-1}. \]

The construction of this example uses dynamical tools, in particular a generalization of the Birkhoff-Smale theorem on the chaotic dynamics generated by the transverse intersection of the stable and unstable manifolds of a normally hyperbolic manifold.

With different methods, similar estimates have been given by Berti, Bolle and Biasco ([BBB03]), Treschev ([Tre04]) and Cresson and Guillet ([CG03]). Moreover, in [BBB03], the authors also prove a stability result which implies the optimality of this time of instability in the analytic class. Recall that our example is not analytic, and hence this stability result does not apply in our setting.

Question 6. Show that the time of instability $\mu^{-1} \ln \mu^{-1}$ is optimal in Gevrey regularity for perturbations of size $\mu$ of a priori unstable systems with three degrees of freedom.

One can also ask the following question.



Question 7. For $k$ sufficiently large, show that the time of instability $\mu^{-1} \ln \mu^{-1}$ is optimal in $C^k$ regularity for perturbations of size $\mu$ of a priori unstable systems with three degrees of freedom.

These results are surely true; nevertheless, proving them requires Nekhoroshev estimates for generic Hamiltonians of class Gevrey or $C^k$ (once again, the Gevrey case is easier).

2.3.2 Non-perturbative case

The second context we consider is that of $\varepsilon$-perturbations of integrable Hamiltonian systems, with $\varepsilon$ arbitrarily small only if the number of degrees of freedom $n$ is arbitrarily large. This type of example was introduced by Bourgain and Kaloshin ([BK05]) to test the validity of Nekhoroshev’s mechanism in infinite dimension: for fixed $n$, the perturbation is not arbitrarily small and so Nekhoroshev theory does not apply, but the perturbative character is restored when "$n$ goes to infinity" (hence the term "high-dimensional Hamiltonian systems" introduced in [BK05]). In that paper, the authors show the following result: if $\varepsilon \sim e^{-n}$, then there exists an orbit which drifts linearly, that is $\tau(\varepsilon) \sim \varepsilon^{-1}$.

In the eighth chapter, exploiting the mechanism introduced in [MS02], we will prove more simply a variant of the previous result (see Theorem 8.1).

Result 12. There exists a near-integrable Hamiltonian system, with
\[ \varepsilon \sim e^{-n \ln(n \ln n)}, \]
which possesses an orbit whose drifting time is polynomial, more precisely
\[ \tau(\varepsilon) \sim \varepsilon^{-n}. \]

The time of instability obtained is not as good as in [BK05], because the size of the perturbation is smaller (our example is "more perturbative"). In particular, if $\varepsilon_0$ denotes the threshold of applicability of Nekhoroshev’s theorem, we deduce that

ε0


Part II<br />

Results of stability<br />

Summary<br />

3 Generic exponential stability without small divisors
3.1 Introduction
3.2 Statement of results
3.2.1 Set-up and results
3.2.2 Comments and prospects
3.3 Proof of Theorem 3.6
3.3.1 Analytical part
3.3.2 Geometric part
3.A Proof of the normal form
3.A.1 Preliminary estimates
3.A.2 Proof of Proposition 3.1
3.B SDM functions
3.B.1 Steepness
3.B.2 Prevalence

4 Generic super-exponential stability for invariant tori
4.1 Introduction and main results
4.2 Proof of Theorem 4.1 and Theorem 4.2
4.2.1 Birkhoff’s estimates
4.2.2 Nekhoroshev’s estimates and proof of Theorem 4.2
4.2.3 Proof of Theorem 4.1
4.3 Further results and comments
4.A Generic assumptions

5 Polynomial stability for $C^k$ quasi-convex Hamiltonian systems
5.1 Introduction
5.2 Main result
5.3 Analytical part
5.3.1 Elementary estimates
5.3.2 The linear case
5.3.3 Normal form
5.4 Proof of Theorem 5.2



3 Generic exponential stability without small divisors<br />

In this chapter, we present a new approach to Nekhoroshev’s theory for a generic unperturbed Hamiltonian which completely avoids small divisor problems. The proof is an extension of a method introduced by P. Lochak: it combines averaging along periodic orbits with simultaneous Diophantine approximation and uses geometric arguments to handle generic integrable Hamiltonians. This method allows one to deal with generic non-analytic Hamiltonians and to obtain new results of generic stability around linearly stable tori. This chapter reproduces the content of [BN09].

3.1 Introduction<br />

3.1.0.1. In this chapter, we are concerned with the stability properties of near-integrable analytic Hamiltonian systems. According to a classical theorem of Liouville-Arnold (see [AKN06]), such systems are locally governed by a Hamiltonian of the form
\[ \begin{cases} H(\theta, I) = h(I) + f(\theta, I) \\ |f| < \varepsilon \end{cases} \]



to focus on non-resonant tori to prove that a set of large measure of invariant tori<br />

survives under some regularity and non-degeneracy assumptions. This has now become<br />

a rich and vast subject called KAM theory (see [Pös01], [dlL01] or [Bos86] for some nice introductions to this theory). Such tori persist in a $\sqrt{\varepsilon}$-neighbourhood of the unperturbed ones and therefore, for a set of large measure of initial conditions, the variation of the actions is of order $\sqrt{\varepsilon}$ for all time. But on the other hand, this set of

KAM tori is typically a Cantor family (hence with no interior) and the theory gives<br />

no information on the complement, except when n = 2 where these two-dimensional<br />

invariant tori disconnect the three-dimensional energy level leaving all solutions stable<br />

for all time. However for n ≥ 3, it is still possible to find solutions for which the<br />

variation of the action components is of order one. This was proved by Arnold in<br />

his famous paper ([Arn64]) where he proposed a mechanism to produce examples of<br />

near-integrable Hamiltonian systems where such a drift occurs no matter how small the<br />

perturbation is. This phenomenon is usually referred to as Arnold diffusion.

3.1.0.4. Hence for n ≥ 3, results of stability for near-integrable Hamiltonian systems<br />

which are valid for an open set of initial conditions can only be proved over finite times.<br />

This picture was completed by Nekhoroshev in the seventies (see [Nek77],[Nek79] and<br />

[Nie09] for a recent overview of the theory) who proved the following: if the system is<br />

analytic and the unperturbed Hamiltonian h satisfies some quantitative transversality<br />

condition called steepness, then there exist positive constants $a$, $b$, $\varepsilon_0$, $c_1$, $c_2$ and $c_3$ depending only on $h$, such that every solution $(\theta(t), I(t))$ of the perturbed system starting at time $t = 0$ satisfies
\[ |I(t) - I(0)| \leq c_1 \varepsilon^b, \qquad |t| \leq c_2 \exp(c_3 \varepsilon^{-a}), \tag{6} \]
provided that the size of the perturbation $\varepsilon$ is smaller than the threshold $\varepsilon_0$. The constants $a$ and $b$ are called the stability exponents. If property (6) is satisfied, we shall say that the integrable Hamiltonian $h$ is exponentially stable. Hence KAM theory and Nekhoroshev’s theory yield different types of stability results, but both ultimately rely on the same tool, the construction of normal forms, which we describe below.

3.1.0.5. The basic idea is to look at a "more integrable" Hamiltonian which yields a good approximation of the perturbed system. By the averaging principle (see [AKN06]), this simpler Hamiltonian is given by the time average of the system along the unperturbed flow, that is
\[ [H] = h + [f], \]
where
\[ [f] = \lim_{t \to \infty} \frac{1}{t} \int_0^t f \circ \Phi_s^h \, ds, \]
and $\Phi_s^h$ is the Hamiltonian flow of the integrable part $h$. Actually, this average depends on the dynamics of the unperturbed Hamiltonian and hence on the resonant modules associated to the frequencies. So given a submodule $M \subseteq \mathbb{Z}^n$, we define its resonant manifold by
\[ S_M = \{I \in \mathbb{R}^n \mid k \cdot \nabla h(I) = 0 \text{ for } k \in M\}. \]

Due to the ergodic properties of the linear flow with vector ∇h(I) over the torus T n ,<br />

the time average over SM equals the space average along a torus of dimension n − m if



m is the multiplicity of the resonance (i.e. the rank of M), hence n − m angles have<br />

been removed in this case. From a physical point of view, the guiding principle is that the rapidly oscillating terms discarded in averaging cause only small oscillations which are superimposed on the solutions of the averaged system. In order to prove this claim, one should check that any solution of the perturbed system remains close to the solution of the averaged system with the same initial condition. In particular, this will be the case if one finds a canonical transformation $\varepsilon$-close to the identity which conjugates the perturbed Hamiltonian to its average. Hence we are reduced to a problem of normal form, where one tries to conjugate the system to a simpler one; that is, we look for a convenient system of coordinates.

However, constructing such a good system of coordinates is not an easy task. The<br />

linearised equation of conjugation reads<br />

{χ, h} = f − [f],<br />

if $\chi$ is the function generating the conjugation. This is usually called a homological equation, and to solve it we need to invert the linear operator $L_h = \{\,.\,, h\}$ acting on a suitable space of functions. Here our operator is invertible, but its inverse is generally unbounded: this is the small divisors phenomenon. To see this, just note that once an action $I \in S_M$ is fixed (and hence a frequency $\omega = \nabla h(I)$ satisfying $k \cdot \omega \neq 0$ for $k \notin M$), the homological equation is just a first-order linear partial differential equation with constant coefficients on $\mathbb{T}^n$, namely
\[ \omega \cdot \nabla \chi = f - [f]. \]

Such equations are known to be well suited to Fourier analysis: in our case the operator $L_h$ is easily diagonalized in a Fourier basis, and we find that the eigenvalues are proportional to the scalar products $k \cdot \omega$, for $k \in \mathbb{Z}^n$. More precisely, expanding $\chi$ and $f$ as
\[ \chi(\theta) = \sum_{k \in \mathbb{Z}^n} \hat{\chi}_k e^{i2\pi k \cdot \theta}, \qquad f(\theta) = \sum_{k \in \mathbb{Z}^n} \hat{f}_k e^{i2\pi k \cdot \theta}, \]
then
\[ [f] = \sum_{k \in M} \hat{f}_k e^{i2\pi k \cdot \theta}, \]
and so formally
\[ \hat{\chi}_k = \begin{cases} (i2\pi k \cdot \omega)^{-1} \hat{f}_k, & k \notin M, \\ 0, & k \in M. \end{cases} \tag{7} \]
The scalar products $k \cdot \omega$ appearing in the denominators of (7) are not zero by assumption, but they can be arbitrarily small, and this is inevitable for large integers $k$. This can cause the divergence of the Fourier series of $\chi$ and hence the unboundedness of the inverse of $L_h$. Classical small divisors techniques are concerned with obtaining lower bounds for these scalar products to ensure the convergence of the series, and this necessarily leads to complicated estimates. Furthermore, to obtain a result applying to all solutions, a partition of the phase space into resonant manifolds associated to different modules, usually called the geometry of resonances, has to be achieved, and this is a delicate task. All these techniques are very important, in particular to study Arnold diffusion and related problems; however, we will show that they are not necessary to prove Nekhoroshev’s estimates.
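To make the small divisors phenomenon concrete, the following minimal sketch computes the smallest non-zero divisor $|k \cdot \omega|$ over integer vectors with $|k|_1 \leq K$. The frequency $\omega = (1, \sqrt{2})$ and the function name `smallest_divisor` are illustrative assumptions, not taken from the text; the point is only that the minimum keeps decreasing as $K$ grows.

```python
import math
from itertools import product

def smallest_divisor(omega, K):
    """Return min |k.omega| over integer vectors k with 0 < |k|_1 <= K
    and k.omega != 0 (brute force, intended for small n and K only)."""
    n = len(omega)
    best = math.inf
    for k in product(range(-K, K + 1), repeat=n):
        if 0 < sum(abs(c) for c in k) <= K:
            d = abs(sum(c * w for c, w in zip(k, omega)))
            if d > 0:
                best = min(best, d)
    return best

omega = (1.0, math.sqrt(2.0))  # illustrative non-resonant frequency vector
for K in (5, 20, 80):
    print(K, smallest_divisor(omega, K))
```

The coefficients $\hat{\chi}_k$ in (7) are obtained by dividing $\hat{f}_k$ by exactly these quantities, which is why a decay assumption on $\hat{f}_k$ and a lower bound on $|k \cdot \omega|$ have to be balanced against each other in the classical approach.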



3.1.0.6. Indeed, all these problems are completely bypassed if we only average along<br />

periodic orbits of the unperturbed flow. We first recall the following definition.<br />

Definition 3.1. A vector $\omega \in \mathbb{R}^n$ is said to be periodic if there exists a real number $t > 0$ such that $t\omega \in \mathbb{Z}^n$. In this case, the number
\[ T = \inf\{t > 0 \mid t\omega \in \mathbb{Z}^n\} \]
is called the period of $\omega$.
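For a non-zero vector with rational components, the period of Definition 3.1 can be computed exactly. The sketch below is only an illustration (the function name `period` is hypothetical): it clears denominators with their least common multiple $q$ and then divides by the greatest common divisor of the integer vector $q\omega$, since the set of $t$ with $t\omega \in \mathbb{Z}^n$ consists of the integer multiples of that quotient.

```python
from fractions import Fraction
from math import gcd, lcm  # multi-argument gcd and lcm require Python 3.9+

def period(omega):
    """Period T = inf{t > 0 : t*omega in Z^n} of a non-zero vector of Fractions."""
    q = lcm(*(w.denominator for w in omega))  # q*omega is an integer vector
    g = gcd(*(int(w * q) for w in omega))     # common factor of that vector
    if g == 0:
        raise ValueError("omega must be non-zero")
    return Fraction(q, g)

print(period((Fraction(1, 2), Fraction(1, 3))))  # 6
print(period((Fraction(2, 3), Fraction(4, 3))))  # 3/2
```

In the second example the least common multiple of the denominators is 3, but $\tfrac{3}{2}\omega = (1, 2)$ is already an integer vector, so the infimum in the definition is $3/2$.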

A basic example is given by a vector with rational components, the period of which is just the least common multiple $q$ of the denominators of its components, provided that the components of the integer vector $q\omega$ have no common factor (otherwise one divides $q$ by their greatest common divisor). Geometrically, if $\omega$ is $T$-periodic, an invariant torus with a linear flow with vector $\omega$ is filled with $T$-periodic orbits. In this case, the average along such a periodic solution is given by
\[ [f] = \lim_{t \to \infty} \frac{1}{t} \int_0^t f \circ \Phi_s^l \, ds = \frac{1}{T} \int_0^T f \circ \Phi_s^l \, ds, \]
where $l$ denotes the linear Hamiltonian with frequency $\omega$, that is $l(I) = \omega \cdot I$. Then the homological equation $\{\chi, l\} = f - [f]$ is easily solved without using Fourier expansions, and its solution is given by the explicit integral formula
\[ \chi = \frac{1}{T} \int_0^T (f - [f]) \circ \Phi_s^l \, s \, ds. \]

So in this case, there are no small divisors. To understand the previous sentence more concretely, consider a vector $\omega \in \mathbb{R}^n$ and multi-integers $k$ that do not resonate with $\omega$ (that is $k \notin \mathbb{Z}^n \cap \omega^\perp$); then in general we do not have a lower bound on the divisors $k \cdot \omega$ that appear in (7). In that context, small divisors techniques use Diophantine vectors, for which $|k \cdot \omega| \geq \gamma |k|_1^{-\tau}$, with $\gamma > 0$, $\tau \geq 0$ and where $|\,.\,|_1$ stands for the $\ell^1$-norm, but nevertheless the lower bound deteriorates as $|k|_1$ increases, causing extra difficulties (which are usually handled by the so-called ultra-violet cut-off). However, if the vector $\omega$ is $T$-periodic, one simply has $|k \cdot \omega| \geq T^{-1}$ and the lower bound is uniform in $|k|_1$.
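The integral formula for periodic averaging can be checked numerically on a simple example. The sketch below makes several assumptions that are not in the text: the functions depend on the angles only, the sign convention is the one for which the homological equation reads $\omega \cdot \nabla \chi = f - [f]$, and the test function, the vector $\omega = (1, 2)$ (of period $T = 1$) and all names are illustrative. For this $f$, the flow average is the resonant mode $\cos(2\pi(2\theta_1 - \theta_2))$, and a hand computation (integration by parts) gives $\chi(\theta) = \sin(2\pi\theta_1)/(2\pi)$, whose derivative along $\omega$ is indeed $\cos(2\pi\theta_1) = f - [f]$.

```python
import math

def flow_average(f, theta, omega, T, N=1000):
    """Midpoint-rule approximation of [f](theta) = (1/T) int_0^T f(theta + s*omega) ds."""
    h = T / N
    total = 0.0
    for j in range(N):
        s = (j + 0.5) * h
        total += f([t + s * w for t, w in zip(theta, omega)])
    return total * h / T

def chi(f, theta, omega, T, N=1000):
    """Midpoint-rule approximation of (1/T) int_0^T s * (f - [f])(theta + s*omega) ds."""
    avg = flow_average(f, theta, omega, T, N)  # [f] is constant along the flow
    h = T / N
    total = 0.0
    for j in range(N):
        s = (j + 0.5) * h
        total += s * (f([t + s * w for t, w in zip(theta, omega)]) - avg)
    return total * h / T

omega, T = (1.0, 2.0), 1.0  # periodic vector: T * omega = (1, 2) is an integer vector

def f(theta):
    # one non-resonant mode k = (1, 0) plus one resonant mode k = (2, -1), with k.omega = 0
    return math.cos(2 * math.pi * theta[0]) + math.cos(2 * math.pi * (2 * theta[0] - theta[1]))

theta = (0.1, 0.7)
exact = math.sin(2 * math.pi * theta[0]) / (2 * math.pi)  # hand-computed chi for this f
print(chi(f, theta, omega, T), exact)
```

No Fourier coefficient is ever divided by $k \cdot \omega$ here: the only quantity entering the estimate of $\chi$ is the period $T$.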

3.1.0.7. Lochak ([Loc92], see also [LN92] and [LNN94] for refinements) has shown that<br />

averaging along the periodic orbits of the integrable Hamiltonian is enough to obtain<br />

Nekhoroshev’s estimates of stability when the unperturbed Hamiltonian is strictly convex<br />

(or strictly quasi-convex, that is its energy sub-levels are strictly convex). Indeed,<br />

using convexity, Lochak obtains open sets around periodic orbits over which exponential<br />

stability holds. Then, Dirichlet’s theorem on simultaneous Diophantine approximation easily ensures that these open sets cover the whole action space and yields the global result, avoiding the difficult geometry of resonances. Put differently, in the convex case one only needs dynamical information near resonances of maximal multiplicity, which are completely characterized by periodic orbits.
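Dirichlet’s theorem guarantees that, for real numbers $\alpha_1, \dots, \alpha_n$ and an integer $N \geq 1$, there is an integer $1 \leq q \leq N^n$ such that each $q\alpha_i$ lies within $1/N$ of an integer $p_i$. The brute-force search below is only a sketch of that statement for small cases (the function name and the sample vector are assumptions chosen for illustration); the rational vector $p/q$ it returns is a periodic vector, of period at most $q$, close to $\alpha$.

```python
import math

def dirichlet_approximation(alpha, N):
    """Search for an integer 1 <= q <= N**n and integers p_i such that
    |q*alpha_i - p_i| <= 1/N for all i (brute force, small cases only)."""
    n = len(alpha)
    for q in range(1, N ** n + 1):
        p = [round(q * a) for a in alpha]
        if all(abs(q * a - pj) <= 1.0 / N for a, pj in zip(alpha, p)):
            return q, p
    return None  # not expected in exact arithmetic; kept as a guard against rounding

alpha = (math.sqrt(2.0), math.sqrt(3.0))  # illustrative vector
q, p = dirichlet_approximation(alpha, 10)
print(q, p, [abs(q * a - pj) for a, pj in zip(alpha, p)])
```

The search returns the first admissible $q$, which may be much smaller than the guaranteed bound $N^n$.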

The goal of this chapter is to extend Lochak’s approach to a generic set of integrable Hamiltonians. To do so, we will have to analyze the dynamics in a neighbourhood of suitable resonances of any multiplicity by using only successive averagings along periodic orbits together with Dirichlet’s theorem, and this will lead to exponential estimates of stability for perturbations of a generic integrable Hamiltonian, as stated below.



Theorem 3.2. Consider an arbitrary real analytic integrable Hamiltonian $h$ defined on a neighbourhood of a closed ball in $\mathbb{R}^n$. For almost any $\xi \in \mathbb{R}^n$, the integrable Hamiltonian $h_\xi(I) = h(I) - \xi \cdot I$ is exponentially stable with the exponents $a = b = 3^{-1}(2n)^{-3n}$.

This will be a direct consequence of Theorems 3.4 and 3.6, see below in section 3.2.1.<br />

This result is not new, see [Nie07b], but the novelty here is our method of proof, which<br />

avoids completely the fundamental problem of small divisors and hence all the associated<br />

technicalities (non-resonant domains, Fourier series, Fourier norm, ultra-violet cut-off<br />

and so on). The analytic part of our proof of Nekhoroshev’s estimates is therefore<br />

reduced to its bare minimum, it is nothing but a classical one-phase averaging, while<br />

our geometric part is based on a clever use of Dirichlet’s theorem along each solution.<br />

Applications of our method to other problems will be discussed below, in section 3.2.2.<br />

To conclude this introduction, we point out that the method of averaging along periodic orbits has also recently been used successfully to re-prove some KAM theorems without small divisors (see [KLDM06] and [KLDM07]), even though the techniques involved there are much more complicated.

3.2 Statement of results<br />

3.2.1 Set-up and results<br />

3.2.1.1. Let $B = B_R$ be the open ball centered at the origin of $\mathbb{R}^n$ of radius $R$ with respect to the supremum norm; the domain $D = \mathbb{T}^n \times B$ will be our phase space. To avoid trivial situations, we assume $n \geq 2$. Our Hamiltonian function $H$ is real-analytic and bounded on $D$, and it admits a holomorphic extension to some complex neighbourhood of $D$ of the form
\[ D_{r,s} = \{(\theta, I) \in (\mathbb{C}^n/\mathbb{Z}^n) \times \mathbb{C}^n \mid |\mathrm{Im}(\theta)| < s, \; d(I, B) < r\}, \]
with two fixed numbers $r > 0$, $s > 0$, and where $\mathrm{Im}(\theta)$ is the imaginary part of $\theta$, $|\,.\,|$ the supremum norm on $\mathbb{C}^n$ and $d$ the associated distance on $\mathbb{C}^n$. Equivalently, one can start with a Hamiltonian $H$, defined and holomorphic on $D_{r,s}$, which preserves reality, that is $H$ is real-valued for real arguments. Without loss of generality, we may assume that $r < 1$ and $s < 1$. The space of such analytic functions on $D_{r,s}$, endowed with the supremum norm $|\,.\,|_{r,s}$, is a Banach algebra with respect to the multiplication of functions, and we shall denote it by $A_{r,s}$.

Our Hamiltonian $H \in A_{r,s}$ is assumed to be close to integrable, that is of the form
\[ \begin{cases} H(\theta, I) = h(I) + f(\theta, I) \\ |f|_{r,s} < \varepsilon \end{cases} \tag{$*$} \]
[…] 1, that is
\[ |\partial^k h(I)| \leq M, \qquad 1 \leq |k|_1 \leq 3, \quad I \in B, \]
where $|k|_1 = |k_1| + \cdots + |k_n|$.



3.2.1.2. In order to obtain results of exponential stability, we do need to impose some non-degeneracy condition on the unperturbed Hamiltonian. Let $G(n, k)$ be the set of all vector subspaces of $\mathbb{R}^n$ of dimension $k$. We endow $\mathbb{R}^n$ with the Euclidean scalar product, $\|\,.\,\|$ stands for the Euclidean norm, and given an integer $L \in \mathbb{N}^*$, we define $G^L(n, k)$ as the subset of $G(n, k)$ consisting of those subspaces whose orthogonal complement can be spanned by vectors $k \in \mathbb{Z}^n$ with $|k|_1 \leq L$.

Definition 3.3. A function $h \in C^2(B)$ is said to be SDM if there exist $\gamma > 0$ and $\tau \geq 0$ such that for any $L \in \mathbb{N}^*$, any $k \in \{1, \dots, n\}$ and any $\Lambda \in G^L(n, k)$, there exists $(e_1, \dots, e_k)$ (resp. $(f_1, \dots, f_{n-k})$), an orthonormal basis of $\Lambda$ (resp. of $\Lambda^\perp$), such that the function $h_\Lambda$ defined on $B$ by
\[ h_\Lambda(u, v) = h(u_1 e_1 + \cdots + u_k e_k + v_1 f_1 + \cdots + v_{n-k} f_{n-k}), \]
satisfies the following: for any $(u, v) \in B$,
\[ \|\partial_u h_\Lambda(u, v)\| \leq \gamma L^{-\tau} \;\Longrightarrow\; \|\partial_{uu} h_\Lambda(u, v) \cdot \eta\| > \gamma L^{-\tau} \|\eta\| \]
for any $\eta \in \mathbb{R}^k \setminus \{0\}$.

In other words, for any $(u, v) \in B$, we have the following alternative: either $\|\partial_u h_\Lambda(u, v)\| > \gamma L^{-\tau}$ or $\|\partial_{uu} h_\Lambda(u, v) \cdot \eta\| > \gamma L^{-\tau} \|\eta\|$ for any $\eta \in \mathbb{R}^k \setminus \{0\}$. This technical definition, which is a slight variation of a notion introduced in [Nie07b], is basically a quantitative transversality condition stated in adapted coordinates. It is inspired on the one hand by the steepness condition introduced by Nekhoroshev ([Nek77]), where one has to look at the projection of the gradient map $\nabla h$ onto affine subspaces, and on the other hand by the quantitative Morse-Sard theory of Yomdin ([Yom83], [YC04]), where critical or "nearly-critical" points of $h$ have to be quantitatively non-degenerate. The abbreviation SDM stands for "Simultaneous Diophantine Morse" functions, and we refer to Appendix 3.B for more explanations on this condition and some justification of the latter terminology.

3.2.1.3. The set of SDM functions on $B$ with respect to $\gamma > 0$ and $\tau \geq 0$ will be denoted by $SDM^\tau_\gamma(B)$, and we will also use the notations
\[ SDM^\tau(B) = \bigcup_{\gamma > 0} SDM^\tau_\gamma(B), \qquad SDM(B) = \bigcup_{\tau \geq 0} SDM^\tau(B). \]
The following result states that SDM functions are generic among sufficiently smooth functions.

Theorem 3.4. Let $\tau > 2(n^2 + 1)$ and $h \in C^{2n+2}(B)$. For Lebesgue almost all $\xi \in \mathbb{R}^n$, the function $h_\xi(I) = h(I) - \xi \cdot I$ belongs to $SDM^\tau(B)$.

More precisely, there is a good notion of "full measure" in an infinite dimensional vector space, which is called prevalence (see [KH08] and [OY05] for nice surveys), and the previous theorem immediately gives the following result.

Corollary 3.5. For $\tau > 2(n^2 + 1)$, $SDM^\tau(B)$ is prevalent in $C^{2n+2}(B)$.

3.2.1.4. Now we can state the main result of this chapter.



Theorem 3.6. Let $H$ be as in (∗) and assume that the integrable part $h$ belongs to $SDM^\tau_\gamma(B)$ with $\tau \geq 2$ and $\gamma \leq 1$. Then there exist positive constants $a$, $b$ and $\varepsilon_0$ depending only on $h$ such that if $\varepsilon \leq \varepsilon_0$, for every initial action $I(0) \in B_{R/2}$ the estimates
\[ |I(t) - I(0)| < (n^2 + 1)\varepsilon^b, \qquad |t| < \exp(\varepsilon^{-a}), \]
hold true.

More precisely, we can choose the exponents
\[ a = b = 3^{-1}(2(n + 1)\tau)^{-n}, \]
and $\varepsilon_0$ depending on the whole set of parameters $n$, $R$, $r$, $s$, $M$, $\gamma$ and $\tau$, but no effort was made to improve the stability exponents since the optimality of the constants involved is not our goal. Actually, this optimality is not relevant for generic integrable Hamiltonians.
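To get a feeling for the size of these exponents, the following sketch evaluates the formula $a = b = 3^{-1}(2(n+1)\tau)^{-n}$ exactly. The function name `stability_exponent` is hypothetical, and the sample values of $n$ and $\tau$ are only illustrations (the second one respects the constraint $\tau > 2(n^2+1)$ of Theorem 3.4).

```python
from fractions import Fraction

def stability_exponent(n, tau):
    """Common value a = b = (1/3) * (2*(n+1)*tau)**(-n) from Theorem 3.6."""
    if n < 2 or tau < 2:
        raise ValueError("Theorem 3.6 is stated for n >= 2 and tau >= 2")
    return Fraction(1, 3) / (2 * (n + 1) * Fraction(tau)) ** n

print(stability_exponent(2, 2))   # 1/432
print(stability_exponent(3, 21))  # tau slightly above 2*(n**2 + 1) = 20
```

The exponents decay very quickly with $n$, which is consistent with the remark that no attempt was made to optimize them.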

Let us add that the only property used on the integrable part h to derive these<br />

estimates is a specific steepness property, therefore the proof is also valid, and in fact<br />

simpler, assuming the original steepness condition of Nekhoroshev (see Appendix 3.B).<br />

However, note that it is precisely this "weaker" genericity assumption that allows new results of stability near linearly stable invariant tori (see the next chapter).

We emphasize again that it is not the result itself, but the method of proof, which is new and leads to many improvements, as we explain below.

3.2.2 Comments and prospects<br />

To conclude this section we mention other problems for which our method should apply,<br />

mainly the study of elliptic fixed points, Nekhoroshev’s estimates in lower regularity<br />

and finally estimates in large or infinite dimensional Hamiltonian systems. In all these topics, the method of periodic averaging has already proved to be very useful.

3.2.2.1. First, our analytic arguments are very intrinsic, and this is important in the study of the stability of elliptic fixed points in Hamiltonian systems. Actually, in this case the transformation to action-angle variables (via symplectic polar coordinates) admits singularities which do not allow one to derive stability results directly from Nekhoroshev’s theory. In the convex case, this problem has been overcome independently by Fassò, Guzzo and Benettin ([FGB98], [GFB98]) and by Niederman ([Nie98]). Both use Cartesian coordinates; the first study uses the classical approach and adapted Fourier expansions, while the second one relies on periodic averaging and simultaneous Diophantine approximation. The latter proof was clarified by Pöschel ([Pös99b]). With our approach, we can remove the convexity hypothesis and obtain exponential stability around an elliptic fixed point under a generic assumption on the non-linear part. Furthermore, assuming a Diophantine condition on the normal frequency, it is well known since Morbidelli and Giorgilli ([MG95]) that one can even obtain super-exponential stability by combining a sufficiently large number of Birkhoff normalizations with Nekhoroshev’s estimates. Here, with our method, generic results of super-exponential stability around elliptic fixed points are also available, and similarly around invariant Diophantine Lagrangian tori and even isotropic reducible linearly stable tori. All these results are contained in the next chapter.



3.2.2.2. Furthermore, one should mention that periodic averaging is well suited to non-analytic Hamiltonians, and our formalism should also carry over to this context. The advantage of periodic averaging is already clear at the linear level when solving the homological equation: if the system is of finite differentiability, then the solution of the homological equation, when expanded in Fourier series, is subject to a disastrous loss of derivatives (larger than the number of degrees of freedom), while with the explicit integral formula this loss of derivatives is minimal. Hence we can expect stability estimates in finite differentiability, for both convex and generic unperturbed Hamiltonians, but of course with a polynomial bound on the time of stability. Note that the analyticity of the system under study is only needed for the construction of normal forms up to an exponentially small remainder, whereas our steepness condition is generic for Hamiltonians of finite but sufficiently high regularity. Concerning Gevrey regularity, Marco and Sauzin ([MS02]) have already proved exponential estimates of stability in the convex case, and for $C^k$ regularity, polynomial estimates of stability will be proved in the fifth chapter. With our method these results should also hold for a generic integrable Hamiltonian. It can also be noticed that the analytical properties of the expansions arising in periodic averaging are accurately known ([Nei84], [RS96]).

3.2.2.3. Finally, results of stability for large Hamiltonian systems as a model for statistical mechanics have been obtained by Bambusi and Giorgilli ([BG93]) and Bourgain ([Bou04]), and for non-linear evolution PDEs seen as infinite dimensional Hamiltonian systems mostly by Bambusi ([Bam99], [BN02]), later clarified by Pöschel ([Pös99a]). All these works use Lochak’s approach in the convex case. We believe that our method should allow one to remove the convexity assumption in those results and to obtain more general statements. Actually, our approach relies on a property of composition of averaging transformations which was already used by Bambusi ([Bam99]).

3.2.2.4. This chapter is organized as follows. In the next section, we state our normal<br />

form and explain the main ideas, and then we give the proof of Theorem 3.6. The<br />

complete proof of the normal form is deferred to Appendix 3.A, and in Appendix 3.B<br />

we collect the basic properties of SDM functions that we shall need and we prove Theorem<br />

3.4 and Corollary 3.5.<br />

3.2.2.5. In the text, we shall adopt the following notation taken from [Pös99b]: we<br />

will write u



with H ∈ Ar,s. As usual, the proof of exponential stability estimates splits into an<br />

analytic part and a geometric part.<br />

The analytic part is contained in section 3.3.1. It consists in the construction of normal<br />

forms on a neighbourhood of specific resonances, that is suitable coordinates which<br />

display the relevant part of the perturbation on such a neighbourhood. Basically, we will<br />

reduce the perturbation to a so-called resonant term which is dynamically significant,<br />

and a general term which will only cause exponentially small deviations.<br />

The geometric part is expanded in section 3.3.2, and it is mainly based on the<br />

properties of the underlying integrable system. The strategy will be first to define a<br />

class of solutions, which we call restrained, and for which it is obvious from our normal<br />

forms that they are stable for an exponentially long time. Using this intermediate result,<br />

we will then show that all solutions are in fact exponentially stable, and our main tools<br />

to do this will be an adapted steepness property satisfied by our integrable system, as<br />

well as a basic theorem of Dirichlet on simultaneous Diophantine approximation.<br />

3.3.1 Analytical part<br />

3.3.1.1. Let us begin by describing the neighbourhoods of resonances we will consider.<br />

Given a sequence of linearly independent periodic vectors (ω1, . . ., ωn), with periods<br />

(T1, . . .,Tn), we define in the complex phase space, for j ∈ {1, . . .,n}, the domains<br />

Drj,sj (ωj) = {(θ, I) ∈ Drj,sj | |∇h(I) − ωj|


68 Generic exponential stability without small divisors<br />

We will write lj for the linear integrable Hamiltonian with frequency ωj, that is<br />

lj(I) = ωj.I for j ∈ {1, . . .,n}. For any function f, we will denote [f]j its average along<br />

the periodic flow generated by lj, that is<br />

\[ [f]_j = \frac{1}{T_j} \int_0^{T_j} f \circ \Phi^{l_j}_s \, ds. \]
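As a numerical illustration of this averaging, here is a minimal sketch (not part of the original argument). It assumes that the angles live on the torus R^n/Z^n, so that the flow of lj is θ ↦ θ + sωj, and that a vector ω is T-periodic when Tω is an integer vector. The helper `periodic_average` and the two test functions are hypothetical names introduced only for this example.

```python
import math

def periodic_average(f, theta, omega, period, steps=2000):
    """Approximate (1/T) * integral_0^T f(theta + s*omega) ds by a Riemann sum."""
    total = 0.0
    for k in range(steps):
        s = period * k / steps
        total += f([t + s * w for t, w in zip(theta, omega)])
    return total / steps

# Assumption: omega = (1, 1/2) is T-periodic with T = 2, since T*omega = (2, 1).
omega = (1.0, 0.5)
period = 2.0

# Fourier mode k = (1, -2): k.omega = 0, so the mode is resonant and survives.
resonant = lambda th: math.cos(2 * math.pi * (th[0] - 2 * th[1]))
# Fourier mode k = (1, 0): k.omega = 1 != 0, so the mode is averaged out.
nonresonant = lambda th: math.cos(2 * math.pi * th[0])

theta0 = (0.1, 0.3)
avg_res = periodic_average(resonant, theta0, omega, period)
avg_non = periodic_average(nonresonant, theta0, omega, period)
print(avg_res, avg_non)
```

In this sketch the resonant mode is left unchanged by the averaging while the non-resonant one is removed, which is the mechanism behind the commutation {gj, li} = 0 in the normal form.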

3.3.1.3. Our interest here is to obtain normal forms on nearly-periodic tori up to an<br />

exponentially small remainder with respect to some parameter m ∈ N, that we will<br />

choose later of order ε −1 (during the proof of Theorem 3.6). To this end, we will need<br />

the following conditions (Aj), for j ∈ {1, . . .,n}, where (A1) is<br />

<br />

and for j ∈ {2, . . ., n}, (Aj) is<br />

mT1ε ·



with {gj, li} = 0 for i ∈ {1, . . ., j} and the estimates<br />

Moreover, we have Ψj = Φ1 ◦ · · · ◦ Φj with<br />

|∂θgj|2rj/3,2sj/3



Remark 3.8. Note that this property was actually used by Bambusi ([Bam99], Lemma<br />

8.4).<br />

3.3.1.5. Let us now examine the dynamical consequences of our normal form. As usual,<br />

it will be used to control the directions, if any, in which the action variables in these<br />

new coordinates can actually drift, and we shall come back to our original coordinates<br />

at the beginning of section 3.3.2.<br />

Under the assumptions of Proposition 3.1, consider the Hamiltonian<br />

Hj = H ◦ Ψj = h + gj + fj<br />

on the domain D2rj/3,2sj/3(ωj). Let Mj be the Z-module<br />

\[ M_j = \{k \in \mathbb{Z}^n \mid k.\omega_i = 0,\ i \in \{1, \dots, j\}\}, \]

whose rank is n − j, and Λj = Mj ⊗ R the vector space spanned by Mj.<br />

The following lemma is completely obvious using the definition of the Poisson<br />

bracket.<br />

Lemma 3.9. The equality {gj, li} = 0, for all i ∈ {1, . . ., j}, is equivalent to ∂θgj ∈ Λj.<br />

Now consider a solution (θ j (t), I j (t)) of Hj with an initial action I j (tj) ∈ B2rj/3(ωj)<br />

for some tj ∈ R, and define the time of escape of this solution as the smallest time<br />

˜tj ∈]tj, +∞] for which I j (˜tj) /∈ B2rj/3(ωj). The only information we shall use from our<br />

normal form is contained in the next proposition.<br />

Proposition 3.2. Let Πj be the projection onto the linear subspace Λj, then with the<br />

previous notations, we have<br />

In particular,<br />

|I j (t) − I j (tj) − Πj(I j (t) − I j (tj))|



and therefore<br />

for t ∈ [tj, e m [∩[tj, ˜tj[.<br />

|I j (t) − I j (tj) − Πj(I j (t) − I j (tj))|



Given a solution (θ(t), I(t)) ∈ B starting at time t0 = 0, we can define inductively<br />

the "averaged" solution (θ i (t), I i (t)) for i ∈ {1, . . .,n} by<br />

Φi(θ i (t), I i (t)) = (θ i−1 (t), I i−1 (t))<br />

as long as I i−1 (t) ∈ Bi, with (θ 0 (t), I 0 (t)) = (θ(t), I(t)). Moreover, using our estimate<br />

on Φi we have<br />

|I i (t) − I i−1 (t)| ·

Definition 3.10. Given r0 > 0 and m ∈ N, a solution (θ(t), I(t)) of the Hamiltonian (∗), starting at time t0 = 0, is said to be restrained (by r0, up to time $e^m$) if we can find sequences of:

(1) radii (r1, . . .,rn), with 0 < rn < · · · < r1 < r0;<br />

(2) widths (s1, . . .,sn), with 0 < sn < · · · < s1;<br />

(3) independent periodic vectors (ω1, . . .,ωn), with periods (T1, . . .,Tn);<br />

(4) times (t1, . . .,tn), with 0 = t0 ≤ t1 ≤ · · · ≤ tn ≤ tn+1 = $e^m$,

satisfying, for j ∈ {0, . . ., n − 1}, conditions (Aj+1) and the following conditions (Bj) defined by

\[ \begin{cases} |I^j(t) - I^j(t_j)| < r_j, & t \in [t_j, t_{j+1}], \\ |\nabla h(I^j(t_{j+1})) - \omega_{j+1}| < r_{j+1}. \end{cases} \tag{$B_j$} \]

Before explaining this definition, we need to make several remarks. First, for j ∈<br />

{0, . . .,n−2} we will see that the first condition of (Bj+1) is well defined by the second<br />

condition of (Bj). Furthermore, for j ∈ {0, . . .,n−1} the last condition in (Bj) implies<br />

in particular that the set Bj+1(ωj+1) is non-empty so we may remove this assumption<br />

from (Aj+1). Finally, we can choose the same sequence of widths (s1, . . .,sn) for all<br />

solutions, therefore we may already fix si ·=s with a suitable positive constant and this<br />

simplifies some conditions (for instance, the condition mTjrj ·< sj appearing in (Aj) will<br />

be replaced by mTjrj ·< 1).<br />

We have chosen the word "restrained" because for such a solution the actions I(t)<br />

(or some properly normalized actions I j (t)) are forced to pass close to a resonance at the<br />

time t = tj, the multiplicity of which decreases as j increases, and moreover the variation<br />

of these (normalized) actions is controlled on each time interval [tj, tj+1]. Hence after<br />

the time tn, the actions are in a domain free of resonances and they are easily confined<br />

in view of the last part of Proposition 3.2. This is reminiscent of the original mechanism<br />

of Nekhoroshev, but the fact that we consider each solution individually will greatly<br />

simplify this geometric part.<br />

3.3.2.3. Let us see how the actions of a restrained solution are easily confined for an<br />

exponentially long time with respect to m. We shall write

\[ \rho_j = r_1 + \cdots + r_j, \]

for j ∈ {1, . . ., n}.



Proposition 3.3. Consider a restrained solution (θ(t), I(t)), with an initial action<br />

I(0) ∈ BR/2. If<br />

(i) ε ·



3.3.2.4. Restrained solutions are exponentially stable, and now we will show that this<br />

is in fact true for all solutions. However, to use our steepness arguments this will be<br />

done quite indirectly, and so it is useful to introduce the following definition.<br />

Definition 3.11. Given r0 > 0 and m ∈ N, a solution (θ(t), I(t)) of the Hamiltonian (∗), starting at time t0 = 0, is said to be drifting (to r0, before time $e^m$) if there exists a time t∗ satisfying

\[ |I(t_*) - I(0)| = (n^2 + 1) r_0, \quad 0 < t_* < e^m. \]

Of course, this definition makes sense only if $(n^2 + 1) r_0 < R/2$. In view of Proposition

3.3, drifting solutions cannot be restrained. However, we will prove below that if<br />

such a drifting solution exists, it has to be restrained under some assumptions on r0, m<br />

and ε, which will eventually prove that all solutions are in fact exponentially stable.<br />

More precisely, assuming the existence of a drifting solution, we will construct a<br />

sequence of radii (r1, . . .,rn), an increasing sequence of times (t1, . . .,tn) and a sequence<br />

of linearly independent vectors (ω1, . . ., ωn), with periods (T1, . . .,Tn) satisfying, for<br />

j ∈ {0, . . ., n−1}, assumptions (Aj+1) and (Bj). All sequences will be built inductively,<br />

and we first describe the tools that we shall need.<br />

3.3.2.5. For j ∈ {1, . . .,n}, recall that Λj is the vector space spanned by

\[ M_j = \{k \in \mathbb{Z}^n \mid k.\omega_i = 0,\ i \in \{1, \dots, j\}\}, \]

and that Πj (resp. $\Pi_j^\perp$) is the projection onto Λj (resp. $\Lambda_j^\perp$). Let us define the integer

\[ L_j = \sup_{i \in \{1, \dots, j\}} \{|T_i \omega_i|\} \in \mathbb{N}^*, \quad j \in \{1, \dots, n-1\}. \]

For completeness, we set $\Lambda_0 = \mathbb{R}^n$, $L_0 = 1$ and in this case Π0 is nothing but the identity. To construct the sequence of times, we will rely on the fact that our integrable part h belongs to $SDM^\tau_\gamma(B)$, so that it satisfies the following steepness property (see Appendix 3.B).

Lemma 3.12. For j ∈ {0, . . .,n − 1}, let λj be any affine subspace with direction Λj,<br />

and take r < 1. Then for any continuous curve Γ from [0, 1] to Λj ∩ B with length<br />

|Γ(0) − Γ(1)| = r ·r 2 .<br />

Proof. For any j ∈ {1, . . ., n − 1}, the orthogonal complement of Λj is spanned by ω1, . . .,ωj, hence by the integer vectors T1ω1, . . .,Tjωj, so that Λj belongs to $G^{L_j}(n, n-j)$ with the integer Lj defined above. Therefore one can apply Proposition 3.7 in Appendix 3.B to get the required properties (note that here we are using the supremum norm instead of the Euclidean norm, so the implicit constants are different).

For j = 0, the curve Γ goes from [0, 1] to $B = B \cap \mathbb{R}^n$, but since the orthogonal complement of $\mathbb{R}^n$ is trivial one can take L0 = 1.



3.3.2.6. To construct the sequence of periodic vectors, we shall use the following<br />

lemma, which is a straightforward application of Dirichlet’s theorem on simultaneous<br />

Diophantine approximation (see [Cas57]).<br />

Lemma 3.13. Given any vector $v \in \mathbb{R}^n$ and any real number Q > 0, there exists a T-periodic vector ω satisfying

\[ |v - \omega| \leq T^{-1} Q^{-\frac{1}{n-1}}, \quad |v|^{-1} \leq T \leq Q |v|^{-1}. \]

Proof. Fix any real number Q > 0. We can write the vector v, up to re-ordering its components, as v = |v|(±1, x) with $x \in \mathbb{R}^{n-1}$, and it will be enough to approximate x by a periodic vector. By a theorem of Dirichlet, we can find an integer q, with 1 ≤ q < Q, such that

\[ |qx - p| \leq Q^{-\frac{1}{n-1}}, \]

for some $p \in \mathbb{Z}^{n-1}$. The vector $q^{-1}p$ is trivially q-periodic, hence the vector $\omega = |v|(\pm 1, q^{-1}p)$ is T-periodic, with $T = |v|^{-1}q$, therefore

\[ |v|^{-1} \leq T \leq Q|v|^{-1}, \]

and we have the estimate

\[ |v - \omega| \leq T^{-1}|qx - p| \leq T^{-1} Q^{-\frac{1}{n-1}}. \]
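The construction in this proof can be sketched numerically. The following is a minimal illustration under stated assumptions, not the author's code: it uses the supremum norm, assumes v ≠ 0 and Q > 1, and finds q by brute force instead of the pigeonhole argument behind Dirichlet's theorem. The function names `dirichlet_approximation` and `periodic_approximation` are hypothetical.

```python
import math

def dirichlet_approximation(x, Q):
    """Brute-force search over integers 1 <= q < Q for the q that brings
    q*x closest (in sup norm) to an integer vector p; returns (q, p, error)."""
    best = None
    for q in range(1, math.ceil(Q)):
        p = [round(q * xi) for xi in x]
        err = max(abs(q * xi - pi) for xi, pi in zip(x, p))
        if best is None or err < best[2]:
            best = (q, p, err)
    return best

def periodic_approximation(v, Q):
    """Return (omega, T) with T*omega an integer vector, following the proof:
    normalise v by a largest component, then approximate the remaining ones."""
    n = len(v)
    norm = max(abs(c) for c in v)                  # sup norm |v|, assumed non-zero
    lead = max(range(n), key=lambda i: abs(v[i]))  # index of a largest component
    x = [v[i] / norm for i in range(n) if i != lead]
    q, p, _ = dirichlet_approximation(x, Q)
    rest = iter(p)
    omega = [math.copysign(norm, v[i]) if i == lead else norm * next(rest) / q
             for i in range(n)]
    return omega, q / norm

v = (1.0, math.sqrt(2.0), math.sqrt(3.0))
Q = 50.0
omega, T = periodic_approximation(v, Q)
distance = max(abs(a - b) for a, b in zip(v, omega))
bound = Q ** (-1.0 / (len(v) - 1)) / T   # right-hand side T^{-1} Q^{-1/(n-1)}
print(omega, T, distance, bound)
```

The brute-force search costs O(Q) evaluations, which is enough for small examples; the lemma itself only needs the existence of q.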

3.3.2.7. Now we can finally prove that drifting solutions are in fact restrained under<br />

some assumptions. This will be done inductively, and for technical reasons we separate<br />

the first step (Proposition 3.4) from the general inductive step (Proposition 3.5).<br />

Proposition 3.4. Let (θ(t), I(t)) be a drifting solution. If<br />

(i) r0 ·



Now using the fact that $h \in SDM^\tau_\gamma(B)$ and r0 ·< γ (recall that L0 = 1), we can apply Lemma 3.12 (the case j = 0) to the curve Γ1 restricted to $[0, t_0^*]$ to find a time $t_1 \in [0, t_0^*]$ for which

\[ \begin{cases} |I(t) - I(0)| < r_0, & t \in [0, t_1], \\ |\nabla h(I(t_1))| \mathrel{\cdot\!>} r_0^2. \end{cases} \tag{12} \]

The first inequality of (12) gives (a).

Now choose $Q_1 = \varepsilon^{-a_1(n-1)}$, for some positive constant a1 yet to be chosen, and apply Lemma 3.13 to approximate ∇h(I(t1)) by a T1-periodic vector ω1, that is

\[ |\nabla h(I(t_1)) - \omega_1| \leq T_1^{-1} Q_1^{-\frac{1}{n-1}}. \]

Moreover, since

$r_0^2$



Proof. First note that for j = 1, we do not require that t1, ω1 and r1 satisfy (A1) since<br />

this is implied by the conditions (i), (ii) and (iii), and for j > 1, the same conditions<br />

reduce assumption (Aj+1) to the inclusion of real domains Brj+1 (ωj+1) ⊆ B2rj/3(ωj)<br />

(recall that by condition (Bj−1) these domains are non-empty, and that we have already<br />

fixed sj ·= s).<br />

Therefore, we need to construct tj+1, ωj+1 and rj+1 satisfying<br />

(a) |I j (t) − I j (tj)| < rj, t ∈ [tj, tj+1];<br />

(b) |∇h(I j (tj+1)) − ωj+1| < rj+1;<br />

(c) ωj+1 is independent of (ω1, . . .,ωj);<br />

(d) Brj+1 (ωj+1) ⊆ B2rj/3(ωj),<br />

and the estimates (14).<br />

Let ˜tj be the maximal time of existence within Bj of the solution I j (t) starting at<br />

I j (tj). Since (Aj) is satisfied, we can apply Proposition 3.2 and for t ∈ [tj, ˜tj] ∩ [tj, e m ],<br />

we have<br />

|I j (t) − I j (tj) − Πj(I j (t) − I j (tj))| ·rj. (16)



But conditions (v) and (vi) give in particular<br />

so that (15) and (16) yields<br />

ε ··rj.<br />

Now using (v) again, this gives

\[ |\Pi_j(I^j(\tilde{t}_j) - I^j(t_j))| \mathrel{>\!\cdot} (T_j r_j L_j^{-1})^{\tau}, \]

and so we can certainly find a time $t_j^* \in [t_j, \tilde{t}_j]$ such that

\[ |\Pi_j(I^j(t_j^*) - I^j(t_j))| = (T_j r_j L_j^{-1})^{\tau}. \]

Second case: $\tilde{t}_j > e^m$. We will first prove that $t_* \in [t_j, e^m]$. Indeed, otherwise t∗ belongs to [tk, tk+1] for some k ∈ {0, . . ., j − 1} and we can write

\[ |I(t_*) - I(0)| \leq |I(t_*) - I(t_k)| + \sum_{i=0}^{k-1} |I(t_{i+1}) - I(t_i)|. \tag{17} \]

Each term of the right-hand side of (17) is easily estimated: using (Bi) for i ∈ {0, . . ., k − 1} we have

\[ |I^i(t_{i+1}) - I^i(t_i)| < r_i, \quad |I^k(t_*) - I^k(t_k)| < r_k, \]

which implies, by the triangle inequality and the estimate (8),

\[ |I(t_{i+1}) - I(t_i)| < 2\rho_i + r_i, \quad |I(t_*) - I(t_k)| < 2\rho_k + r_k. \]

Moreover,

\[ |I(t_1) - I(0)| < r_0, \]

hence we find

\[ |I(t_*) - I(0)| < \sum_{i=1}^{k} (2\rho_i + r_i) + r_0 < (n^2 + 1) r_0, \]

which of course contradicts the definition of our drifting time t∗.
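The last inequality is elementary arithmetic: since the radii are decreasing, ρi = r1 + · · · + ri < i·r0, so the sum is less than r0 times the sum of (2i + 1) for i from 1 to k, that is (k² + 2k)·r0, which is at most (n² − 1)·r0 for k ≤ n − 1. The following minimal sketch checks this on randomly generated decreasing radii; the function name `drift_bound` and the sampled values are assumptions made only for this illustration.

```python
import random

def drift_bound(radii, k):
    """radii = [r0, r1, ..., rn] in decreasing order.
    Return sum_{i=1}^{k} (2*rho_i + r_i) + r0 with rho_i = r1 + ... + ri."""
    total = radii[0]
    rho = 0.0
    for i in range(1, k + 1):
        rho += radii[i]
        total += 2.0 * rho + radii[i]
    return total

random.seed(0)
all_below = True
for n in range(2, 8):
    for _ in range(200):
        # Hypothetical sample of radii r0 > r1 > ... > rn > 0.
        radii = sorted((random.uniform(0.01, 1.0) for _ in range(n + 1)), reverse=True)
        for k in range(n):
            all_below = all_below and drift_bound(radii, k) < (n * n + 1) * radii[0]
print(all_below)
```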

Now to prove the claim, we argue by contradiction and suppose that

\[ |\Pi_j(I^j(t) - I^j(t_j))| < (T_j r_j L_j^{-1})^{\tau}, \quad t \in [t_j, e^m]. \]

Since $t_* \in [t_j, e^m]$, we can use the previous inequality together with the estimate (15) and both conditions (v) and (vi) to first obtain

\[ |I^j(t_*) - I^j(t_j)| < r_j, \]

and then with the triangle inequality

\[ |I(t_*) - I(t_j)| < 2\rho_j + r_j. \]



Now, as in the argument above, writing

\[ |I(t_*) - I(0)| \leq |I(t_*) - I(t_j)| + \sum_{i=0}^{j-1} |I(t_{i+1}) - I(t_i)| \]

we find the same contradiction on the time t∗, which completes the proof of the claim.

Now consider the restriction of the curve Γj+1 to the interval $[t_j, t_j^*]$. Using our claim together with conditions (iv) and (v), we can apply Lemma 3.12 to find a time $t_{j+1} \in [t_j, t_j^*]$ such that

\[ \begin{cases} |\Pi_j(I^j(t) - I^j(t_j))| < (T_j r_j L_j^{-1})^{\tau}, & t \in [t_j, t_{j+1}], \\ |\Pi_j(\nabla h(\Gamma_{j+1}(t_{j+1})))| \mathrel{\cdot\!>} (T_j r_j L_j^{-1})^{2\tau}. \end{cases} \tag{18} \]

The first inequality of (18), together with (15) and conditions (v) and (vi), gives

\[ |I^j(t) - I^j(t_j)| < r_j \]

for t ∈ [tj, tj+1], hence (a) is verified. Now as in the first step, choose $Q_{j+1} = \varepsilon^{-a_{j+1}(n-1)}$ for some positive constant aj+1 to be chosen later, and apply Lemma 3.13 to approximate $\nabla h(I^j(t_{j+1}))$ by a Tj+1-periodic vector ωj+1, that is

\[ |\nabla h(I^j(t_{j+1})) - \omega_{j+1}| \leq T_{j+1}^{-1} Q_{j+1}^{-\frac{1}{n-1}} = T_{j+1}^{-1} \varepsilon^{a_{j+1}}. \tag{19} \]

Let $r_{j+1} \mathrel{=\!\cdot} T_{j+1}^{-1} \varepsilon^{a_{j+1}}$ so that (b) is verified by (19). To estimate the period Tj+1 and the number Lj, we need a lower bound for $|\nabla h(I^j(t_{j+1}))|$ and we will use the fact that we have such a lower bound for |∇h(I(t1))| (see the second inequality of (12)). First note that one has easily

|I j (tj+1) − I(t1)|



The estimates (21) and (23) give (14) (note that we have a similar estimate for Lj+1,<br />

however at the end we shall only need estimates for L1, . . .,Ln−1).<br />

Next having built rj+1, we need to check that ωj+1 is independent of (ω1, . . .,ωj). First, by using the mean value theorem, the estimate (15) and our condition (vi), we have

\[ |\nabla h(I^j(t_{j+1})) - \nabla h(\Gamma_{j+1}(t_{j+1}))| \mathrel{\cdot\!<} (T_j r_j L_j^{-1})^{2\tau}, \]

and together with the second estimate of (18), this gives

\[ |\Pi_j(\nabla h(I^j(t_{j+1})))| \mathrel{\cdot\!>} (T_j r_j L_j^{-1})^{2\tau}. \tag{24} \]

Furthermore, using (19)

hence with (viii), we get

|Πj(∇h(I j (tj+1)) − ωj+1)| ≤ |∇h(I j (tj+1)) − ωj+1| · TjrjL −12τ<br />

j<br />

and so Πj(ωj+1) is non zero, which means that ωj+1 is not a linear combination of<br />

{ω1, . . .,ωj}. This proves (c).<br />

Finally we can write

\[ |\omega_{j+1} - \omega_j| \leq |\omega_{j+1} - \nabla h(I^j(t_{j+1}))| + |\nabla h(I^j(t_{j+1})) - \nabla h(I^j(t_j))| + |\nabla h(I^j(t_j)) - \nabla h(I^{j-1}(t_j))| + |\nabla h(I^{j-1}(t_j)) - \omega_j|, \]

and hence

|ωj+1 − ωj|

So given any $I \in B_{r_{j+1}}(\omega_{j+1})$, we have



(iii) $(T_j r_j L_j^{-1})^{\tau} \mathrel{\cdot\!<} r_j$, for j ∈ {1, . . .,n − 1};

(iv) mTjrj ·



So we need to choose positive constants aj, j ∈ {1, . . ., n}, a and b such that the previous<br />

conditions are satisfied. First note that by (i ′ ), the sequence aj, for j ∈ {1, . . ., n}, has<br />

to be increasing, hence<br />

max ai = an.<br />

i∈{1,...,j}<br />

Then using (v′), we observe that (i′) is satisfied if $a_{j+1} = 2\tau(n+1)a_j$ for j ∈ {1, . . ., n − 1}, that is

\[ a_j = (2\tau(n+1))^{j-1} a_1. \]

Now for (ii′) to be satisfied, one can choose

\[ a_1 = (2\tau(n+1))^{-n}, \]

so aj, for j ∈ {2, . . ., n}, is determined by

\[ a_j = (2\tau(n+1))^{-n-1+j}. \]

Then, since τ ≥ 2, we may choose

\[ b = 3^{-1} a_1 = 3^{-1} (2\tau(n+1))^{-n} \]

and (iii′) easily holds. Finally, we may also choose

\[ a = b = 3^{-1} (2\tau(n+1))^{-n} \]

so that (iv′) is satisfied. With those values, it is easy to check that (v′), (vi′) and (vii′) hold, recalling that τ ≥ 2 and n ≥ 2. To conclude, just note that (viii′), (ix′), (x′) and (xi′) are satisfied if ε ≤ ε0 with a sufficiently small ε0 depending on n, R, r, s, M, γ and τ. This ends the proof.
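To get a feeling for the size of these exponents, the choice of constants can be tabulated with exact rational arithmetic. This is a minimal sketch, assuming for simplicity that τ is an integer; the function name `exponents` is hypothetical, and only the formulas for aj, a and b are taken from the text.

```python
from fractions import Fraction

def exponents(n, tau):
    """Return ({j: a_j}, b) with a_j = (2*tau*(n+1))**(-n-1+j) and b = a_1 / 3."""
    base = 2 * tau * (n + 1)          # assumption: tau is an integer here
    a = {j: Fraction(1, base ** (n + 1 - j)) for j in range(1, n + 1)}
    b = a[1] / 3
    return a, b

a, b = exponents(n=3, tau=2)
for j in sorted(a):
    print(j, a[j])
print("a = b =", b)
```

For n = 3 and τ = 2 the base 2τ(n + 1) equals 16, so the stability exponent a = b is 1/(3 · 16³), which shows how quickly the exponent deteriorates with the number of degrees of freedom.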

3.A Proof of the normal form<br />

In this first appendix we will give the proof of the normal form 3.1. We will closely<br />

follow the method of [Pös99b] and deduce our result from an equivalent version in terms<br />

of vector fields (Proposition 3.6 below).<br />

3.A.1 Preliminary estimates<br />

Before giving the proof, we will need some general estimates based on the classical<br />

Cauchy inequality.<br />

3.A.1.1. First consider the case of a function f analytic on some domain Dr,s, and<br />

recall that<br />

\[ |\partial_\theta f|_{r,s} = \max_{1 \leq i \leq n} |\partial_{\theta_i} f|_{r,s}, \quad |\partial_I f|_{r,s} = \max_{1 \leq i \leq n} |\partial_{I_i} f|_{r,s}. \]

We take r′, s′ such that 0 < r′ < r and 0 < s′ < s. The first estimate is classical, but we repeat the proof for convenience.

Lemma 3.14. Under the previous assumptions, we have

\[ |\partial_I f|_{r-r',s} < \frac{1}{r'} |f|_{r,s}, \quad |\partial_\theta f|_{r,s-s'} < \frac{1}{s'} |f|_{r,s}. \]



Proof. For $x = (\theta, I) \in D_{r-r',s}$ and any unit vector $v \in \mathbb{C}^n$, consider the function

\[ F_{x,v} : t \in \mathbb{C} \longmapsto f(\theta, I + tv) \in \mathbb{C}. \]

This function is well-defined and holomorphic on the disc |t| < r′, so the classical Cauchy estimate gives

\[ |F'_{x,v}(0)| < \frac{1}{r'} |f|_{r,s}, \]

from which the inequality for ∂If follows easily by optimizing with respect to x and v. The estimate for ∂θf is completely similar.
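The one-variable Cauchy estimate used in this proof can be checked numerically. The sketch below is only an illustration under stated assumptions: the function F(t) = exp(t) stands in for the restriction t ↦ f(θ, I + tv), the radius 0.5 plays the role of r′, and the derivative at 0 is computed from the Cauchy integral formula on the circle |t| = r′. The helper names are hypothetical.

```python
import cmath
import math

def derivative_at_zero(F, radius, samples=4096):
    """F'(0) from the Cauchy integral formula on |t| = radius:
    F'(0) = (1/(2*pi)) * integral of F(t)/t over the angle (trapezoidal rule)."""
    total = 0j
    for k in range(samples):
        t = radius * cmath.exp(2j * math.pi * k / samples)
        total += F(t) / t
    return total / samples

def sup_on_circle(F, radius, samples=4096):
    """Sampled maximum of |F| on the circle |t| = radius."""
    return max(abs(F(radius * cmath.exp(2j * math.pi * k / samples)))
               for k in range(samples))

radius = 0.5                       # plays the role of r'
derivative = derivative_at_zero(cmath.exp, radius)
cauchy_bound = sup_on_circle(cmath.exp, radius) / radius
print(abs(derivative), cauchy_bound)
```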

3.A.1.2. Now let j ∈ {1, . . .,n}, and let f and g be analytic functions defined on the<br />

domain<br />

Drj,sj (ωj) = {(θ, I) ∈ Drj,sj | |∇h(I) − ωj| 1, so we have the inequality<br />

$|X_f|_{r_j,s_j} \leq \|X_f\|_{r_j,s_j}$ and the equality holds if f is integrable. Moreover, note that each norm $\|\cdot\|_{r_j,s_j}$ is normalized with $s_1 r_1^{-1}$ (and not with $s_j r_j^{-1}$): by our inclusions of domains, this implies in particular that $\|\cdot\|_{r_{j+1},s_{j+1}} \leq \|\cdot\|_{2r_j/3,2s_j/3}$.

It is well-known how to use the Cauchy inequality to estimate the size of the Poisson<br />

bracket {f, g} in terms of f and g. Similarly, our second estimate is concerned with<br />

the size of the vector field [Xf, Xg] in terms of Xf and Xg. We take r ′ , s ′ such that<br />

0 < r ′ < rj and 0 < s ′ < sj.<br />

Lemma 3.16. Under the previous assumptions, we have

\[ \|[X_f, X_g]\|_{r_j - r', s_j - s'} < \frac{1}{r'} \|X_f\|_{r_j,s_j} \|X_g\|_{r_j,s_j}, \]

and moreover, if g is integrable, then

\[ \|[X_f, X_g]\|_{r_j - r', s_j - s'} < \frac{1}{s'} \|X_f\|_{r_j,s_j} \|X_g\|_{r_j}. \]

Proof. First recall that

\[ [X_f, X_g] = \frac{d}{dt} (\Phi^g_t)^* X_f \Big|_{t=0}. \]

Now fix $x \in D_{r_j - r', s_j - s'}$, and let us define the vector-valued function

\[ F_x : t \in \mathbb{C} \longmapsto (\Phi^g_t)^* X_f(x) \in \mathbb{C}^{2n}. \]



Clearly, the map $\Phi^g_t$ is analytic, and it sends $D_{r_j-r',s_j-s'}(\omega_j)$ into $D_{r_j,s_j}(\omega_j)$ for complex values of t satisfying

\[ |t| < r' \|X_g\|^{-1}_{r_j,s_j}, \]

hence the function Fx is well-defined and analytic on the disc $|t| < r' \|X_g\|^{-1}_{r_j,s_j}$. So applying the classical Cauchy estimate to each component of Fx and optimizing with respect to $x \in D_{r_j-r',s_j-s'}$ we obtain the desired inequality

\[ \|[X_f, X_g]\|_{r_j-r',s_j-s'} < \frac{1}{r'} \|X_g\|_{r_j,s_j} \|X_f\|_{r_j,s_j}. \]

In case g is integrable, the map $\Phi^g_t$ leaves invariant the action components, so the same reasoning can be applied on the larger disc

\[ |t| < s' \|X_g\|^{-1}_{r_j,s_j}, \]

giving the improved estimate

\[ \|[X_f, X_g]\|_{r_j-r',s_j-s'} < \frac{1}{s'} \|X_f\|_{r_j,s_j} \|X_g\|_{r_j}. \]

3.A.2 Proof of Proposition 3.1

Now we can pass to the proof of Proposition 3.1. Given $\tilde\varepsilon > 0$, which will be the size of our perturbing vector field Xf, let us introduce a slightly modified set of conditions $(\tilde{A}_j)$, for j ∈ {1, . . .,n}, where $(\tilde{A}_1)$ is

<br />

mT1˜ε ·< r1, mT1r1 ·< s1, 0 < r1



with {gj, li} = 0 for i ∈ {1, . . ., j}, and the estimates<br />

||Xgj ||2rj/3,2sj/3



Lemma 3.17 (First iterative lemma). Consider H = h+g+f on the domain Dr1,s1(ω1),<br />

with h integrable, {g, l1} = 0, and assume that<br />

If we have<br />

||Xg||r1,s1



and hence $\|X_\chi\|_{r_1,s_1} < T_1\tilde\varepsilon$. By the hypothesis $T_1\tilde\varepsilon < r' < s'$ our transformation ϕ1 maps $D_{r_1-r',s_1-s'}(\omega_1)$ into $D_{r_1,s_1}(\omega_1)$ and

\[ |\varphi_1 - \mathrm{Id}|_{r_1-r',s_1-s'} < T_1\tilde\varepsilon. \]

Therefore it remains to estimate the vector field

\[ X_{f_+} = \int_0^1 (\Phi^\chi_t)^* [X_{h-l_1} + X_g + X_{f_t}, X_\chi] \, dt, \]

and for that it is enough to estimate the brackets $[X_{f_t}, X_\chi]$, $[X_g, X_\chi]$ and $[X_{h-l_1}, X_\chi]$. Using Lemma 3.16, we find

\[ \|[X_{f_t}, X_\chi]\|_{r_1-r',s_1-s'} < \frac{1}{r'} \|X_{f_t}\|_{r_1,s_1} \|X_\chi\|_{r_1,s_1} \]

and



Proof. Our Hamiltonian is H = h + g + f, h is integrable and we have $\{g, l_i\} = 0$ for i ∈ {1, . . .,j + 1} and $\{f, l_{i'}\} = 0$ for i′ ∈ {1, . . ., j}. Once again, our transformation $\varphi_{j+1} = \Phi^\chi_1$ will be the time-one map of the Hamiltonian flow generated by some auxiliary function χ.

We choose

\[ \chi = \frac{1}{T_{j+1}} \int_0^{T_{j+1}} (f - [f]_{j+1}) \circ \Phi^{l_{j+1}}_t \, t \, dt, \tag{28} \]

where $[\,\cdot\,]_{j+1}$ is the averaging along the Hamiltonian flow of lj+1. Introducing the notation $f_t = tf + (1-t)[f]_{j+1}$, like in Lemma 3.17 we have

\[ H \circ \varphi_{j+1} = h + g_+ + f_+ \]

with

\[ g_+ = g + [f]_{j+1}, \quad f_+ = \int_0^1 \{h - l_{j+1} + g + f_t, \chi\} \circ \Phi^\chi_t \, dt. \]

We need to verify that we still have $\{g_+, l_i\} = 0$ for i ∈ {1, . . .,j + 1} and $\{f_+, l_{i'}\} = 0$ for i′ ∈ {1, . . ., j}. By definition, $\{[f]_{j+1}, l_{j+1}\} = 0$, and for i′ ∈ {1, . . ., j}, we compute

\[ \begin{aligned} \{[f]_{j+1}, l_{i'}\} &= \frac{1}{T_{j+1}} \int_0^{T_{j+1}} \{f \circ \Phi^{l_{j+1}}_t, l_{i'}\} \, dt \\ &= \frac{1}{T_{j+1}} \int_0^{T_{j+1}} \{f \circ \Phi^{l_{j+1}}_t, l_{i'} \circ \Phi^{l_{j+1}}_t\} \, dt \\ &= \frac{1}{T_{j+1}} \int_0^{T_{j+1}} \{f, l_{i'}\} \circ \Phi^{l_{j+1}}_t \, dt \\ &= 0. \end{aligned} \]

This proves that $\{g_+, l_i\} = \{g + [f]_{j+1}, l_i\} = 0$ for i ∈ {1, . . .,j + 1}. Now a completely similar calculation shows that for i′ ∈ {1, . . ., j}, $\{\chi, l_{i'}\} = 0$, hence $l_{i'} \circ \Phi^\chi_t = l_{i'}$ and therefore

\[ \{f_+, l_{i'}\} = \int_0^1 \{\{h - l_{j+1} + g + f_t, \chi\}, l_{i'}\} \circ \Phi^\chi_t \, dt. \]

The double bracket in the expression above is zero, as a consequence of the Jacobi identity and the fact that $\{h - l_{j+1} + g + f_t, l_{i'}\} = \{\chi, l_{i'}\} = 0$, hence $\{f_+, l_{i'}\} = 0$ for i′ ∈ {1, . . ., j}.

To conclude, using our hypothesis $T_{j+1}\tilde\varepsilon \mathrel{\cdot\!<} r' \mathrel{\cdot\!<} s'$, as in Lemma 3.17 we can show that our transformation ϕj+1 maps $D_{r_{j+1}-r',s_{j+1}-s'}(\omega_{j+1})$ into $D_{r_{j+1},s_{j+1}}(\omega_{j+1})$ with

|ϕj+1 − Id|rj+1−r ′ ,sj+1−s ′



Proof of Proposition 3.6. The proof is by induction on j ∈ {1, . . .,n}.<br />

First step. Here we assume $(\tilde{A}_1)$ and we will apply m times our first iterative Lemma 3.17, starting with the Hamiltonian

\[ H^0 = H = h + g^0 + f^0 \]

where $g^0 = 0$ and $f^0 = f$ and choosing uniformly at each step

\[ r' = (3m)^{-1} r_1, \quad s' = (3m)^{-1} s_1. \]

Since m ≥ 1, we have 0 < r′ < r1, 0 < s′ < s1 and using $(\tilde{A}_1)$, we have

\[ T_1\tilde\varepsilon < r' < s', \]

so that the lemma can indeed be applied at each step. For i ∈ {0, . . ., m − 1}, the Hamiltonian $H^i = h + g^i + f^i$ at step i is transformed into

\[ H^{i+1} = H^i \circ \varphi_1^i = h + g^{i+1} + f^{i+1}. \]

For each i ∈ {0, . . ., m}, we obviously have $\{g^i, l_1\} = 0$ and we claim that the estimates

||X g i|| r i 1 ,s i 1



assume that the estimates (29) are satisfied for each k ≤ i, where i ∈ {0, . . .,m − 1}.<br />

For k ∈ {0, . . ., i}, since $\|X_{f^k}\|_{r_1^k, s_1^k} < \tilde\varepsilon_k$ we get that

\[ \|X_{g^{k+1}} - X_{g^k}\|_{r_1^{k+1}, s_1^{k+1}} < \tilde\varepsilon_k, \]

and therefore

\[ \|X_{g^{i+1}}\|_{r_1^{i+1}, s_1^{i+1}} \leq \sum_{k=0}^{i} \tilde\varepsilon_k \]


3.B - SDM functions 91<br />

with $g_j^0 = 0$ and $f_j^0 = g_j$, we can apply m times our second iterative Lemma 3.18 to have the following: there exists an analytic symplectic transformation

\[ \Phi_{j+1} : D_{2r_{j+1}/3, 2s_{j+1}/3}(\omega_{j+1}) \to D_{r_{j+1},s_{j+1}}(\omega_{j+1}) \]

of the form $\Phi_{j+1} = \varphi^0_{j+1} \circ \cdots \circ \varphi^{m-1}_{j+1}$ such that $|\Phi_{j+1} - \mathrm{Id}|_{2r_{j+1}/3,2s_{j+1}/3} \mathrel{\cdot\!<} r_{j+1}$ and

\[ (h + g_j) \circ \Phi_{j+1} = h + g_j^m + f_j^m, \]

with $\{g_j^m, l_i\} = 0$ for i ∈ {1, . . ., j + 1}, and the estimates

Now we set<br />

||Xg m j ||2rj+1/3,2sj+1/3



3.B.1 Steepness<br />

3.B.1.1. We denote by $GA_B(n, k)$ the set of all affine subspaces of $\mathbb{R}^n$ of dimension k intersecting the ball B, and by $GA^L_B(n, k)$ those subspaces with direction in $G^L(n, k)$ (the latter is the space of linear subspaces of $\mathbb{R}^n$ of dimension k whose orthogonal complement is spanned by integer vectors of length less than or equal to L). Let us recall the classical steepness condition, originally introduced by N.N. Nekhoroshev ([Nek77]).

Definition 3.19. A function $h \in C^2(B)$ is said to be steep if it has no critical points and if for any k ∈ {1, . . ., n − 1}, there exist an index $p_k > 0$ and coefficients $C_k > 0$, $\delta_k > 0$ such that for any affine subspace $\lambda_k \in GA_B(n, k)$ and any continuous curve Γ from [0, 1] to λk ∩ B with

\[ \|\Gamma(0) - \Gamma(1)\| = r < \delta_k, \]

there exists t∗ ∈ [0, 1] such that:

\[ \begin{cases} \|\Gamma(t) - \Gamma(0)\| < r, & t \in [0, t_*], \\ \|\Pi_{\Lambda_k}(\nabla h(\Gamma(t_*)))\| > C_k r^{p_k}, \end{cases} \]

where $\Pi_{\Lambda_k}$ is the projection onto Λk, the direction of λk.

The function is said to be symmetrically steep (or shortly S-steep) if the above property<br />

is also satisfied for k = n, with an index pn > 0 and coefficients Cn > 0, δn > 0.<br />

Let us remark that S-steep functions are allowed to have critical points. Those definitions are rather obscure, but in fact they can be given a simpler and more geometric interpretation, as was shown by Ilyashenko ([Ily86]) and Niederman ([Nie06]). Important examples of steep functions are given by the class of strictly convex (or quasi-convex) functions, with all the steepness indices equal to one.

3.B.1.2. A typical example of a non-steep function, which is due to Nekhoroshev, is $h(I_1, I_2) = I_1^2 - I_2^2$, and it is not exponentially stable: for the perturbation $h_\varepsilon(\theta, I) = I_1^2 - I_2^2 + \varepsilon \sin(\theta_1 + \theta_2)$, any solution with $I_1(0) = I_2(0)$ has a fast drift, that is a drift of order one on a time scale of order $\varepsilon^{-1}$ (this is obviously the fastest drift possible). But adding a third order term in the previous example (for example $I_2^3$) we recover steepness, and this is in fact a general phenomenon. Indeed, non-steep functions have infinite codimension among smooth functions, or more precisely, if $J_r(n)$ is the space of r-jets of $C^\infty$ functions on an open set of $\mathbb{R}^n$, then Nekhoroshev proved in [Nek79] that the set of r-jets of non-steep functions is an algebraic subset of $J_r(n)$ whose codimension goes to infinity as r goes to infinity. In this sense, steep functions are "generic". However, for n ≥ 3, a quadratic Hamiltonian is steep only if it is sign definite, which is a strong assumption, and more generally a polynomial is generically steep only if its degree is sufficiently high (of order $n^2$ if n is the number of degrees of freedom). Hence polynomials of lower degree are generically non-steep (see [LM88]). This is clearly a shortcoming, and we will see at the end of the next section the advantage of our genericity condition.
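The fast drift in this example can be observed numerically. The sketch below is an illustration under stated assumptions, not part of the original text: it assumes that the perturbation depends on the angles as ε sin(θ1 + θ2), uses the convention dθ/dt = ∂H/∂I, dI/dt = −∂H/∂θ, and integrates with a hand-written fourth-order Runge-Kutta scheme. All function names are hypothetical.

```python
import math

def vector_field(state, eps):
    """Hamilton's equations for H = I1^2 - I2^2 + eps*sin(theta1 + theta2)."""
    th1, th2, i1, i2 = state
    force = -eps * math.cos(th1 + th2)   # dI1/dt = dI2/dt = -dH/dtheta
    return (2.0 * i1, -2.0 * i2, force, force)

def rk4_step(state, eps, dt):
    """One classical Runge-Kutta step of size dt."""
    def shifted(k, scale):
        return tuple(s + scale * dt * ki for s, ki in zip(state, k))
    k1 = vector_field(state, eps)
    k2 = vector_field(shifted(k1, 0.5), eps)
    k3 = vector_field(shifted(k2, 0.5), eps)
    k4 = vector_field(shifted(k3, 1.0), eps)
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

eps = 1e-3
steps = 100_000
dt = (1.0 / eps) / steps            # integrate up to time 1/eps
state = start = (0.0, 0.0, 1.0, 1.0)  # initial condition with I1(0) = I2(0)
for _ in range(steps):
    state = rk4_step(state, eps, dt)
drift = abs(state[2] - start[2])
print(drift)
```

On the line I1 = I2 the combination θ1 + θ2 is constant, so the actions move at the constant speed ε cos(θ1(0) + θ2(0)); with the initial angles chosen here the drift over the time 1/ε is of order one, as stated above.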

3.B.1.3. Steepness (or S-steepness) is a sufficient condition to ensure exponential stability, but this is not necessary, as was first noticed by Morbidelli and Guzzo (see [MG96]). They considered the Hamiltonian $h(I_1, I_2) = I_1^2 - \alpha I_2^2$, which is non-steep for



any value of α > 0, and noticed that a "fast drift" is not possible if √ α is "strongly"<br />

irrational. Therefore a Diophantine condition on √ α should ensure exponential stability.<br />

Such considerations were then generalized by Niederman who introduced the class<br />

of "Diophantine Morse" functions and who proved that they are exponentially stable<br />

([Nie07b]). The only difference between these functions and the "Simultaneous Diophantine Morse" functions we use in this chapter is that Diophantine Morse functions consider subspaces in $G_L(n, k)$, which are generated by integer vectors of length bounded by L, while here we are looking at subspaces in $G^L(n, k)$ where the latter condition is imposed on the orthogonal complement. This reflects the difference between the methods of proof: in ([Nie07b]) the analytic part was based on classical small divisors techniques

(that is linear Diophantine approximation) and therefore required an adapted geometric<br />

assumption, while here we simply rely on the most basic theorem of simultaneous Diophantine<br />

approximation (and this explains the name Simultaneous Diophantine Morse<br />

functions).<br />

3.B.1.4. In both cases, the use of such a class of functions has two advantages. The<br />

first one is that these functions are generic in a much clearer sense than steep

functions, and this will be explained in the next section. The second advantage is that<br />

they are in some sense more general than the usual steep functions, since we only have<br />

to consider curves in some specific affine subspaces. This is explained in the proposition<br />

below.<br />

Proposition 3.7. Let $h \in SDM^{\tau}_{\gamma}(B)$, assume that $|h|_{C^3(B)} < M$ and take $r < 1$. Then for any affine subspace $\lambda \in GA^L_B(n, k)$ and any continuous curve $\Gamma$ from $[0, 1]$ to $\lambda \cap B$ with

\[
\|\Gamma(0) - \Gamma(1)\| = r < (2M)^{-1} \gamma L^{-\tau},
\]

there exists $t_* \in [0, 1]$ such that:

\[
\begin{cases}
\|\Gamma(t) - \Gamma(0)\| \le r, \quad t \in [0, t_*], \\
\|\Pi_\Lambda(\nabla h(\Gamma(t_*)))\| > \frac{1}{2} r^2,
\end{cases}
\]

where $\Pi_\Lambda$ is the projection onto $\Lambda$, the direction of $\lambda$.

Proof. It is enough to check that these properties are satisfied for a vector space $\Lambda \in G^L(n, k)$, since any affine subspace $\lambda \in GA^L_B(n, k)$ is of the form $\lambda = w + \Lambda$ with $\Lambda \in G^L(n, k)$ for some vector $w$. So consider a continuous curve $\Gamma$ from $[0, 1]$ to $\Lambda \cap B$ with length $r < 1$ satisfying

\[
\|\Gamma(0) - \Gamma(1)\| = r < (2M)^{-1} \gamma L^{-\tau}.
\]

We will denote by $(u(t), v)$ the coordinates of $\Gamma(t)$ for $t \in [0, 1]$ in a basis adapted to the orthogonal decomposition $\Lambda \oplus \Lambda^{\perp}$. Therefore

\[
\Pi_\Lambda(\nabla h(\Gamma(t))) = \partial_u h_\Lambda(u(t), v)
\]

for all $t \in [0, 1]$. We will distinguish two cases.

For the first one, we suppose that

\[
\|\partial_u h_\Lambda(u(0), v)\| > 2^{-1} r^2,
\]

so the conclusion trivially holds for $t_* = 0$.

For the second one, we have

\[
\|\partial_u h_\Lambda(u(0), v)\| \le 2^{-1} r^2, \tag{30}
\]

but since $r^2 < r < \gamma L^{-\tau}$, this gives

\[
\|\partial_u h_\Lambda(u(0), v)\| \le \gamma L^{-\tau}.
\]

Now $h \in SDM^{\tau}_{\gamma}(B)$, so we can apply the definition at the point $(u(0), v)$, and for any $\eta \in \mathbb{R}^k \setminus \{0\}$ we obtain

\[
\|\partial_{uu} h_\Lambda(u(0), v).\eta\| > \gamma L^{-\tau} \|\eta\|. \tag{31}
\]

Take any $\tilde{u}$ such that $\|\tilde{u} - u(0)\| < (2M)^{-1} \gamma L^{-\tau}$. We can apply Taylor formula with integral remainder to obtain

\[
\partial_u h_\Lambda(\tilde{u}, v) - \partial_u h_\Lambda(u(0), v) = \int_0^1 \partial_{uu} h_\Lambda(u(0) + t(\tilde{u} - u(0)), v).(\tilde{u} - u(0))\, dt.
\]

Now since $M$ bounds the third derivative of $h$, we have

\[
\|\partial_{uu} h_\Lambda(u(0) + t(\tilde{u} - u(0)), v) - \partial_{uu} h_\Lambda(u(0), v)\| \le M t \|\tilde{u} - u(0)\| \le 2^{-1} \gamma L^{-\tau} t,
\]

and this yields

\[
\|\partial_u h_\Lambda(\tilde{u}, v) - \partial_u h_\Lambda(u(0), v)\| \ge \|\partial_{uu} h_\Lambda(u(0), v).(\tilde{u} - u(0))\| - 2^{-1} \gamma L^{-\tau} \int_0^1 t \|\tilde{u} - u(0)\|\, dt,
\]

which in turn, using (31) with $\eta = \tilde{u} - u(0)$, gives

\[
\|\partial_u h_\Lambda(\tilde{u}, v) - \partial_u h_\Lambda(u(0), v)\| \ge \left( \gamma L^{-\tau} - 2^{-1} \gamma L^{-\tau} \int_0^1 t\, dt \right) \|\tilde{u} - u(0)\| \ge 2^{-1} \gamma L^{-\tau} \|\tilde{u} - u(0)\|. \tag{32}
\]

Now we define

\[
t_* = \inf_{t \in [0,1]} \{\|\Gamma(t) - \Gamma(0)\| = r\},
\]

so trivially we have

\[
\|\Gamma(t) - \Gamma(0)\| \le r, \quad t \in [0, t_*].
\]

Furthermore, we have

\[
\|\partial_u h_\Lambda(u(t_*), v)\| \ge \|\partial_u h_\Lambda(u(t_*), v) - \partial_u h_\Lambda(u(0), v)\| - \|\partial_u h_\Lambda(u(0), v)\|,
\]

and so using (30), (32) and recalling that $\|u(t_*) - u(0)\| = r$ and $\gamma L^{-\tau} > 2r$ we obtain

\[
\|\partial_u h_\Lambda(u(t_*), v)\| \ge 2^{-1} \gamma L^{-\tau} r - 2^{-1} r^2 > r^2 - 2^{-1} r^2 = 2^{-1} r^2,
\]

and this is the desired estimate.
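For convenience, the two elementary computations used in this proof can be spelled out as follows. This is only a restatement of the steps leading to (32) and to the final estimate, using $\int_0^1 t\, dt = 1/2$ and $\gamma L^{-\tau} > 2r$.

```latex
\[
\gamma L^{-\tau} - 2^{-1}\gamma L^{-\tau}\int_0^1 t\,dt
  = \Bigl(1 - \tfrac{1}{4}\Bigr)\gamma L^{-\tau}
  = \tfrac{3}{4}\,\gamma L^{-\tau}
  \;\ge\; 2^{-1}\gamma L^{-\tau},
\qquad
2^{-1}\gamma L^{-\tau}\, r \;>\; 2^{-1}(2r)\, r \;=\; r^{2}.
\]
```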



3.B.2 Prevalence<br />

3.B.2.1. Here we will prove our results of genericity concerning SDM functions, that is<br />

Theorem 3.4 and Corollary 3.5. Our main tool is the following lemma, which is proved<br />

in [Nie07b] and relies on the quantitative Morse-Sard theory developed by Yomdin (see<br />

[YC04] and [Yom83]).<br />

Lemma 3.20. Let $\kappa \in\, ]0, 1[$ and $g \in C^{2n+1}(B, \mathbb{R}^k)$. There exist a positive constant $c_k$ and a subset $C_\kappa \subseteq \mathbb{R}^k$ such that

\[
\lambda_k(C_\kappa) \le c_k \sqrt{\kappa},
\]

and for any $\zeta \notin C_\kappa$, the function $g_\zeta$ defined by $g_\zeta(x) = g(x) - \zeta$ satisfies the following: for any $x \in B$,

\[
\|g_\zeta(x)\| \le \kappa \implies \|dg_\zeta(x).\nu\| > \kappa \|\nu\|,
\]

for any $\nu \in \mathbb{R}^n \setminus \{0\}$.

In the above statement, the set Cκ is a "nearly-critical set" for the function g.<br />
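As an illustration only, the following Python sketch computes a candidate nearly-critical set in the simplest case $k = n = 1$. There the implication of Lemma 3.20 fails exactly for those $\zeta$ such that some $x$ satisfies $|g(x) - \zeta| \le \kappa$ and $|g'(x)| \le \kappa$. The toy function $g(x) = x^2$ on $[-1, 1]$, the grid sampling and all names are our assumptions; the lemma itself only asserts the existence of $C_\kappa$.

```python
def nearly_critical_set(g, dg, kappa, xs):
    """Merged intervals covering the values zeta for which some sample x
    satisfies |g(x) - zeta| <= kappa and |dg(x)| <= kappa."""
    intervals = sorted(
        (g(x) - kappa, g(x) + kappa) for x in xs if abs(dg(x)) <= kappa
    )
    merged = []
    for lo, hi in intervals:
        if merged and lo <= merged[-1][1]:
            # Overlapping intervals are fused into one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def measure(intervals):
    """Total length of a list of disjoint intervals."""
    return sum(hi - lo for lo, hi in intervals)


def toy_example(kappa=0.01, samples=200001):
    """Nearly-critical values of g(x) = x^2 sampled on [-1, 1]."""
    xs = [-1.0 + 2.0 * i / (samples - 1) for i in range(samples)]
    return nearly_critical_set(lambda x: x * x, lambda x: 2.0 * x, kappa, xs)


if __name__ == "__main__":
    kappa = 0.01
    print(measure(toy_example(kappa)), kappa ** 0.5)
```

In this toy case the measure is close to $2\kappa + \kappa^2/4$, which is well below $\sqrt{\kappa}$ for small $\kappa$.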

3.B.2.2. Let us prove Theorem 3.4.<br />

Proof of Theorem 3.4. Recall that we are given a function $h \in C^{2n+2}(B)$. The proof is divided in two steps: first, we will describe the set of parameters $\xi \in \mathbb{R}^n$ for which the function $h_\xi$, defined by $h_\xi(I) = h(I) - \xi.I$, is not in $SDM^{\tau}(B)$, and then, in a second step, we will show that this set has zero Lebesgue measure, for $\tau > 2(n^2 + 1)$. In the sequel, given $k \in \{1, \dots, n\}$, we denote by $\lambda_k$ the Lebesgue measure of $\mathbb{R}^k$.

First step. Given an element $\Lambda \in G^L(n, k)$, let $\Pi_\Lambda$ be the projection onto this subspace and consider the associated function $h_\Lambda$ (recall that $h_\Lambda$ is just the function $h$ written in coordinates adapted to the orthogonal decomposition $\Lambda \oplus \Lambda^{\perp}$). Let us define the function

\[
g = \partial_u h_\Lambda,
\]

which belongs to $C^{2n+1}(B, \mathbb{R}^k)$, and apply to this function Lemma 3.20 with the value $\kappa = \gamma L^{-\tau}$. We find a "nearly-critical" set $C_\kappa = C_{\gamma,\tau,L} \subseteq \mathbb{R}^k$ with the measure estimate

\[
\lambda_k(C_{\gamma,\tau,L}) \le c_k \gamma^{\frac{1}{2}} L^{-\frac{\tau}{2}}, \tag{33}
\]

such that for any $\zeta \notin C_{\gamma,\tau,L}$ and any $(u, v) \in B$,

\[
\|g_\zeta(u, v)\| \le \kappa \implies \|dg_\zeta(u, v).\nu\| > \kappa \|\nu\|, \tag{34}
\]

for any $\nu \in \mathbb{R}^n \setminus \{0\}$.

Now take any $\zeta \notin C_{\gamma,\tau,L}$, any $\xi \in \Pi_\Lambda^{-1}(\zeta)$ and consider the modified function $h_\xi$ as well as its version $h_{\xi,\Lambda}$. Since

\[
\partial_u h_{\xi,\Lambda} = \partial_u h_\Lambda - \zeta = g - \zeta = g_\zeta,
\]

and $\partial_{uu} h_{\xi,\Lambda} = \partial_{uu} h_\Lambda$ is just some restriction of $dg$, the estimate (34) gives for any $(u, v) \in B$,

\[
\|\partial_u h_{\xi,\Lambda}(u, v)\| \le \gamma L^{-\tau} \implies \|\partial_{uu} h_{\xi,\Lambda}(u, v).\eta\| > \gamma L^{-\tau} \|\eta\| \tag{35}
\]


for any $\eta \in \mathbb{R}^k \setminus \{0\}$. So let $C_{\gamma,\tau,L,\Lambda} = \Pi_\Lambda^{-1}(C_{\gamma,\tau,L})$, and define

\[
C_{\gamma,\tau} = \bigcup_{L \in \mathbb{N}^*} \; \bigcup_{k \in \{1,\dots,n\}} \; \bigcup_{\Lambda \in G^L(n,k)} C_{\gamma,\tau,L,\Lambda}.
\]

As a consequence of the estimate (35), the function $h_\xi \in SDM^{\tau}_{\gamma}(B)$ provided that $\xi \notin C_{\gamma,\tau}$, hence $h_\xi \in SDM^{\tau}(B)$ provided that $\xi \notin C_\tau$, where

\[
C_\tau = \bigcap_{\gamma > 0} C_{\gamma,\tau}.
\]

Second step. It remains to prove that $C_\tau$ has zero Lebesgue measure under our assumption that $\tau > 2(n^2 + 1)$. For an integer $m \in \mathbb{N}^*$, we define $C^m_{\gamma,\tau,L,\Lambda}$ (resp. $C^m_{\gamma,\tau}$ and $C^m_\tau$) as the intersection of $C_{\gamma,\tau,L,\Lambda}$ (resp. $C_{\gamma,\tau}$ and $C_\tau$) with the ball of $\mathbb{R}^n$ of radius $m$ centered at the origin. As a consequence of (33) and Fubini-Tonelli theorem, one has

\[
\lambda_n(C^m_{\gamma,\tau,L,\Lambda}) \le V_{n,m}\, c_k\, \gamma^{\frac{1}{2}} L^{-\frac{\tau}{2}},
\]

where $V_{n,m} = m^n \pi^{n/2} \Gamma(n/2 + 1)^{-1}$ is the volume of the ball of $\mathbb{R}^n$ of radius $m$ centered at the origin. Therefore

\[
\lambda_n \Biggl( \bigcup_{\Lambda \in G^L(n,k)} C^m_{\gamma,\tau,L,\Lambda} \Biggr) \le |G^L(n, k)|\, V_{n,m}\, c_k\, L^{-\frac{\tau}{2}} \gamma^{\frac{1}{2}},
\]

with $|G^L(n, k)|$ the cardinal of $G^L(n, k)$. But obviously $|G^L(n, k)| \le L^{n^2}$ and so

\[
\lambda_n \Biggl( \bigcup_{\Lambda \in G^L(n,k)} C^m_{\gamma,\tau,L,\Lambda} \Biggr) \le V_{n,m}\, c_k\, L^{n^2 - \frac{\tau}{2}} \gamma^{\frac{1}{2}}.
\]

Now

\[
\lambda_n \Biggl( \bigcup_{k \in \{1,\dots,n\}} \; \bigcup_{\Lambda \in G^L(n,k)} C^m_{\gamma,\tau,L,\Lambda} \Biggr) \le V_{n,m} \Biggl( \sum_{k=1}^{n} c_k \Biggr) L^{n^2 - \frac{\tau}{2}} \gamma^{\frac{1}{2}},
\]

and hence

\[
\lambda_n(C^m_{\gamma,\tau}) \le V_{n,m} \Biggl( \sum_{k=1}^{n} c_k \Biggr) \Biggl( \sum_{L=1}^{+\infty} L^{n^2 - \frac{\tau}{2}} \Biggr) \gamma^{\frac{1}{2}},
\]

where the sum in the right-hand side of the last estimate is finite since we are assuming $\tau > 2(n^2 + 1)$. This shows that

\[
\lambda_n(C^m_\tau) = \inf_{\gamma > 0} \lambda_n(C^m_{\gamma,\tau}) = 0,
\]

and as $C_\tau = \bigcup_{m \ge 1} C^m_\tau$ we finally obtain

\[
\lambda_n(C_\tau) = 0,
\]

and this concludes the proof.
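The role of the threshold $\tau > 2(n^2 + 1)$ in the second step can be checked with a small Python sketch (the function names are ours). The series $\sum_{L \ge 1} L^{n^2 - \tau/2}$ is a Riemann series, so it converges exactly when its exponent $n^2 - \tau/2$ is strictly smaller than $-1$.

```python
def tail_exponent(n, tau):
    """Exponent of L in the series sum_{L>=1} L^(n^2 - tau/2)."""
    return n * n - tau / 2.0


def series_converges(n, tau):
    """A Riemann series sum L^e converges if and only if e < -1,
    which here is equivalent to tau > 2 * (n^2 + 1)."""
    return tail_exponent(n, tau) < -1.0


def partial_sum(n, tau, terms):
    """Partial sum of the series, for numerical illustration."""
    e = tail_exponent(n, tau)
    return sum(L ** e for L in range(1, terms + 1))


if __name__ == "__main__":
    n = 2
    for tau in (10.0, 10.5, 12.0):
        print(tau, series_converges(n, tau), partial_sum(n, tau, 1000))
```

For $n = 2$ the borderline value is $\tau = 10$, where the series reduces to the divergent harmonic series.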



3.B.2.3. As we mentioned in the introduction, there is a notion of genericity in infinite<br />

dimensional vector spaces called prevalence, first introduced in a different setting by<br />

Christensen ([Chr73]) and rediscovered by Hunt, Sauer and Yorke ([HSY92], see also<br />

[KH08] and [OY05]).<br />

Definition 3.21. Let E be a completely metrizable topological vector space. A Borel<br />

subset S ⊆ E is said to be shy if there exists a Borel measure µ on E, with 0 < µ(C) < ∞<br />

for some compact set C ⊆ E, and such that µ(x + S) = 0 for all x ∈ E.<br />

An arbitrary set is called shy if it is contained in a shy Borel subset, and finally the<br />

complement of a shy set is called prevalent.<br />

The following "genericity" properties are easy to check ([OY05]): a prevalent set is<br />

dense, a set containing a prevalent set is also prevalent, and prevalent sets are stable<br />

under translation and countable intersection.<br />

Furthermore, we have an easy but useful criterion for a set to be prevalent.<br />

Proposition 3.8 ([HSY92]). Let $S$ be a subset of $E$. Suppose there exists a finite-dimensional subspace $F$ of $E$ such that $x + S$ has full $\lambda_F$-measure for all $x \in E$. Then

S is prevalent.<br />

3.B.2.4. Now we can prove our Corollary 3.5.<br />

Proof of Corollary 3.5. Let $E = C^{2n+2}(B)$, $S = SDM^{\tau}(B)$ for $\tau > 2(n^2 + 1)$ and $F$ the space of linear forms of $\mathbb{R}^n$ restricted to $B$. Then $F$ is a linear subspace of $C^{2n+2}(B)$ of

dimension n, and the conclusion follows immediately from Theorem 3.4 and the above<br />

Proposition 3.8.<br />
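To spell out why Theorem 3.4 provides exactly the hypothesis of Proposition 3.8, one can identify $F$ with $\mathbb{R}^n$ through $\xi \mapsto (I \mapsto \xi.I)$. This identification is our way of writing the argument; with it, the first step of the proof of Theorem 3.4 shows that the exceptional parameters lie in $C_\tau$, and the second step shows that $C_\tau$ is Lebesgue negligible:

```latex
\[
F = \{\, I \mapsto \xi.I \;:\; \xi \in \mathbb{R}^n \,\} \simeq \mathbb{R}^n,
\qquad
\lambda_n\bigl(\{\xi \in \mathbb{R}^n : h - \xi.I \notin SDM^{\tau}(B)\}\bigr)
  \;\le\; \lambda_n(C_\tau) = 0
\quad \text{for every } h \in C^{2n+2}(B).
\]
```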

3.B.2.5. To conclude, let us compare our generic condition with the usual steepness<br />

property. First, our condition is prevalent in the space $C^k(B)$, with $k \ge 2n+2$, and this is not true for steep functions. But more importantly, as prevalence is nothing but "full Lebesgue measure" in finite dimension, given any non-zero integers $m$ and $n$, Lebesgue almost every polynomial Hamiltonian $h_m$ of degree $m$ with $n$ degrees of freedom is SDM, but not steep unless $m$ is of order $n^2$. This remark turns out to be very useful when

studying the stability of invariant tori under generic conditions (see the next chapter).



4 Generic super-exponential stability for invariant tori

In this chapter, we consider solutions starting close to some linearly stable invariant<br />

tori in an analytic Hamiltonian system and we prove results of stickiness, that is of<br />

stability for a super-exponentially long interval of time, under generic conditions. The<br />

proof combines classical Birkhoff normal forms and the new method to obtain generic<br />

Nekhoroshev’s estimates developed in the first chapter. We will mainly focus on the<br />

neighbourhood of elliptic fixed points, since with our approach the other cases are completely<br />

similar. This chapter is based on [Bou09b].<br />

4.1 Introduction and main results<br />

In this chapter, we are interested in the stability properties of some linearly stable<br />

invariant tori in analytic Hamiltonian systems. Let us begin by the case of elliptic fixed<br />

points.<br />

4.1.0.1. As the problem is local, it is enough to consider a Hamiltonian $H$ defined and analytic on an open neighbourhood of 0 in $\mathbb{R}^{2n}$, having the origin as a fixed point. Up to an irrelevant additive constant and expanding the Hamiltonian as a power series at the origin, we can write

\[
H(z) = H_2(z) + V(z),
\]

where $z$ is sufficiently close to 0 in $\mathbb{R}^{2n}$, $H_2$ is the quadratic part of $H$ at 0 and $V(z) = O(\|z\|^3)$. Recall that the fixed point is said to be elliptic if the spectrum of the linearized system is purely imaginary. Then it has the form $\{\pm i\alpha_1, \dots, \pm i\alpha_n\}$, for some vector $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{R}^n$ which is called the normal (or characteristic) frequency. Due to the symplectic character of the equations, such equilibria are the only linearly stable fixed points. Now we assume that the components of $\alpha$ are all distinct so that we can make a symplectic linear change of variables that diagonalizes the quadratic part:

\[
H(z) = \sum_{i=1}^{n} \frac{\alpha_i}{2} (z_i^2 + z_{n+i}^2) + V(z) = \alpha.\tilde{I} + V(z),
\]

where $\tilde{I} = \tilde{I}(z)$ is the vector of "formal actions", that is

\[
\tilde{I}(z) = \frac{1}{2} (z_1^2 + z_{n+1}^2, \dots, z_n^2 + z_{2n}^2) \in \mathbb{R}^n.
\]

Assuming the components of α are all of the same sign, it is easy to see that H is a<br />

Lyapunov function so the fixed point is stable. But in the general case, one has to study<br />

the influence of the higher order terms V (z), and we will explain how it can be done<br />

using classical perturbation theory.<br />
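The definition of the formal actions can be illustrated with a short Python sketch (the function names are ours). It computes $\tilde{I}(z)$ from $z \in \mathbb{R}^{2n}$ and checks on an example that the diagonalized quadratic part equals $\alpha.\tilde{I}$.

```python
def formal_actions(z):
    """Formal actions I_i = (z_i^2 + z_{n+i}^2) / 2 for z in R^(2n)."""
    n = len(z) // 2
    return [0.5 * (z[i] ** 2 + z[n + i] ** 2) for i in range(n)]


def quadratic_part(z, alpha):
    """Diagonalized quadratic part sum_i alpha_i / 2 * (z_i^2 + z_{n+i}^2)."""
    n = len(alpha)
    return sum(alpha[i] / 2.0 * (z[i] ** 2 + z[n + i] ** 2) for i in range(n))


def dot(a, b):
    """Euclidean scalar product of two vectors of the same length."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    z = [1.0, 2.0, 3.0, 4.0]   # n = 2
    alpha = [1.0, -2.0]        # frequencies of mixed sign
    actions = formal_actions(z)
    print(actions, quadratic_part(z, alpha), dot(alpha, actions))
```

With frequencies of mixed sign, as in this example, the quadratic part is not sign definite, which is the situation where $H$ cannot be used directly as a Lyapunov function.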

4.1.0.2. Let us first note that, given a solution $z(t)$ of $H$, if $\tilde{I}(t) = \tilde{I}(z(t))$ then

\[
|\tilde{I}(t)|_1 = \sum_{i=1}^{n} |\tilde{I}_i(t)|
\]

is (up to a factor one-half) the square of the Euclidean distance of $z(t)$ to the origin, so that Lyapunov stability can be proved if $|\tilde{I}(t) - \tilde{I}(0)|_1$ does not vary much for all times.

Now in order to study the dynamics on a small neighbourhood of size $\rho$ around the origin in $\mathbb{R}^{2n}$, it is more convenient to change coordinates by performing the standard scalings

\[
z \longmapsto \rho z, \quad H \longmapsto \rho^{-2} H,
\]

to have a Hamiltonian defined on a fixed neighbourhood of zero in $\mathbb{R}^{2n}$. Then, by analyticity, we extend the resulting Hamiltonian to a holomorphic function on some complex neighbourhood of zero in $\mathbb{C}^{2n}$. So eventually we will consider the following setting: we define the Euclidean ball in $\mathbb{C}^{2n}$

\[
D_s = \{z \in \mathbb{C}^{2n} \mid \|z\| < s\}
\]

of radius $s$ around the origin, and if $\mathcal{A}_s$ is the space of holomorphic functions on $D_s$ which are real valued for real arguments, endowed with its usual supremum norm $|\,.\,|_s$, we consider

\[
\begin{cases}
H(z) = \alpha.\tilde{I} + f(z) \\
H \in \mathcal{A}_s, \quad |f|_s < \rho.
\end{cases}
\tag{A}
\]

Let us emphasize that the small parameter $\rho$, which was originally describing the size of the neighbourhood of 0, now describes the size of the "perturbation" $f$ on a neighbourhood of fixed size $s$. Without loss of generality, we may assume $s > 3$.
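As a brief worked computation, here is how the scalings turn the size of the neighbourhood into the size of the perturbation. It only uses the fact that $\tilde{I}$ is quadratic in $z$ and that $V(z) = O(\|z\|^3)$; the identification $f(z) = \rho^{-2} V(\rho z)$ is our reading of the construction.

```latex
\[
\tilde I(\rho z) = \rho^{2}\,\tilde I(z),
\qquad
\rho^{-2} H(\rho z) = \alpha.\tilde I(z) + \rho^{-2} V(\rho z),
\qquad
f(z) := \rho^{-2} V(\rho z) = O\bigl(\rho\,\|z\|^{3}\bigr).
\]
```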

4.1.0.3. Probably the main tool to investigate stability properties is the construction of<br />

normal forms using averaging methods, and in this case these are the so-called Birkhoff<br />

normal forms. For an integer $m \ge 1$, assuming $\alpha$ is non-resonant up to order $2m$, that is

\[
k.\alpha \neq 0, \quad k \in \mathbb{Z}^n, \quad 0 < |k|_1 \le 2m,
\]

there exists an analytic symplectic transformation $\Phi_m$ close to identity such that $H \circ \Phi_m$ is in Birkhoff normal form up to order $2m$, that is

\[
H \circ \Phi_m(z) = h_m(\tilde{I}) + f_m(z),
\]

where $h_m$ is a polynomial of degree at most $m$ in the $\tilde{I}$ variables, and the remainder $f_m$ is roughly of order $\rho^{2m-1}$ (since before the scaling $f_m(z)$ is of order $\|z\|^{2m+1}$, see [Bir66],

or [Dou88] for a more recent exposition). The polynomials hm are uniquely defined once<br />

α is fixed, and are usually called the Birkhoff invariants. Therefore the transformed<br />

Hamiltonian is the sum of an integrable part hm, for which the origin is trivially stable,<br />

since Ĩ(t) is constant for all times, and a smaller perturbation fm. Moreover, if α is<br />

non-resonant up to any order, we can even define a formal symplectic transformation $\Phi_\infty$ and a formal power series $h_\infty = \sum_{k \ge 1} h^k$, with $h_m = \sum_{k=1}^{m} h^k$, such that

\[
H \circ \Phi_\infty(z) = h_\infty(\tilde{I}).
\]

In general the series h∞ is divergent (this is a result of Siegel) and the convergence properties<br />

of the transformation Φ∞ are even more subtle (see [PM03]). However, Birkhoff<br />

normal forms at finite order are still very useful, not only because the "perturbation" fm<br />

is made smaller, but also because the "integrable" part hm, for m ≥ 2, is now non-linear<br />

and other classical techniques from perturbation theory can be used.
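The non-resonance condition up to order $2m$ is a finite condition, so it can be tested by brute force. The following Python sketch (function names and the tolerance parameter are ours) enumerates the integer vectors $k$ with $0 < |k|_1 \le 2m$ and reports those with $k.\alpha = 0$; for floating-point frequencies a small tolerance has to be supplied, and the answer is then only indicative.

```python
from itertools import product


def resonances_up_to(alpha, order, tol=0.0):
    """Integer vectors k with 0 < |k|_1 <= order and |k.alpha| <= tol."""
    n = len(alpha)
    found = []
    for k in product(range(-order, order + 1), repeat=n):
        weight = sum(abs(c) for c in k)
        if 0 < weight <= order:
            if abs(sum(c * a for c, a in zip(k, alpha))) <= tol:
                found.append(k)
    return found


def is_nonresonant(alpha, m, tol=0.0):
    """True if alpha has no resonance k.alpha = 0 with 0 < |k|_1 <= 2m."""
    return not resonances_up_to(alpha, 2 * m, tol)


if __name__ == "__main__":
    # alpha = (1, 2) carries the resonance 2*alpha_1 - alpha_2 = 0 of order 3.
    print(is_nonresonant((1, 2), 1), is_nonresonant((1, 2), 2))
    print(is_nonresonant((1.0, 2.0 ** 0.5), 2, tol=1e-12))
```

The cost grows like $(4m + 1)^n$, so this is only meant for small $n$ and $m$.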



4.1.0.4. First, in the case n = 2, a complete result of stability follows from KAM<br />

theory. Indeed, if the frequency $\alpha \in \mathbb{R}^2$ is non-resonant up to order 4, the Birkhoff normal form reads

\[
H(z) = \alpha.\tilde{I} + \beta\tilde{I}.\tilde{I} + f_2(z),
\]

with $\beta$ a symmetric matrix of size $n = 2$ and $f_2$ a small perturbation. This time we consider the non-linear part $h_2(\tilde{I}) = \alpha.\tilde{I} + \beta\tilde{I}.\tilde{I}$ as the integrable system, and if it is

isoenergetically non degenerate, the persistence of two-dimensional tori in each energy<br />

level close to the fixed point implies Lyapunov stability (see [AKN06], [Arn63b], or<br />

[Arn61] for other results).<br />

However, for n ≥ 3, it is believed that "generic" elliptic fixed points are unstable,<br />

although this is totally unclear for the moment (see [DLC83], [Dou88] and [KMV04]).<br />

Therefore, for n ≥ 3, stability results under general assumptions can only concern<br />

finite but hopefully long intervals of time, and this is the content of the chapter. More<br />

precisely, we will prove, under generic assumptions and provided ρ is sufficiently small,<br />

that for all initial conditions the variation $|\tilde{I}(t) - \tilde{I}(0)|_1$ is of order $\rho$ for $t \in T(\rho)$, where $T(\rho)$ is an interval of time of order $\exp(\exp(\rho^{-1}))$ (see Theorem 4.1 for a precise formulation). Such a phenomenon of super-stability is sometimes called "stickiness" ([PW94]).

The interpretation in the original coordinates is the following: if a solution starts in<br />

a sufficiently small neighbourhood of the origin, it stays in some larger neighbourhood<br />

during an interval of time which is super-exponentially long with respect to the inverse<br />

of the initial distance to the origin. But first, let us describe previously known results<br />

on exponential stability, where there were basically two strategies.<br />

4.1.0.5. In a first approach, one assumes a Diophantine condition on $\alpha$, that is there exist $\gamma > 0$ and $\tau > n - 1$ such that

\[
|k.\alpha| \ge \gamma |k|_1^{-\tau}, \quad k \in \mathbb{Z}^n \setminus \{0\},
\]

but no conditions on the Birkhoff invariants. From the point of view of perturbation<br />

theory, the linear part is considered as the integrable system. In particular, α is<br />

non-resonant up to any order, hence we can perform any finite number of Birkhoff normalizations,<br />

and since we have a control on the small divisors, we can precisely estimate<br />

the size of the remainder fm (in terms of γ and τ). The usual trick is then to optimize<br />

the choice of m as a function of ρ in order to obtain an exponentially small remainder<br />

with respect to $\rho^{-1}$. Therefore the exponential stability is immediately read from the

normal form, and this requires only an assumption on the linear part (see [GDF+89] or

[DG96b]). The above Diophantine condition has full Lebesgue measure. However, as we<br />

will see later, the threshold of the perturbation and the constants of stability are very<br />

sensitive to the Diophantine properties of α, in particular the small parameter γ.<br />
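To get a feeling for the constant $\gamma$, one can estimate it numerically on a finite range of integer vectors. The Python sketch below (the function name, the cutoff and the choice of the golden mean are our assumptions) computes the minimum of $|k.\alpha|\,|k|_1^{\tau}$ over $0 < |k|_1 \le K$. This gives an upper bound on any admissible $\gamma$; it does not prove that $\alpha$ is Diophantine.

```python
from itertools import product


def diophantine_constant(alpha, tau, cutoff):
    """Minimum of |k.alpha| * |k|_1^tau over integer k with 0 < |k|_1 <= cutoff.

    Any gamma in the Diophantine condition must be at most this value."""
    n = len(alpha)
    best = float("inf")
    for k in product(range(-cutoff, cutoff + 1), repeat=n):
        norm1 = sum(abs(c) for c in k)
        if norm1 == 0 or norm1 > cutoff:
            continue
        value = abs(sum(c * a for c, a in zip(k, alpha))) * norm1 ** tau
        best = min(best, value)
    return best


if __name__ == "__main__":
    golden = (1.0 + 5.0 ** 0.5) / 2.0
    # n = 2, so the exponent must satisfy tau > n - 1 = 1.
    print(diophantine_constant((1.0, golden), tau=1.5, cutoff=60))
    # A nearly resonant frequency yields a much smaller constant.
    print(diophantine_constant((1.0, 2.0001), tau=1.5, cutoff=60))
```

Comparing the two printed values shows how a near-resonance degrades $\gamma$, which is the sensitivity mentioned above.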

4.1.0.6. The second approach is fundamentally different, and it does not rely on the<br />

arithmetic properties of α. Here, one just assumes that α is non-resonant up to order<br />

4, so that the Hamiltonian reduces to

\[
H(z) = \alpha.\tilde{I} + \beta\tilde{I}.\tilde{I} + f_2(z).
\]

In this case $h_2(\tilde{I}) = \alpha.\tilde{I} + \beta\tilde{I}.\tilde{I}$ is considered as the integrable system ($\beta$ being a symmetric matrix of size $n$). Now we suppose that the non-linear part is convex, which is equivalent to $\beta$ being sign definite. Under those assumptions, it was predicted and

partially proved by Lochak ([Loc92] and [Loc95]), and completely proved independently<br />

by Niederman ([Nie98]) and Fassò, Guzzo and Benettin ([FGB98] and [GFB98]) that<br />

exponential stability holds. Their proofs are based on the implementation of Nekhoroshev’s<br />

estimates in Cartesian coordinates, but they are radically different: the first one<br />

uses Lochak’s method of periodic averaging and simultaneous Diophantine approximations,<br />

while the second one is based on Nekhoroshev’s original mechanism. The proof<br />

of Niederman was later clarified by Pöschel ([Pös99b]). However, the method of Lochak<br />

was restricted to the convex case, and it was not clear how to remove this hypothesis to

have a result valid in a more general context.<br />
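The convexity assumption on $\beta$ can be tested exactly for matrices with rational entries. The following Python sketch (function names are ours) uses Gaussian elimination without pivoting: a symmetric matrix is positive definite if and only if all the pivots are positive, and it is sign definite if it or its opposite is positive definite.

```python
from fractions import Fraction


def is_positive_definite(matrix):
    """Exact test for a symmetric matrix: all elimination pivots are positive."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    for i in range(n):
        pivot = a[i][i]
        if pivot <= 0:
            return False
        for r in range(i + 1, n):
            factor = a[r][i] / pivot
            for c in range(i, n):
                a[r][c] -= factor * a[i][c]
    return True


def is_sign_definite(beta):
    """True if the symmetric matrix beta is positive or negative definite."""
    negated = [[-x for x in row] for row in beta]
    return is_positive_definite(beta) or is_positive_definite(negated)


if __name__ == "__main__":
    print(is_sign_definite([[2, 1], [1, 2]]))    # convex case
    print(is_sign_definite([[1, 0], [0, -1]]))   # indefinite case
```

The indefinite example corresponds to the quadratic form $I_1^2 - I_2^2$, which is not covered by the convex theory.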

4.1.0.7. Here, using the method we introduced in the first chapter, we are able to<br />

replace the convexity condition by a generic assumption. Then, combining both Birkhoff<br />

theory and Nekhoroshev theory as in [MG95], we will obtain the following result.<br />

Theorem 4.1. Suppose $H$ is as in (A), with $\alpha$ non-resonant up to any order. Then under a generic condition (G) on $h_\infty$, there exist positive constants $a$, $a'$, $c_1$, $c_2$ and $\rho_0$ such that for $\rho \le \rho_0$, every solution $z(t)$ of $H$ with $|\tilde{I}(0)|_1 < 1$ satisfies

\[
|\tilde{I}(t) - \tilde{I}(0)|_1 < c_1 \rho, \quad |t| < \exp\left( \rho^{-a'} \exp(c_2 a' \rho^{-a}) \right).
\]

Denoting $h_\infty = \sum_{k \ge 1} h^k$ and $h_m = \sum_{k=1}^{m} h^k$, let us explain our generic condition (G) on the formal power series $h_\infty$. In fact

\[
(G) = \bigcap_{m \in \mathbb{N}^*} (G_m)
\]

consists in countably many conditions, where $(G_m)$ is a condition on $h_m$. The first condition $(G_1)$ requires that $h_1(I) = \alpha.I$ with a $(\gamma, \tau)$-Diophantine vector $\alpha$. The other conditions $(G_m)$, for $m \ge 2$, are that each polynomial function $h_m$ belongs to a special class of functions called $SDM^{\tau'}_{\gamma'}$ which was introduced in the first chapter (SDM stands for "Simultaneous Diophantine Morse" functions, see Appendix 4.A for a definition). In this appendix we will show that each condition $(G_m)$ is of full Lebesgue measure in the finite dimensional space of polynomials of degree $m$ with $n$ variables, assuming $\tau$ and $\tau'$ are large enough. This is well-known for $m = 1$, it will be elementary for $m = 2$ (see Theorem 4.11) but for $m > 2$ it requires the quantitative Morse-Sard theory of Yomdin ([Yom83], [YC04], see Proposition 4.3 in the appendix). Let us point out that this would not have been possible if we had assumed $h_m$, for $m \ge 2$, to be steep in the sense of Nekhoroshev, as polynomials are generically steep only if their degrees are sufficiently large with respect to the number of degrees of freedom (see [LM88]).

Our condition (G) on the formal series h∞ is therefore of "full Lebesgue measure<br />

at any order". From an abstract point of view, this condition defines a prevalent set<br />

in the space of formal power series, where prevalence is an analog of the notion of full<br />

Lebesgue measure in the context of infinite dimensional vector spaces. This will be<br />

proved in Appendix 4.A, Theorem 4.9. In Theorem 4.1, we can choose the exponents

\[
a = (1 + \tau)^{-1}, \quad a' = 3^{-1} (2(n + 1)\tau')^{-n},
\]

and our threshold $\rho_0$ depends in particular on $\gamma$ and $\gamma'$. Moreover our constants $c_1$ and $c_2$ also depend on $\gamma$ but not on $\gamma'$, and we shall be a little more precise later on.
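The exponents of Theorem 4.1 are explicit, so their size is easy to tabulate. The Python sketch below (function names are ours) computes $a$ and $a'$ exactly and evaluates the logarithm of the stability time $\rho^{-a'} \exp(c_2 a' \rho^{-a})$. The constant $c_2$ is not made explicit at this point of the text, so the default value $c_2 = 1$ is purely a placeholder.

```python
import math
from fractions import Fraction


def stability_exponents(n, tau, tau_prime):
    """Exponents a = 1 / (1 + tau) and a' = (2 (n + 1) tau')^(-n) / 3."""
    a = 1 / (1 + Fraction(tau))
    a_prime = Fraction(1, 3) / (2 * (n + 1) * Fraction(tau_prime)) ** n
    return a, a_prime


def log_stability_time(rho, a, a_prime, c2=1.0):
    """Logarithm of exp(rho^(-a') * exp(c2 * a' * rho^(-a))).

    c2 is a placeholder value, not a constant taken from the text."""
    a, a_prime = float(a), float(a_prime)
    return rho ** (-a_prime) * math.exp(c2 * a_prime * rho ** (-a))


if __name__ == "__main__":
    a, a_prime = stability_exponents(n=2, tau=2, tau_prime=6)
    print(a, a_prime)
    print(log_stability_time(1e-6, a, a_prime))
```

The value of $a'$ decreases very quickly with $n$, which shows how small $\rho$ must be before the double exponential becomes visible.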



As we have already explained, the proof is based on a combination of Birkhoff normalizations<br />

up to an exponentially small remainder, which are well-known (a statement<br />

is recalled in Proposition 4.1 below), and Nekhoroshev’s estimates for a generic integrable<br />

Hamiltonian near an elliptic fixed point (Theorem 4.3 below). The latter result<br />

is new, and it will follow rather easily from the new approach of Nekhoroshev theory in<br />

a generic case taken in the first chapter.<br />

4.1.0.8. As a direct consequence of our Nekhoroshev’s estimates near an elliptic fixed<br />

point, we can derive an exponential stability result more general than those obtained in<br />

[FGB98] and [Nie98]. Like in those papers, we only require α to be non-resonant up to<br />

order 4, and after the scalings<br />

\[
z \longmapsto \rho z, \quad H \longmapsto \rho^{-4} H, \quad \alpha \longmapsto \rho^{2} \alpha,
\]

we consider

\[
\begin{cases}
H(z) = \alpha.\tilde{I} + \beta\tilde{I}.\tilde{I} + f(z) \\
H \in \mathcal{A}_s, \quad |f|_s < \rho.
\end{cases}
\tag{B}
\]

However, instead of assuming that $\beta$ is sign definite, our result applies to Lebesgue almost all symmetric matrices $\beta$ without any condition on $\alpha$. Let $S_n(\mathbb{R})$ be the space of symmetric matrices of size $n$ with real entries.

Theorem 4.2. Suppose $H$ is as in (B). For Lebesgue almost all $\beta \in S_n(\mathbb{R})$, there exist positive constants $a'$, $b'$ and $\rho_0$ such that, for $\rho \le \rho_0$, every solution $z(t)$ of $H$ with $|\tilde{I}(0)|_1 < 1$ satisfies

\[
|\tilde{I}(t) - \tilde{I}(0)|_1 < n(n^2 + 1)\rho^{b'}, \quad |t| < \exp(\rho^{-a'}).
\]

The above theorem is a direct consequence of Theorem 4.3 below, provided that $h_2(\tilde{I}) = \alpha.\tilde{I} + \beta\tilde{I}.\tilde{I}$ belongs to $SDM^{\tau'}_{\gamma'}$. But we will prove in Appendix 4.A that this happens for almost all symmetric matrices $\beta$, independently of $\alpha$ (see Theorem 4.11). Once again, let us also mention that this result is not possible in the steep case, as the quadratic part $h^2(\tilde{I}) = \beta\tilde{I}.\tilde{I}$ is steep if and only if $\beta$ is sign definite. In the above statement one can choose the exponents

\[
a' = b' = 3^{-1} (2(n + 1)\tau')^{-n},
\]

and the threshold $\rho_0$ depends on $\gamma'$.

4.1.0.9. Let us add that in order to avoid useless expressions, we will only keep track of the small parameters $\rho$, $\gamma$ and $\gamma'$ and replace any other positive constants by a dot ($\cdot$) when it is convenient.

Moreover, in this text we shall use various norms for vectors $v \in \mathbb{R}^n$ or $v \in \mathbb{C}^n$: $|\,.\,|$ will be the supremum norm, $|\,.\,|_1$ the $\ell_1$-norm and $\|\,.\,\|$ the Euclidean (or Hermitian) norm.

4.1.0.10. Let us now describe the plan of this chapter. Section 4.2 is devoted to the proof of Theorem 4.1 and Theorem 4.2. In 4.2.1, we give a statement of the Birkhoff normal form up to an exponentially small remainder. In 4.2.2, we will explain how Nekhoroshev's estimates obtained in the first chapter generalize in the neighbourhood of elliptic fixed points, and how they imply Theorem 4.2. In 4.2.3, we will show how

Theorem 4.1 follows from a simple combination of Birkhoff’s estimates and Nekhoroshev’s<br />

estimates, provided our assumption on h∞ is satisfied. Then, in section 4.3, we<br />

will state similar results for invariant Lagrangian tori and more generally for invariant<br />

linearly stable isotropic reducible tori. Finally, an appendix is devoted to our genericity<br />

assumptions.<br />

4.2 Proof of Theorem 4.1 and Theorem 4.2<br />

In the sequel, we recall that we will use the "formal" actions

\[
\tilde{I} = \tilde{I}(z) = \frac{1}{2} (z_1^2 + z_{n+1}^2, \dots, z_n^2 + z_{2n}^2) \in \mathbb{R}^n,
\]

but one has to remember that these are nothing but notations for expressions in $z \in \mathbb{R}^{2n}$. Moreover, we will also need to use complex coordinates for the normal forms, and, abusing notations, we will also denote them by $z \in \mathbb{C}^{2n}$, but of course the solutions we consider are real.

4.2.1 Birkhoff’s estimates<br />

Here we consider a Hamiltonian as in (A), and we assume that the vector α is (γ, τ)-<br />

Diophantine. In this context, the following result is classical.<br />

Proposition 4.1. Under the previous assumptions, if ρ



4.2.2 Nekhoroshev’s estimates and proof of Theorem 4.2<br />

4.2.2.1. Here we consider the Hamiltonian

\[
\begin{cases}
H(z) = h(\tilde{I}) + f(z) \\
H \in \mathcal{A}_s, \quad h \in SDM^{\tau'}_{\gamma'}, \quad |f|_s < \varepsilon
\end{cases}
\tag{E}
\]

and we assume that, on the real part of the domain, the derivatives up to order 3 of $h$ are uniformly bounded by some constant $M > 1$. The definition of the set $SDM^{\tau'}_{\gamma'}$ is recalled in Appendix 4.A.

Theorem 4.3. Let $H$ be as in (E), with $\tau' \ge 2$ and $\gamma' \le 1$. Then there exists $\varepsilon_0$ such that if $\varepsilon \le \varepsilon_0$, for every solution $z(t)$ with $|\tilde{I}(0)| < 1$, we have

\[
|\tilde{I}(t) - \tilde{I}(0)| < (n^2 + 1)\varepsilon^{b'}, \quad |t| < \exp(\varepsilon^{-a'}),
\]

with the exponents $a' = b' = 3^{-1} (2(n + 1)\tau')^{-n}$.

Theorem 4.2 is now an immediate consequence of this result and Theorem 4.11 (see Appendix 4.A).

Proof of Theorem 4.2. From Theorem 4.11, we know that for almost all $\beta \in S_n(\mathbb{R})$, the Hamiltonian $h(\tilde{I}) = \alpha.\tilde{I} + \beta\tilde{I}.\tilde{I}$ belongs to $SDM^{\tau'}(B)$ with $\tau' > n^2 + 1$. So we can apply Theorem 4.3 (with $\varepsilon = \rho$): for every solution $z(t)$ with $|\tilde{I}(0)| < 1$, we have

\[
|\tilde{I}(t) - \tilde{I}(0)| < (n^2 + 1)\varepsilon^{b'}, \quad |t| < \exp(\varepsilon^{-a'}),
\]

with the exponents $a' = b' = 3^{-1} (2(n + 1)\tau')^{-n}$. In particular, since $|x| \le |x|_1 \le n|x|$ for any $x \in \mathbb{R}^n$, this gives

\[
|\tilde{I}(t) - \tilde{I}(0)|_1 < n(n^2 + 1)\varepsilon^{b'}, \quad |t| < \exp(\varepsilon^{-a'}),
\]

for every solution $z(t)$ with $|\tilde{I}(0)|_1 < 1$.

4.2.2.2. The statement of Theorem 4.3 is the analogue of Theorem 3.6. However, the difference is that here we are using Cartesian coordinates and not action-angle coordinates (i.e. symplectic polar coordinates), and we cannot use the latter since they become singular at the origin. So we cannot apply directly Theorem 3.6. This is not a serious issue when applying KAM theory in this context (see [Arn63b] or [Pös82] for example), but this becomes problematic in Nekhoroshev theory (see [Loc92] or [Loc95] for detailed explanations). This result was only conjectured by Nekhoroshev in [Nek77], and it took a long time before it could be solved in the convex case ([Nie98], [FGB98]). Here we are able to solve this problem in the generic case. The reason is that even though we cannot apply the result of the first chapter, we can use exactly the same approach, since the method of averaging along unperturbed periodic flows is intrinsic, i.e. independent of the choice of coordinates, a fact that was first used implicitly in [Nie98] and made completely clear in [Pös99b].

The proof of such estimates usually requires an analytic part, which boils down to the construction of suitable normal forms, and a geometric part. The geometric part of the first chapter goes exactly the same way, so in the sequel we will restrict ourselves to indicating the very slight modifications in the construction of the normal forms.

4.2.2.3. Consider linearly independent periodic vectors $\omega_1, \dots, \omega_n$ of $\mathbb{R}^n$, with periods $(T_1, \dots, T_n)$, that is

\[
T_j = \inf\{t > 0 \mid t\omega_j \in \mathbb{Z}^n\}, \quad 1 \le j \le n.
\]

Define the domains

Drj,sj (ωj) = {z ∈ Dsj | |∇h(Ĩ) − ωj|



The proof is completely analogous to the proof of Proposition 3.6, Appendix 3.A,

to which we refer for more details. In fact, here the proof is even simpler since one does<br />

not have to use "weighted" norms for vector fields. It relies on a finite composition of<br />

averagings along the periodic flows generated by $l_j$, $j \in \{1, \dots, n\}$. The case $j = 1$ is due to Pöschel ([Pös99b]) and, for $j > 1$, the proof goes by induction using our assumption $(A_j)$, $j \in \{1, \dots, n\}$.

Once we have this normal form, the rest of the proof of Theorem 3.6 goes exactly the same way: every solution $z(t)$ of $H$ with $|\tilde{I}(0)| < 1$ satisfies

\[
|\tilde{I}(t) - \tilde{I}(0)| < (n^2 + 1)\varepsilon^{b'}, \quad |t| < \exp(\varepsilon^{-a'}),
\]

provided that $\varepsilon \le \varepsilon_0$, with $\varepsilon_0$ depending on $n$, $s$, $M$, $\gamma'$ and $\tau'$ and with the exponents

\[
a' = b' = 3^{-1} (2(n + 1)\tau')^{-n}.
\]

4.2.3 Proof of Theorem 4.1

Now we can finally prove Theorem 4.1, by using successively Birkhoff’s estimates and<br />

Nekhoroshev’s estimates.<br />

Proof of Theorem 4.1. Let $H$ be as in (A), and first assume that $\rho < \rho_1$ with $\rho_1 = \cdot\,\gamma$ so that using our assumption $(G_1)$ we can apply Proposition 4.1: there exist an integer $m = m(\rho)$ and an analytic symplectic transformation

\[
\Phi_m : D_{3s/4} \to D_s
\]

such that

\[
H \circ \Phi_m(z) = h_m(\tilde{I}) + f_m(z)
\]

is in Birkhoff normal form, with a remainder fm satisfying the estimate<br />

<br />

|fm|3s/4



However, one has

\[ \rho^{b'} \exp\left(-b'(\gamma\rho^{-1})^{a}\right) < \gamma^{-1}\rho, \]

and as Φm satisfies |Φm − Id|3s/4


4.3 - Further results and comments 109<br />

However, it is important to note that one cannot obtain a statement similar to Theorem 4.2, simply because in this case a non-resonance condition up to a finite order does not allow one to build the corresponding Birkhoff normal form.

If we compare this result with [MG95], our assumption is generic and we do not<br />

require any convexity. But of course the price to pay is that one has to consider the full<br />

set of Birkhoff invariants.<br />

4.3.0.2. As a final result, one can also obtain similar estimates for the general case of a linearly stable lower-dimensional torus, under the common assumptions of isotropicity and reducibility (which were automatic for a fixed point or a Lagrangian torus). In that context, it is enough to consider a Hamiltonian defined in $\mathbb{T}^k \times \mathbb{R}^k \times \mathbb{R}^{2l}$ (by isotropicity), of the form

\[ H(\theta, I, z) = \omega.I + \frac{1}{2} Bz.z + F(\theta, I, z). \]

Here $B$ is a symmetric matrix (constant by reducibility) such that $J_{2l}B$ has a purely imaginary spectrum ($J_{2l}$ being the canonical symplectic structure of $\mathbb{R}^{2l}$), and $F(\theta, I, z) = O(|I|^2, \|z\|^3)$. In those coordinates, the invariant torus is simply given by

I = 0, z = 0, and this generalizes both the case of an elliptic fixed point (where the<br />

directions (θ, I) are absent) and of a Lagrangian invariant torus (where the directions z<br />

are absent). If the spectrum $\{\pm i\alpha_1, \dots, \pm i\alpha_l\}$ of $J_{2l}B$ is simple, one can assume further that

\[ H(\theta, I, z) = \omega.I + \alpha.\tilde I + F(\theta, I, z), \]

where $\tilde I$ are the "formal actions" associated to the $z$ variables. Therefore, after some appropriate scalings we can consider

\[ \begin{cases} H(\theta, I, z) = \omega.I + \alpha.\tilde I + f(\theta, I, z) \\ H \in \mathcal{A}_s, \quad |f|_s < \rho \end{cases} \tag{D} \]

where $\mathcal{A}_s$ is the space of holomorphic functions on the domain

\[ D_s = \{(\theta, I, z) \in (\mathbb{C}^k/\mathbb{Z}^k) \times \mathbb{C}^k \times \mathbb{C}^{2l} \mid |\mathrm{Im}(\theta)| < s,\ |I| < s,\ \|z\| < s\}. \]

Under a suitable Diophantine condition on the vector $(\omega, \alpha) \in \mathbb{R}^{k+l}$, one can define polynomials $h_m$ and a formal series $h_\infty$ depending on $J = (I, \tilde I)$. Birkhoff's exponential

estimates in this more difficult situation have been obtained in [JV97]. Regarding<br />

Nekhoroshev’s estimates for a generic integrable Hamiltonian which depends both on<br />

actions and formal actions, they can be easily obtained by obvious modifications of our<br />

method. Therefore we can state the following result.<br />

Theorem 4.5. Suppose $H$ is as in (D). Then under a generic condition on $h_\infty$, there exist positive constants $a$, $a'$, $c_1$, $c_2$ and $\rho_0$ such that for $\rho \le \rho_0$, every solution $(\theta(t), I(t), z(t))$ of $H$ with $|J(0)| < 1$ satisfies

\[ |J(t) - J(0)| < c_1\rho, \quad |t| < \exp\left(\rho^{-a'} \exp(c_2 a' \rho^{-a})\right). \]

Once again, the condition on h∞ and the values of the exponents are the same.<br />

4.3.0.3. Let us add that one could easily give similar estimates in the discrete case,<br />

that is for exact symplectic diffeomorphisms near an elliptic fixed point, an invariant



Lagrangian torus or an invariant linearly stable isotropic reducible torus. Even if one has<br />

the possibility to re-write the proof in these settings, the easiest way is to use suspension<br />

arguments, as it is done qualitatively in [Dou88] or quantitatively in [KP94] (see also<br />

[PT97] for a different approach) and deduce stability results in the discrete case from<br />

the corresponding results in the continuous case.<br />

To conclude, let us mention that important examples of invariant tori satisfying<br />

our assumptions (linearly stable, reducible, isotropic) are those given by KAM theory.<br />

However, the latter not only gives individual tori but a whole Cantor family (see [Pös01]<br />

or [AKN06]). In this context, Popov has proved exponential stability estimates for the<br />

family of Lagrangian KAM tori, if the Hamiltonian is analytic or Gevrey ([Pop00] and<br />

[Pop04]). His proof relies on a KAM theorem with Gevrey smoothness on the parameters<br />

(in the sense of Whitney) and some kind of simultaneous Birkhoff normal form over the<br />

Cantor set of tori. We believe that our method should be useful in trying to extend<br />

those results to obtain super-exponential stability under generic conditions. But clearly<br />

this is a more difficult problem, and the first step is to obtain Nekhoroshev’s estimates<br />

in Gevrey regularity for a generic integrable Hamiltonian, the quasi-convex case having<br />

been settled in [MS02].<br />

4.A Generic assumptions<br />

In this appendix, we will show that our assumption (G) is generic, in the sense that it<br />

defines a prevalent set in the infinite dimensional space of formal power series.<br />

4.A.0.1. Let us first recall the definition of Simultaneous Diophantine Morse functions (SDM in the following). Let $G(n, k)$ be the set of all vector subspaces of $\mathbb{R}^n$ of dimension $k$. We endow $\mathbb{R}^n$ with the Euclidean scalar product, and given an integer $L \in \mathbb{N}^*$, we define $G^L(n, k)$ as the subset of $G(n, k)$ consisting of subspaces whose orthogonal complement can be generated by integer vectors with components bounded by $L$. In the sequel, $B$ will be an arbitrary open ball of $\mathbb{R}^n$.

Definition 4.6. A smooth function $h : B \to \mathbb{R}$ is said to be SDM if there exist $\gamma' > 0$ and $\tau' \ge 0$ such that for any $L \in \mathbb{N}^*$, any $k \in \{1, \dots, n\}$ and any $\Lambda \in G^L(n, k)$, there exists $(e_1, \dots, e_k)$ (resp. $(f_1, \dots, f_{n-k})$), an orthonormal basis of $\Lambda$ (resp. of $\Lambda^\perp$), such that the function $h_\Lambda$ defined on $B$ by

\[ h_\Lambda(u, v) = h(u_1 e_1 + \cdots + u_k e_k + v_1 f_1 + \cdots + v_{n-k} f_{n-k}) \]

satisfies the following: for any $(u, v) \in B$,

\[ \|\partial_u h_\Lambda(u, v)\| \le \gamma' L^{-\tau'} \implies \|\partial_{uu} h_\Lambda(u, v).\eta\| > \gamma' L^{-\tau'} \|\eta\|, \]

for any $\eta \in \mathbb{R}^k \setminus \{0\}$.

This definition is inspired by the steepness condition of Nekhoroshev and the quantitative<br />

Morse-Sard theory of Yomdin (see the first chapter for more explanations). It depends on a choice of coordinates adapted to the orthogonal decomposition $\Lambda \oplus \Lambda^\perp$, so for $\Lambda \in G^L(n, k)$ and $(u, v) \in B$, $\partial_u h_\Lambda(u, v)$ is a vector in $\mathbb{R}^k$ and $\partial_{uu} h_\Lambda(u, v)$ is a symmetric matrix of size $k$ with real entries.

Remark 4.7. Note also that the definition can be stated as the following alternative: for any $(u, v) \in B$, either we have $\|\partial_u h_\Lambda(u, v)\| > \gamma' L^{-\tau'}$ or $\|\partial_{uu} h_\Lambda(u, v).\eta\| > \gamma' L^{-\tau'} \|\eta\|$ for any $\eta \in \mathbb{R}^k \setminus \{0\}$. Hence for a given function it is sufficient to verify that $\|\partial_{uu} h_\Lambda(u, v).\eta\| > \gamma' L^{-\tau'} \|\eta\|$ for any $\eta \in \mathbb{R}^k \setminus \{0\}$, and we will use this fact later (in Theorem 4.11).

4.A.0.2. The set of SDM functions on $B$ with respect to $\gamma' > 0$ and $\tau' \ge 0$ will be denoted by $\mathrm{SDM}^{\tau'}_{\gamma'}(B)$, and we will also use the notation

\[ \mathrm{SDM}^{\tau'}(B) = \bigcup_{\gamma' > 0} \mathrm{SDM}^{\tau'}_{\gamma'}(B). \]

The following proposition was proved in the first chapter, and it relies on non-trivial results from quantitative Morse-Sard theory ([Yom83], [YC04]).

Proposition 4.3. Let $\tau' > 2(n^2 + 1)$, and $h \in C^{2n+2}(B)$. Then for Lebesgue almost all $\xi \in \mathbb{R}^n$, the function $h_\xi$, defined by $h_\xi(I) = h(I) - \xi.I$ for $I \in B$, belongs to $\mathrm{SDM}^{\tau'}(B)$.

Now let us recall the definition of a prevalent set ([HSY92], see also [KH08] and<br />

[OY05]).<br />

Definition 4.8. Let E be a completely metrizable topological vector space. A Borel<br />

subset S ⊆ E is said to be shy if there exists a Borel measure µ on E, with 0 < µ(C) < ∞<br />

for some compact set C ⊆ E, such that µ(x + S) = 0 for all x ∈ E.<br />

An arbitrary set is called shy if it is contained in a shy Borel subset, and finally the<br />

complement of a shy set is called prevalent.<br />

For a finite dimensional vector space E, by an easy application of Fubini's theorem,

prevalence is equivalent to full Lebesgue measure. The following "genericity" properties<br />

can be checked ([OY05]): a prevalent set is dense, a set containing a prevalent set is also<br />

prevalent, and prevalent sets are stable under translation and countable intersection.<br />

Furthermore, we have an easy but useful criterion for a set to be prevalent.<br />

Proposition 4.4 ([HSY92]). Let A be a Borel subset of E. Suppose there exists a finite<br />

dimensional subspace F of E such that, denoting λF the Lebesgue measure supported on<br />

F, the set x + A has full λF-measure for all x ∈ E. Then A is prevalent.<br />

It is an obvious consequence of Proposition 4.3 and Proposition 4.4 that $\mathrm{SDM}^{\tau'}(B)$ is prevalent in $C^{2n+2}(B)$ for $\tau' > 2(n^2 + 1)$.

4.A.0.3. Now let $P_\infty = \mathbb{R}[[X_1, \dots, X_n]]$ be the space of all formal power series in $n$ variables with real coefficients. It is naturally a Fréchet space, as the projective limit of the finite dimensional spaces $P_m$ consisting of polynomials in $n$ variables of degree less than or equal to $m$. We define the subset

\[ S^{\tau'}_\infty = \{h_\infty \in P_\infty \mid h_m \in \mathrm{SDM}^{\tau'}(B),\ \forall m \ge 2\}, \]

where $h_m = \sum_{k=1}^{m} h^k$ if $h_\infty = \sum_{k \ge 1} h^k$, and we identify the polynomial $h_m$ with the associated function defined on $B$. Let us also define

\[ D^{\tau}_\infty = \{h_\infty \in P_\infty \mid h_1(X) = \alpha.X,\ \alpha \in D^{\tau}\}, \]



where $D^{\tau}$ is the set of Diophantine vectors of $\mathbb{R}^n$ with exponent $\tau$, and finally

\[ G^{\tau,\tau'}_\infty = D^{\tau}_\infty \cap S^{\tau'}_\infty. \]

The set $G^{\tau,\tau'}_\infty$ is the set of formal power series for which condition (G) holds.

Theorem 4.9. For $\tau > n - 1$ and $\tau' > 2(n^2 + 1)$, the set $G^{\tau,\tau'}_\infty$ is prevalent in $P_\infty$.

Proof. As the intersection of two prevalent sets is prevalent, it is enough to prove that both sets $D^{\tau}_\infty$, for $\tau > n - 1$, and $S^{\tau'}_\infty$, for $\tau' > 2(n^2 + 1)$, are prevalent.

For the set $D^{\tau}_\infty$, this is an easy consequence of the fact that $D^{\tau}$ is of full Lebesgue measure in $\mathbb{R}^n$, for $\tau > n - 1$, and Proposition 4.4 with $F = P_1$, the space of linear forms. For the set $S^{\tau'}_\infty$, first note that we can write

\[ S^{\tau'}_\infty = \bigcap_{m \ge 2} S^{\tau'}_{\infty,m}, \]

where, for an integer $m \ge 2$,

\[ S^{\tau'}_{\infty,m} = \{h_\infty \in P_\infty \mid h_m \in \mathrm{SDM}^{\tau'}(B)\}. \]

As a countable intersection of prevalent sets is prevalent, it is enough to prove that for each $m \ge 2$, the set $S^{\tau'}_{\infty,m}$ is prevalent in $P_\infty$. But once again this is just a consequence of Proposition 4.3 and Proposition 4.4 with $F = P_1$, the space of linear forms.

For $m \ge 2$, the set of polynomials $h_m$ for which condition $(G_m)$ is satisfied is given by

\[ S^{\tau'}_m = \{h_m \in P_m \mid h_m \in \mathrm{SDM}^{\tau'}(B)\}, \]

and the proof of the above theorem immediately gives the following result.

Theorem 4.10. For $\tau' > 2(n^2 + 1)$, the set $S^{\tau'}_m$ is of full Lebesgue measure in $P_m$.

4.A.0.4. Now in the special case $m = 2$, we can state a refined result which is due to Niederman ([Nie07a]).

Theorem 4.11. For Lebesgue almost all $\beta \in S_n(\mathbb{R})$, the function

\[ h(I) = \alpha.I + \beta I.I \]

belongs to $\mathrm{SDM}^{\tau'}(B)$ provided $\tau' > n^2 + 1$.

In the above theorem, there is no condition on α, and contrary to Proposition 4.3,<br />

the proof does not rely on Morse-Sard theory. Let us denote by λ the one-dimensional<br />

Lebesgue measure and by Ik the identity matrix of size k. We shall use the following<br />

elementary lemma.<br />

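Lemma 4.12. Let $k \in \{1, \dots, n\}$, $\beta_k \in S_k(\mathbb{R})$ and $\kappa > 0$. Then there exists a subset $C_\kappa \subseteq \mathbb{R}$ such that

\[ \lambda(C_\kappa) \le 2k\kappa, \]

and for any $\xi \notin C_\kappa$, the matrix $\beta_{k,\xi} = \beta_k - \xi I_k$ satisfies

\[ \|\beta_{k,\xi}.\eta\| > \kappa\|\eta\|, \]

for any $\eta \in \mathbb{R}^k \setminus \{0\}$.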



Of course, the set Cκ depends on the matrix βk.<br />

Proof. Let $\{\lambda_1, \dots, \lambda_k\}$ be the eigenvalues of $\beta_k$, then in an orthonormal basis of eigenvectors for $\beta_k$, the matrix $\beta_{k,\xi}$ is also diagonal, with eigenvalues $\{\lambda_1 - \xi, \dots, \lambda_k - \xi\}$. Then one has $\|\beta_{k,\xi}.\eta\| > \kappa\|\eta\|$ for any $\eta \in \mathbb{R}^k \setminus \{0\}$ provided that for all $i \in \{1, \dots, k\}$, $|\lambda_i - \xi| > \kappa$, that is if $\xi$ does not belong to

\[ C_\kappa = \bigcup_{i=1}^{k} [\lambda_i - \kappa, \lambda_i + \kappa]. \]

The measure estimate $\lambda(C_\kappa) \le 2k\kappa$ is trivial.
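The construction in this proof is easy to check numerically. The following is only an illustrative sketch and not part of the original argument: it assumes the symmetric matrix has already been diagonalised, so it works directly with a list of eigenvalues, and the helper names `excluded_set`, `measure` and `min_stretch` are hypothetical.

```python
def excluded_set(eigenvalues, kappa):
    """Return C_kappa as a sorted list of disjoint closed intervals.

    C_kappa is the union of [lam - kappa, lam + kappa] over the eigenvalues
    lam of the symmetric matrix beta_k; overlapping intervals are merged.
    """
    intervals = sorted((lam - kappa, lam + kappa) for lam in eigenvalues)
    merged = []
    for lo, hi in intervals:
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def measure(intervals):
    """Lebesgue measure of a disjoint union of intervals."""
    return sum(hi - lo for lo, hi in intervals)


def min_stretch(eigenvalues, xi):
    """Smallest value of |(beta_k - xi I).eta| / |eta| over eta != 0.

    For a symmetric matrix this equals min_i |lam_i - xi|.
    """
    return min(abs(lam - xi) for lam in eigenvalues)


# Spectrum of a hypothetical beta_k with k = 4 (illustrative values only).
eigs = [-1.0, 0.25, 0.3, 2.0]
kappa = 0.1
C = excluded_set(eigs, kappa)
```

Merging the overlapping intervals can only decrease the measure, which is why the bound $2k\kappa$ holds with room to spare whenever two eigenvalues are closer than $2\kappa$.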

With this lemma, the proof is now similar to that of Proposition 4.3.<br />

Proof of Theorem 4.11. Let $h(I) = \alpha.I + \beta I.I$, and given $\Lambda \in G^L(n, k)$, we denote by $\beta_\Lambda \in S_k(\mathbb{R})$ the matrix which represents the quadratic form $\beta I.I$ restricted to the subspace $\Lambda$. Since the second derivative of $h$ along any subspace is constant, then coming back to Definition 4.6 and using Remark 4.7, $h \in \mathrm{SDM}^{\tau'}_{\gamma'}(B)$ if

\[ \|\beta_\Lambda.\eta\| > \gamma' L^{-\tau'} \|\eta\|, \tag{36} \]

for any $\Lambda \in G^L(n, k)$ and any $\eta \in \mathbb{R}^k \setminus \{0\}$. Let $A^{\tau'}_{\gamma'}$ be the subset of $S_n(\mathbb{R})$ whose elements contradict condition (36), that is

\[ A^{\tau'}_{\gamma'} = \{\beta \in S_n(\mathbb{R}) \mid \|\beta_\Lambda.\eta\| \le \gamma' L^{-\tau'} \|\eta\|,\ \Lambda \in G^L(n, k),\ \eta \in \mathbb{R}^k \setminus \{0\}\}, \]

and

\[ A^{\tau'} = \bigcap_{\gamma' > 0} A^{\tau'}_{\gamma'}. \]

What we need to show is that $A^{\tau'}$ has zero Lebesgue measure in $S_n(\mathbb{R})$ provided $\tau' > n^2 + 1$.

Apply Lemma 4.12 to $\beta_\Lambda \in S_k(\mathbb{R})$, with $\kappa = \gamma' L^{-\tau'}$, to have a subset $C_{\gamma',\tau',L,\Lambda} \subseteq \mathbb{R}$ such that

\[ \lambda(C_{\gamma',\tau',L,\Lambda}) \le 2k\gamma' L^{-\tau'}, \tag{37} \]

and for any $\xi \notin C_{\gamma',\tau',L,\Lambda}$, the matrix $\beta_{\Lambda,\xi} = \beta_\Lambda - \xi I_k$ satisfies

\[ \|\beta_{\Lambda,\xi}.\eta\| > \gamma' L^{-\tau'} \|\eta\| \]

for any $\eta \in \mathbb{R}^k \setminus \{0\}$. If we define

\[ C_{\gamma',\tau'} = \bigcup_{L \in \mathbb{N}^*} \bigcup_{k \in \{1, \dots, n\}} \bigcup_{\Lambda \in G^L(n,k)} C_{\gamma',\tau',L,\Lambda}, \]

then

\[ C_{\gamma',\tau'} = \{\xi \in \mathbb{R} \mid \beta_\xi = \beta - \xi I_n \in A^{\tau'}_{\gamma'}\} \]

and so

\[ C_{\tau'} = \bigcap_{\gamma' > 0} C_{\gamma',\tau'} = \{\xi \in \mathbb{R} \mid \beta_\xi \in A^{\tau'}\}. \]



It remains to prove that the Lebesgue measure of $C_{\tau'}$ is zero, since by Fubini's theorem, this will imply that the Lebesgue measure of $A^{\tau'}$ is zero. By our estimate (37), we have

\[
\begin{aligned}
\lambda(C_{\gamma',\tau'}) &\le \sum_{L \in \mathbb{N}^*} \sum_{k=1}^{n} |G^L(n, k)|\, 2k\gamma' L^{-\tau'} \\
&\le \sum_{L \in \mathbb{N}^*} \sum_{k=1}^{n} L^{n^2}\, 2k\gamma' L^{-\tau'} \\
&= 2 \left(\sum_{k=1}^{n} k\right) \left(\sum_{L \in \mathbb{N}^*} L^{n^2 - \tau'}\right) \gamma' \\
&= n(n + 1) \left(\sum_{L \in \mathbb{N}^*} L^{n^2 - \tau'}\right) \gamma'
\end{aligned}
\]

and, since $\tau' > n^2 + 1$, the above series is convergent. Hence

\[ \lambda(C_{\tau'}) = \inf_{\gamma' > 0} \lambda(C_{\gamma',\tau'}) = 0. \]
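To see concretely why the final bound tends to zero with $\gamma'$, one can replace the convergent series by an explicit upper bound. The sketch below is an illustration only: it uses the standard integral comparison $\sum_{L \ge 1} L^{-s} \le 1 + 1/(s - 1)$ for $s > 1$, which is not stated in the text, and the function name `measure_bound` is hypothetical.

```python
def measure_bound(n, tau_prime, gamma_prime):
    """Explicit upper bound for n(n+1) * gamma' * sum_{L>=1} L^(n^2 - tau').

    Writes the series as sum L^(-s) with s = tau' - n^2 and uses the
    integral comparison sum_{L>=1} L^(-s) <= 1 + 1/(s - 1), valid for s > 1.
    """
    s = tau_prime - n * n
    if s <= 1:
        raise ValueError("the series converges only if tau' > n^2 + 1")
    series_upper = 1.0 + 1.0 / (s - 1.0)
    return n * (n + 1) * gamma_prime * series_upper


# Illustrative values: n = 2 and tau' = 7, so that s = 3.
bounds = [measure_bound(2, 7, g) for g in (1e-2, 1e-4, 1e-6)]
```

The bound is linear in $\gamma'$ for fixed $n$ and $\tau'$, so taking the infimum over $\gamma' > 0$ gives zero, as in the last display of the proof.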


5.1 - Introduction 115<br />

5 Polynomial stability for $C^k$ quasi-convex Hamiltonian systems

A major result about perturbations of integrable Hamiltonian systems is the Nekhoroshev<br />

theorem, which gives exponential stability for all solutions provided the system is<br />

analytic and the integrable Hamiltonian is generic. In the particular but important case<br />

where the latter is quasi-convex, these exponential estimates have been generalized by<br />

Marco and Sauzin if the Hamiltonian is Gevrey regular, using a method introduced by<br />

Lochak in the analytic case. In this chapter, using the same approach, we investigate<br />

the situation where the Hamiltonian is assumed to be only finitely differentiable, for<br />

which it is known that exponential stability does not hold but nevertheless we prove<br />

estimates of polynomial stability. The results of this chapter are contained in [Bou10b].<br />

5.1 Introduction<br />

In this chapter, we are concerned with the stability properties of near-integrable Hamiltonian<br />

systems of the form<br />

<br />

H(θ, I) = h(I) + f(θ, I)<br />

|f| < ε 3, the famous example of Arnold ([Arn64]) shows that there exist<br />

"unstable" solutions, along which the variation of the actions can be arbitrarily large no<br />

matter how small the perturbation is. From its very beginning, KAM theory was known<br />

to hold for non-analytic Hamiltonians (see [Mos62] in the context of twist maps). It<br />

is now well established in various regularity classes, including the C ∞ case (essentially<br />

by Herman, see [Bos86] and [Féj04]) and the Gevrey case ([Pop04]). Following ideas<br />

of Moser, the theorem also holds if H is only of class C k , with k > 2n (see [Pös82],<br />

[Sal04], [SZ89] and also [Alb07] for a refinement), even though the minimal number of<br />

derivatives is still an open question, except in a special case for n = 2 ([Her86], see also<br />

[KT09] for the related context of linearization of circle diffeomorphisms).<br />

5.1.0.2. Another fundamental result, which complements KAM theory, is given by<br />

Nekhoroshev’s theorem ([Nek77], [Nek79]). If the integrable part h satisfies some generic



condition and the system is analytic, then for $\varepsilon$ sufficiently small there exist positive constants $c_1$, $c_2$, $c_3$, $a$ and $b$ such that

\[ |I(t) - I_0| \le c_1 \varepsilon^{b}, \quad |t| \le c_2 \exp(c_3 \varepsilon^{-a}), \]

for all initial actions $I_0$. Hence all solutions are stable, not for all time, but for an

exponentially long time. In the special case where h is strictly quasi-convex, a completely<br />

new proof of these estimates was given by Lochak ([Loc92]) using periodic averaging<br />

and simultaneous Diophantine approximation. The method of Lochak has had many<br />

applications, in particular it was used by Marco and Sauzin to extend Nekhoroshev’s<br />

theorem to the Gevrey regular case under the quasi-convexity assumption ([MS02]).<br />

5.1.0.3. However, no such estimates have been studied when the Hamiltonian is merely<br />

finitely differentiable, and this is the content of the present chapter. We will prove below<br />

(Theorem 5.1) that if $H$ is of class $C^k$, for $k \ge 2$, and $h$ quasi-convex, then one has the stability estimates

\[ |I(t) - I_0| \le c_1 \varepsilon^{\frac{1}{2n}}, \quad |t| \le c_2 \varepsilon^{-\frac{k-2}{2n}}, \]

for some positive constants c1 and c2, and provided that ε is small enough. Of course,<br />

under our regularity assumption the exponential estimates have been replaced with polynomial<br />

estimates, and earlier examples show that exponential stability cannot possibly<br />

hold under such a weak regularity assumption (this is discussed in [MS04]). The proof<br />

will use once again the ideas of Lochak which, among other things, reduces the analytic<br />

part to its minimum and we will also follow the implementation of Marco and Sauzin<br />

in the Gevrey case.<br />

5.1.0.4. As we recalled above, KAM theory for finitely differentiable Hamiltonian<br />

systems has been widely studied, and so we believe that Nekhoroshev’s estimates under<br />

weaker regularity assumptions have their own interest. Moreover, for obvious reasons,<br />

examples of unstable solutions (so-called Arnold diffusion) are more easily constructed<br />

in the non-analytic case, and it is a natural question to estimate the time of instability<br />

(see [KL08a] and [KL08b] for examples of class C k with a polynomial time of diffusion).<br />

Finally, one of our motivations is to generalize these estimates using the method of the<br />

first chapter, where Lochak’s ideas are extended to deal with analytic but C k -generic<br />

unperturbed Hamiltonians, with k > 2n + 2.<br />

5.2 Main result<br />

5.2.0.1. Let T n = R n /Z n , and consider a Hamiltonian function H defined on the<br />

domain<br />

DR = T n × BR,<br />

where BR is the open ball of R n around the origin of radius R, with respect to the<br />

supremum norm | . |. As usual, we shall occasionally identify H with a function defined<br />

on R n × BR which is 1-periodic with respect to the first n variables.<br />

We assume that $H$ is of class $C^k$, for an integer $k \ge 2$, i.e. it is $k$-times differentiable and all its derivatives up to order $k$ extend continuously to the closure $\overline{D}_R$. We denote by $C^k(D_R)$ the space of such functions, which is a Banach space with the norm

\[ |H|_{C^k(D_R)} = \sup_{0 \le l \le k}\ \sup_{|\alpha| = l}\ \sup_{x \in D_R} |\partial^{\alpha} H(x)| \]

where $x = (\theta, I)$, $\alpha = (\alpha_1, \dots, \alpha_{2n}) \in \mathbb{N}^{2n}$, $|\alpha| = \alpha_1 + \cdots + \alpha_{2n}$ and

\[ \partial^{\alpha} = \partial_1^{\alpha_1} \cdots \partial_{2n}^{\alpha_{2n}}. \]

In the case where the Hamiltonian $H = h$ depends only on the action variables, we will simply write $|h|_{C^k(B_R)}$.

Our Hamiltonian H ∈ Ck (DR) is assumed to be Ck-close to integrable, that is, of<br />

the form <br />

H(θ, I) = h(I) + f(θ, I)<br />

|f| Ck (DR) < ε



know what the optimal exponents should be. However, using the geometric arguments of the next chapter, one can easily improve the stability exponent $a$ in order to obtain

\[ a = \frac{k - 2}{2(n - 1)} - \delta, \]

for $\delta > 0$ but arbitrarily small.

Let us finally point out that if H is C ∞ , then it is an immediate consequence of the<br />

above result that the action variables are stable for an interval of time which is longer<br />

than any prescribed power of ε −1 , but even in this case exponential stability does not<br />

hold.<br />

5.2.0.3. As in the analytic or Gevrey case, we can also state a refined result near<br />

resonances. Suppose Λ is a submodule of Z n of rank m, d = n − m and let SΛ be the<br />

corresponding resonant manifold, that is<br />

SΛ = {I ∈ BR | k.∇h(I) = 0, k ∈ Λ}.<br />

We can prove the following statement, which actually contains the previous one.<br />

Theorem 5.2. Under the previous hypotheses, assume $d(I(0), S_\Lambda) \le \sigma\sqrt{\varepsilon}$ for some constant $\sigma > 0$, and set

\[ a_d = \frac{k - 2}{2d}, \quad b_d = \frac{1}{2d}. \]

Then there exist $\varepsilon'_0$, $c'_1$ and $c'_2$ such that if $\varepsilon \le \varepsilon'_0$, one has

\[ |I(t) - I(0)| \le c'_1 \varepsilon^{b_d}, \quad |t| \le c'_2 \varepsilon^{-a_d}. \]

For $\Lambda = \{0\}$, $d = n$ and $S_\Lambda = B_{R/2}$, we recover Theorem 5.1 and therefore it will be enough to prove Theorem 5.2.
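As a quick illustration of how the exponents of Theorem 5.2 vary with the regularity $k$ and the rank $m$ of the resonant module, the following sketch evaluates them exactly. It is not taken from the thesis: the function name `local_exponents` is hypothetical, and only the formulas $a_d = (k-2)/(2d)$ and $b_d = 1/(2d)$ with $d = n - m$ come from the statement above.

```python
from fractions import Fraction


def local_exponents(k, n, m=0):
    """Return (a_d, b_d) with d = n - m, as exact fractions.

    k: regularity of the Hamiltonian (class C^k, k >= 2)
    n: number of degrees of freedom
    m: rank of the resonant module Lambda (0 <= m < n)
    """
    if k < 2:
        raise ValueError("Theorem 5.2 assumes k >= 2")
    if not 0 <= m < n:
        raise ValueError("need 0 <= m < n so that d = n - m is positive")
    d = n - m
    return Fraction(k - 2, 2 * d), Fraction(1, 2 * d)


# Illustrative values: C^10 Hamiltonian with n = 3 degrees of freedom.
generic = local_exponents(10, 3)        # no resonance, d = 3
near_resonance = local_exponents(10, 3, 1)  # rank-one resonance, d = 2
```

Both exponents increase with $m$, which reflects the refined stability near resonances described by the theorem.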

5.2.0.4. The constants $\varepsilon_0$, $c_1$ and $c_2$ depend only on $h$, more precisely they depend on $k$, $n$, $R$, $M$ and $m$ while the constants $\varepsilon'_0$, $c'_1$ and $c'_2$ also depend on $\sigma$ and $\Lambda$. However we

will not give explicit values for them in order to avoid complicated and rather useless<br />

expressions. Hence we shall replace them by the symbol · when it is convenient: for<br />

instance, we shall write u


5.3 - Analytical part 119<br />

is given by those harmonics associated with integers $k \in \mathbb{Z}^n$ in resonance with $\omega$, that is such that $k.\omega = 0$. Actually one can construct a symplectic, close-to-identity transformation $\Phi$ defined around $I$, such that

\[ H \circ \Phi = h + g + \tilde f \]

where $g$ contains only harmonics in resonance with $\omega$ and $\tilde f$ is a small remainder. These are usually called resonant normal forms, and to obtain them one has to deal with small divisors $k.\omega$ which involve technical estimates. If the system is analytic, the above remainder $\tilde f$ can be made exponentially small with respect to the inverse of the size of the perturbation, as was first shown by Nekhoroshev. But for finitely differentiable systems one might guess that the remainder can only be polynomially small, even though this should be difficult (or at least technical) to prove using the usual approach.

5.3.0.2. It is a remarkable fact discovered by Lochak ([Loc92]) that to prove exponential<br />

estimates in the quasi-convex case with the analyticity assumption, it is enough to<br />

average along periodic frequencies, which are frequencies ω such that Tω ∈ Z n \ {0}<br />

for some T > 0 (see also the first chapter for an extension of this method for generic<br />

integrable Hamiltonians). These periodic frequencies correspond to periodic orbits of<br />

the unperturbed Hamiltonian, hence in this approach no small divisors arise. As a<br />

consequence this special resonant normal form is much easier to obtain. The aim of this<br />

section is to construct such a normal form, up to a polynomial remainder. This will be<br />

done in 5.3.3. But first we will recall some useful estimates concerning the C k norm<br />

in 5.3.1, and then prove an intermediate statement in 5.3.2.<br />

5.3.1 Elementary estimates<br />

5.3.1.1. Let us begin by recalling some easy estimates. Given two functions f, g ∈<br />

C k (DR), the product fg belongs to C k (DR) and by the Leibniz rule<br />

|fg| C k (DR)



where

\[ \partial_I f = (\partial_{I_1} f, \dots, \partial_{I_n} f), \quad \partial_\theta f = (\partial_{\theta_1} f, \dots, \partial_{\theta_n} f). \]

Obviously $X_f \in C^{k-1}(D_R, \mathbb{R}^{2n})$, and trivially

\[ |X_f|_{C^{k-1}(D_R)} \le |f|_{C^k(D_R)}. \]

Moreover, by classical theorems on ordinary differential equations, if $X_f$ is of class $C^{k-1}$ then so is the time-$t$ map $\Phi^f_t$ of the vector field $X_f$, when it exists. Assuming $|\partial_\theta f|_{C^0(D_R)} < r$ for some $r < R$ (for example $|X_f|_{C^0(D_R)} < r$), then by the mean value theorem

\[ \Phi^f = \Phi^f_1 : D_{R-r} \to D_R \]

is a well-defined $C^{k-1}$-embedding. In the case where $f$ is integrable, one can choose $r = 0$.

In the sequel, we will need to estimate the $C^k$ norm of $\Phi^f$ in terms of the $C^k$ norm of the vector field $X_f$. More precisely we need the rather natural fact that $\Phi^f$ is $C^k$-close to the identity when $X_f$ is $C^k$-close to zero. This is trivial for $k = 0$. In the general case, this follows by induction on $k$ using on the one hand the relation

\[ \Phi^f_t = \mathrm{Id} + \int_0^t X_f \circ \Phi^f_s \, ds, \]

and on the other the formula of Faà di Bruno (see [AR67] for example), which gives bounds of the form

and also<br />

|F ◦ G| C k



5.3.2 The linear case<br />

Following [MS02], we change for a moment our setting and we consider a perturbation<br />

of a linear Hamiltonian, more precisely the Hamiltonian<br />

<br />

H(θ, I) = l(I) + f(θ, I)<br />

(∗∗)<br />

|f| Ck (Dρ) < µ 0 is fixed and l(I) = ω.I is a linear Hamiltonian with a T-periodic frequency<br />

ω. Recall that this means that<br />

T = inf{t > 0 | tω ∈ Z n \ {0}}<br />

is well-defined. In this context, our small parameter is µ.<br />
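For a frequency vector with rational components the period can be computed exactly. The sketch below is an illustration under that assumption and is not part of the original text: it writes $\omega = a/D$ with $D$ a common denominator and $a$ an integer vector, and returns $T = D/\gcd(a_1, \dots, a_n)$, for which $T\omega$ is a primitive integer vector. The function name `period` is hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def period(omega):
    """Smallest T > 0 such that T * omega is an integer vector.

    omega: nonzero vector with rational entries (int or Fraction).
    """
    omega = [Fraction(w) for w in omega]
    if all(w == 0 for w in omega):
        raise ValueError("omega must be nonzero")
    # Common denominator D: least common multiple of the denominators.
    D = reduce(lambda x, y: x * y // gcd(x, y),
               (w.denominator for w in omega), 1)
    # Integer vector a with omega = a / D.
    a = [int(w * D) for w in omega]
    g = reduce(gcd, (abs(x) for x in a))
    return Fraction(D, g)


# Illustrative frequency: omega = (2/3, 4/3) has period 3/2.
omega = [Fraction(2, 3), Fraction(4, 3)]
T = period(omega)
```

Minimality follows because the set of $t$ with $t\omega \in \mathbb{Z}^n$ is a discrete subgroup of $\mathbb{R}$, and its positive generator is the unique element for which $t\omega$ has coprime components.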

In the proposition below, we will construct a "global" normal form for the Hamiltonian<br />

(∗∗), which we will use in the next section to produce a "local" normal form<br />

around periodic orbits for our original Hamiltonian (∗).<br />

Proposition 5.1. Consider H as in (∗∗) with k ≥ 2, and assume<br />

Then there exists a C 2 symplectic transformation<br />

with |Φ − Id| C 2 (Dρ/2)



and for j ∈ {0, . . ., k − 2}, let<br />

ρj = ρ − jr ≥ ρ/2.<br />

Then we claim that for any j ∈ {0, . . ., k − 2}, there exists a C k−j symplectic transformation<br />

Φj : Dρj → Dρ with |Φj − Id| C k−j (Dρj )



Indeed, thanks to the condition Tµ



5.3.3 Normal form<br />

Now let us come back to our original setting which is the Hamiltonian<br />

<br />

H(θ, I) = h(I) + f(θ, I)<br />

|f| C k (DR) < ε 0 be such that<br />

Then there exists a C 2 symplectic transformation<br />

with |ΠIΦ − IdI| C 0 (B(I∗,µ))



which sends the domain D2 = T n × B2 onto T n × B(I∗, 2µ), and note that by the<br />

condition µ



Observe that $(\mu l + \mu^2 h_\mu) \circ \sigma_\mu^{-1} = h$, so we may set

\[ g = \mu \hat f_\mu \circ \sigma_\mu^{-1}, \quad \tilde f = \mu \tilde f_\mu \circ \sigma_\mu^{-1}, \]

and write

\[ H \circ \Phi = h + g + \tilde f. \]

It is obvious that $\{g, l\} = 0$ with

$|g|_{C^0(\mathbb{T}^n \times B(I_*, \mu))} \le \mu |\hat f_\mu|_{C^0(D_1)}$

and similarly

so

Moreover, as $\partial_{\tilde\theta} \tilde f = \mu \partial_{\tilde\theta} \tilde f_\mu$ then

and finally

is trivial. This ends the proof.

5.4 Proof of Theorem 5.2



and the period T satisfies<br />

1 0 and<br />

˜H = h + g + ˜ f ∈ C 2 (T n × B(I∗, r))<br />

with h satisfying (C), {g, l} = 0 and the estimates<br />

If<br />

|g + ˜ f|C 0 (T n ×B(I∗,r)) < r 2 , |∂˜ θ ˜ f|C 0 (T n ×B(I∗,r)) < r 2 τ −1 .<br />

r



and the period T trivially satisfies T 1, we apply Proposition 5.3<br />

with

\[ Q \mathrel{=\!\cdot} \varepsilon^{-\frac{d-1}{2d}}, \]

and the condition (43) gives a first smallness condition on $\varepsilon$. Observe that $Q^{-\frac{1}{d-1}} \mathrel{=\!\cdot} \varepsilon^{\frac{1}{2d}}$, hence the periodic action $I_*$ given by the proposition satisfies

<br />

|I0 − I∗|


Part III<br />

From stability to instability<br />

Summary<br />

6 Improved exponential stability for quasi-convex Hamiltonian systems

6.1 Introduction

6.2 Main results

6.3 The analytic case

6.4 The Gevrey case


6.1 - Introduction 131<br />

6 Improved exponential stability for quasi-convex Hamiltonian systems

In this chapter, we improve previous results on exponential stability for analytic and<br />

Gevrey perturbations of quasi-convex integrable Hamiltonian systems. In particular,<br />

this provides a sharper lower bound on the time of Arnold diffusion which we believe to<br />

be optimal. This chapter is based on [BM10].<br />

6.1 Introduction<br />

6.1.0.1. This chapter deals with some stability properties of near-integrable Hamiltonian<br />

systems of the form <br />

H(θ, I) = h(I) + f(θ, I)<br />

|f| < ε 0, no matter how small the perturbation is. Such instability<br />

is commonly referred to as Arnold diffusion. Obviously Nekhoroshev’s estimates give a<br />

lower bound on the diffusion time τ(ε) (or equivalently an upper bound on the diffusion<br />

speed) which is exponentially large (or exponentially small when referring to the rate of<br />

diffusion).<br />

6.1.0.2. This chapter is concerned with the precise time scale at which stability breaks<br />

down and instability takes place, by which we mean the precise value of the exponent



a, in the special case where the unperturbed Hamiltonian h is quasi-convex (that is, its<br />

energy sub-levels are convex).<br />

The quasi-convex case, which is of both practical and theoretical interest, has been<br />

widely studied in Nekhoroshev theory, essentially for two reasons. First, the proof<br />

is much easier in this situation and a more refined result is available: the stability<br />

exponents may be chosen as<br />

\[ a = b = (2n)^{-1}. \]

These facts are best illustrated by the striking proof given by Lochak ([Loc92], see also<br />

[LN92] and [LNN94]), though they are also accessible via a more traditional approach<br />

as was shown by Pöschel ([Pös93]). Note also that these values have been generalized by<br />

Niederman in the steep case ([Nie04]) using both ideas of Lochak and Pöschel. Yet there<br />

is another reason for which the quasi-convex case is interesting, which is the so-called<br />

stabilization by resonances: if a solution starts close to a resonance of multiplicity m,<br />

m < n, then it possesses better stability properties, described by the "local" exponents<br />

$$a_m = b_m = (2(n-m))^{-1}.$$<br />

This is a quite surprising fact, as it shows that even though resonances are the main<br />

cause of diffusion, at the same time they improve finite time stability. However, this<br />

property certainly does not hold without some convexity assumption (in the steep case<br />

for instance).<br />

6.1.0.3. The optimality of the exponent a, in connection with the minimal time of<br />

instability, was first studied by Bessi who introduced powerful variational methods to<br />

revisit Arnold’s example and estimate the time of diffusion. In [Bes96] and [Bes97b],<br />

he proved that the latter is of order $\exp\big(\varepsilon^{-1/2}\big)$ for n = 3 and $\exp\big(\varepsilon^{-1/4}\big)$ for n = 4.<br />

Moreover, in Bessi’s example the solution passes close to a double resonance, and so<br />

the time is the best possible in this case, in view of the values of the local exponent<br />

a2 for n = 3 and n = 4. Recently, using similar variational arguments, this result has<br />

been generalized to an arbitrary number of degrees of freedom n by Ke Zhang ([Zha09]),<br />

namely he constructed a special orbit passing close to a double resonance for which the<br />

time of diffusion is estimated by $\exp\big(\varepsilon^{-\frac{1}{2(n-2)}}\big)$.<br />

Another approach has been proposed by Marco-Sauzin ([MS02]) and Lochak-Marco<br />

([LM05]), following novel ideas of Herman. In [MS02] the authors show that Nekhoroshev’s<br />

estimates extend to perturbations of quasi-convex Hamiltonians which are α-Gevrey regular, α ≥ 1, with the exponents<br />
$$a = (2\alpha n)^{-1}, \quad b = (2n)^{-1}$$<br />
and local exponents<br />
$$a_m = (2\alpha(n-m))^{-1}, \quad b_m = (2(n-m))^{-1}.$$<br />

Note that 1-Gevrey functions are exactly analytic functions, and basically when α ranges<br />

from one to infinity α-Gevrey functions interpolate between analytic and C∞ functions.<br />

Therefore this result generalizes the estimates in the analytic case. Using a geometric<br />

mechanism different and more precise than Arnold’s, in [MS02] the authors constructed<br />
a drifting orbit with a time of order $\exp\big(\varepsilon^{-\frac{1}{2\alpha(n-2)}}\big)$ in the non-analytic case, that is when<br />
α > 1. Adding some more technical ideas, it is shown in [LM05] that the example also<br />
works in the analytic case, but the time is estimated as $\exp\big(\varepsilon^{-\frac{1}{2(n-3)}}\big)$, which is only close<br />
to optimal (however refinements are certainly possible to reach the value $(2(n-2))^{-1}$<br />
in this class of examples).<br />

Therefore, if the unperturbed Hamiltonian is quasi-convex, the best exponent of<br />
stability a up to now satisfies<br />
$$(2n)^{-1} \le a < (2(n-2))^{-1}$$<br />
in the analytic case and more generally<br />
$$(2\alpha n)^{-1} \le a < (2\alpha(n-2))^{-1}$$<br />
in the Gevrey case. The goal of this chapter is to improve the lower bound in both the<br />
analytic and the Gevrey case, so as to have<br />
$$(2(n-1))^{-1} - \delta \le a < (2(n-2))^{-1}$$<br />
and<br />
$$(2\alpha(n-1))^{-1} - \delta \le a < (2\alpha(n-2))^{-1}$$<br />
for δ > 0 but arbitrarily small (see Theorem 6.1 and Theorem 6.2 in the next section).<br />
We believe that this bound is optimal, in the sense that one could reach the value<br />
$(2(n-1))^{-1}$ in Arnold diffusion, using a significantly different mechanism of instability.<br />
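To make the window of admissible exponents concrete, the following minimal sketch evaluates the three bounds quoted above with exact rational arithmetic. The function name `exponent_window` and its argument names are illustrative assumptions, not notation from the thesis; α = 1 corresponds to the analytic case.

```python
from fractions import Fraction


def exponent_window(n, alpha=1, delta=Fraction(0)):
    """Bounds on the stability exponent a quoted in the text (n >= 3).

    Returns the triple
        (previous lower bound, improved lower bound, upper bound)
      = ((2*alpha*n)^-1, (2*alpha*(n-1))^-1 - delta, (2*alpha*(n-2))^-1).
    alpha = 1 is the analytic case, alpha > 1 the Gevrey case.
    """
    if n < 3:
        raise ValueError("the upper bound (2*alpha*(n-2))^-1 requires n >= 3")
    alpha = Fraction(alpha)
    previous_lower = 1 / (2 * alpha * n)
    improved_lower = 1 / (2 * alpha * (n - 1)) - Fraction(delta)
    upper = 1 / (2 * alpha * (n - 2))
    return previous_lower, improved_lower, upper


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, exponent_window(n, alpha=1, delta=Fraction(1, 1000)))
```

For n = 3 and δ close to zero, the improved lower bound approaches 1/4, which sits strictly between the previous value 1/6 and the upper bound 1/2.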

6.2 Main results<br />

6.2.0.1. In order to state our main results, let us now describe our setting more<br />

precisely, beginning with the analytic case. Let B = B(0, R) be the open ball of $\mathbb{R}^n$ of<br />
radius R > 0, with respect to the supremum norm | . |, centered at the origin. Given<br />
s > 0, we let $\mathcal{A}_s(D)$ be the space of bounded real-analytic functions on $D = \mathbb{T}^n \times B$<br />
which extend as holomorphic functions on the complex domain<br />
$$D_s = \{(\theta, I) \in (\mathbb{C}^n/\mathbb{Z}^n) \times \mathbb{C}^n \;|\; |\mathcal{I}(\theta)| < s,\ d(I, B) < s\},$$<br />
and which are continuous on the closure of $D_s$. Here we denoted by $\mathcal{I}(\theta)$ the imaginary<br />
part of θ, by | . | the supremum norm on $\mathbb{C}^n$, and by d the associated distance on $\mathbb{C}^n$. It<br />
is well-known that $\mathcal{A}_s(D)$ is a Banach space with its usual supremum norm $|\,.\,|_s$, where<br />
$$|f|_s = \sup_{z \in D_s} |f(z)|, \quad f \in \mathcal{A}_s(D).$$<br />
In the following, we shall denote by<br />
$$B_s = \{I \in \mathbb{R}^n \;|\; d(I, B) < s\}$$<br />
the real part of our domain $D_s$ in action space. The geometric parameters n, R, s are<br />
assumed to be chosen once and for all in the following.<br />



We now introduce the parameters related to the choice of the system. The integrable<br />
part $h : B_s \to \mathbb{R}$ will be assumed to be strictly quasi-convex: the gradient map ∇h does<br />
not vanish and there exists a positive number m such that<br />
$$\nabla^2 h(I)v.v \ge m|v|^2 \qquad (QC(m))$$<br />
holds for any $I \in B_s$ and any v orthogonal to ∇h(I) (with respect to the Euclidean<br />
scalar product). Moreover, the derivatives up to order 3 of h on $B_s$ are assumed to be<br />
bounded: there exists M > 0 such that for all $I \in B_s$, one has<br />
$$|\partial^k h(I)| \le M, \quad 1 \le |k_1| + \cdots + |k_n| \le 3. \qquad (B(M))$$<br />
Therefore we will consider systems of the form<br />
$$\begin{cases} H(\theta, I) = h(I) + f(\theta, I), \\ h \in \mathcal{A}_s(D),\ f \in \mathcal{A}_s(D), \\ h \text{ satisfies } (QC(m)) \text{ and } (B(M)), \\ |f|_s < \varepsilon. \end{cases} \qquad (C(M, m, \varepsilon))$$<br />
Note that we suppress the geometric parameters from the notation. In the following<br />
we will call a stable constant (in the analytic case) any positive constant c which depends<br />
on the whole set of parameters, that is n, R, s, M, m, together with a parameter δ or<br />
ρ to be defined below, but not on a particular choice of H satisfying the condition<br />
(C(M, m, ε)).<br />
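As an illustration of condition (QC(m)), the sketch below samples the quadratic form of the Hessian on vectors orthogonal to the gradient. The Hamiltonian h(I) = (I₁² + I₂²)/2 + I₃ is a hypothetical example chosen here because it is quasi-convex without being convex; it is not taken from the thesis. The Euclidean norm of v is used, which dominates the supremum norm, so a lower bound for this ratio also gives one for the supremum norm.

```python
import random


def grad_h(I):
    # Illustrative Hamiltonian (an assumption): h(I) = (I1^2 + I2^2)/2 + I3.
    return (I[0], I[1], 1.0)


def hessian_form(v):
    # The Hessian of h is diag(1, 1, 0), independent of I.
    return v[0] ** 2 + v[1] ** 2


def project_orthogonal(v, g):
    """Euclidean projection of v onto the hyperplane orthogonal to g."""
    scale = sum(a * b for a, b in zip(v, g)) / sum(b * b for b in g)
    return tuple(a - scale * b for a, b in zip(v, g))


def min_convexity_ratio(R, samples=2000, seed=0):
    """Smallest sampled value of (Hess h(I) v . v) / |v|^2 over |I|_sup <= R
    and v orthogonal to grad h(I)."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(samples):
        I = tuple(rng.uniform(-R, R) for _ in range(3))
        raw = tuple(rng.gauss(0.0, 1.0) for _ in range(3))
        v = project_orthogonal(raw, grad_h(I))
        norm2 = sum(a * a for a in v)
        if norm2 < 1e-12:
            continue  # skip degenerate samples
        worst = min(worst, hessian_form(v) / norm2)
    return worst
```

For this example a direct Cauchy-Schwarz estimate gives the ratio at least 1/(1 + I₁² + I₂²), so one may take m = 1/(1 + 2R²) on the ball of radius R for the supremum norm.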

6.2.0.2. The main result of the chapter is the following.<br />

Theorem 6.1. Consider a real number δ satisfying<br />
$$0 < \delta \le (2n(n-1))^{-1}.$$<br />
Then there exist stable constants $c_1, c_2, c_3$ and $\varepsilon_0$ such that if $0 \le \varepsilon \le \varepsilon_0$, and if H<br />
satisfies (C(M, m, ε)), the following estimates<br />
$$|I(t) - I_0| \le c_1 \varepsilon^{\delta(n-1)}, \quad |t| \le c_2 \exp\Big(c_3\, \varepsilon^{-\frac{1}{2(n-1)}+\delta}\Big)$$<br />
hold true for every initial action $I_0 \in B(0, R/2)$.<br />
Moreover, consider a real number ρ satisfying 0 < ρ < R/2. Then there exist stable<br />
constants $c_4, c_5$ and $\tilde{\varepsilon}_0$ such that if $0 \le \varepsilon \le \tilde{\varepsilon}_0$, then<br />
$$|I(t) - I_0| \le \rho, \quad |t| \le c_4 \exp\Big(c_5\, \varepsilon^{-\frac{1}{2(n-1)}}\Big)$$<br />
for every $I_0 \in B(0, R/2)$.<br />

Choosing our constant δ arbitrarily close to zero, our result ensures stability for a<br />
time scale which is arbitrarily close to $\exp\big(\varepsilon^{-\frac{1}{2(n-1)}}\big)$, therefore we improve the previous<br />
results of stability obtained independently by Lochak-Neishtadt ([Loc92] and [LN92])<br />
and Pöschel ([Pös93]), which were believed to be optimal.<br />

In fact in the extreme case where $\delta = (2n(n-1))^{-1}$, which in our situation gives the<br />
worst stability time (but of course the best radius of confinement), our result reads<br />
$$|I(t) - I_0| \le c_1 \varepsilon^{\frac{1}{2n}}, \quad |t| \le c_2 \exp\Big(c_3\, \varepsilon^{-\frac{1}{2n}}\Big)$$<br />
and we recover the previous result of stability. Hence, when our parameter δ ranges from<br />
$(2n(n-1))^{-1}$ to zero, our theorem "interpolates" between previous stability results and<br />
what should be the optimal stability.<br />
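The interpolation can be checked by a short computation. The sketch below assumes the exponents of Theorem 6.1 in the form radius ε^{δ(n−1)} and time exp(c ε^{−(1/(2(n−1)) − δ)}); the helper name `theorem_exponents` is hypothetical.

```python
from fractions import Fraction


def theorem_exponents(n, delta):
    """Return (radius exponent, time exponent) for a given delta.

    radius exponent: delta * (n - 1)
    time exponent:   1 / (2 (n - 1)) - delta
    The admissible range is 0 < delta <= 1 / (2 n (n - 1)).
    """
    delta = Fraction(delta)
    delta_max = Fraction(1, 2 * n * (n - 1))
    if not 0 < delta <= delta_max:
        raise ValueError("delta must satisfy 0 < delta <= (2n(n-1))^-1")
    radius = delta * (n - 1)
    time = Fraction(1, 2 * (n - 1)) - delta
    return radius, time


if __name__ == "__main__":
    n = 3
    print(theorem_exponents(n, Fraction(1, 2 * n * (n - 1))))
    print(theorem_exponents(n, Fraction(1, 10**6)))
```

At the extreme value δ = (2n(n−1))⁻¹ both exponents equal (2n)⁻¹, which is the earlier result; as δ decreases to zero the time exponent increases towards (2(n−1))⁻¹ while the radius exponent decreases to zero.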

Indeed, in the other extreme case which corresponds to the second part of our theorem,<br />

our result does not give stability, since the radius of confinement can be arbitrarily<br />

small but no longer tends to 0 with ε. We believe that this is not an artefact of the<br />

method and that instability should occur at this precise time scale, at a time of order<br />

$\exp\big(\varepsilon^{-\frac{1}{2(n-1)}}\big)$. We plan to construct an example with an unstable orbit which has a drift<br />

of order one during such a time interval. This necessitates the use of a more refined<br />

instability mechanism in the neighbourhood of double resonances, a topic which is also<br />

crucial in connection with the problem of genericity of Arnold diffusion.<br />

6.2.0.3. Our result also holds if the Hamiltonian is only Gevrey regular. Let us recall<br />

that given α ≥ 1 and L > 0, a function $H \in C^\infty(D)$ is (α, L)-Gevrey if, using the<br />
standard multi-index notation, we have<br />
$$|H|_{\alpha,L} = \sum_{l \in \mathbb{N}^{2n}} L^{|l|\alpha} (l!)^{-\alpha} |\partial^l H|_D < \infty$$<br />
where $|\,.\,|_D$ is the usual supremum norm for functions on D. The space of such functions,<br />
with the above norm, is a Banach space that we denote by $G^{\alpha,L}(D)$. Analytic functions<br />
are a particular case of Gevrey functions, as one can check that $G^{1,L}(D) = \mathcal{A}_L(D)$.<br />

Let us introduce the main condition on the Hamiltonian systems in the Gevrey case<br />
$$\begin{cases} H(\theta, I) = h(I) + f(\theta, I), \\ h \in G^{\alpha,L}(D),\ f \in G^{\alpha,L}(D), \\ h \text{ satisfies } (QC(m)) \text{ and } (B(M)), \\ |f|_{\alpha,L} < \varepsilon. \end{cases} \qquad (C(\alpha, L, M, m, \varepsilon))$$<br />

We now call a stable constant (in the Gevrey case) any positive constant c which<br />

depends on the whole set of parameters, that is α, L, n, R, M, m, together with a parameter<br />

δ or ρ which will be defined below, but not on a particular choice of H satisfying<br />

the condition (C(α, L, M, m, ε)).<br />

6.2.0.4. Our second result is the following Gevrey version of Theorem 6.1.<br />

Theorem 6.2. Consider a real number δ satisfying<br />
$$0 < \delta \le (2\alpha n(n-1))^{-1}.$$<br />
Then there exist stable constants $c'_1, c'_2, c'_3$ and $\varepsilon'_0$ such that if $0 \le \varepsilon \le \varepsilon'_0$, and if H<br />
satisfies (C(α, L, M, m, ε)), the following estimates<br />
$$|I(t) - I_0| \le c'_1 \varepsilon^{\frac{2\delta}{5(n-1)}}, \quad |t| \le c'_2 \exp\Big(c'_3\, \varepsilon^{-\frac{1}{2\alpha(n-1)}+\delta}\Big)$$<br />
hold true for every initial action $I_0 \in B(0, R/2)$.<br />
Moreover, consider a real number ρ satisfying 0 < ρ < R/2. Then there exist stable<br />
constants $c'_4, c'_5$ and $\tilde{\varepsilon}'_0$ such that if $0 \le \varepsilon \le \tilde{\varepsilon}'_0$, then<br />
$$|I(t) - I_0| \le \rho, \quad |t| \le c'_4 \exp\Big(c'_5\, \varepsilon^{-\frac{1}{2\alpha(n-1)}}\Big)$$<br />
for every $I_0 \in B(0, R/2)$.<br />

The same remarks as above apply in the Gevrey case. In particular δ can be chosen<br />

arbitrarily close to zero and our result ensures stability for an interval of time which is<br />

arbitrarily close to $\exp\big(\varepsilon^{-\frac{1}{2\alpha(n-1)}}\big)$. However, our radius of stability is worse than in the<br />

analytic case, so we do not fully recover the result obtained in [MS02], but of course the<br />

time of stability is the most important issue.<br />

6.2.0.5. To avoid cumbersome expressions in the following, when there is no risk of<br />

confusion we will replace the stable constants with a dot. More precisely, an assertion<br />

of the form "there exists a stable constant c such that f < c g" will be simply replaced<br />

with "f


6.3 The analytic case<br />

Given a real number K ≥ 1, we will say that Λ is a K-submodule if it admits a Z-basis<br />
$\{k^1, \dots, k^r\}$ satisfying $|k^i|_1 \le K$ for i ∈ {1, …, r}, where<br />
$$|k^i|_1 = |k^i_1| + \cdots + |k^i_n|.$$<br />
As usual, it is enough to consider only maximal submodules, which are those that are not<br />
strictly contained in any other submodule of the same rank. Given such a submodule,<br />
we define its volume by<br />
$$|\Lambda| = \sqrt{\det({}^t M M)}$$<br />
where M is any n × r matrix whose columns form a basis for Λ (this is easily seen to<br />
be independent of the choice of such a matrix). The following stability theorem is due<br />
to Pöschel.<br />
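The two notions just defined are easy to compute for small examples. The following sketch, with hypothetical helper names, checks the K-submodule condition on a given basis and evaluates the Gram determinant det(ᵗMM) exactly, so that the volume is its square root.

```python
from fractions import Fraction
from math import sqrt


def is_K_basis(basis, K):
    """True if every basis vector k satisfies |k|_1 <= K."""
    return all(sum(abs(c) for c in k) <= K for k in basis)


def gram_determinant(basis):
    """det(tM M), where the columns of M are the given integer vectors.

    Uses exact Gaussian elimination on the Gram matrix of the basis.
    """
    r = len(basis)
    gram = [[Fraction(sum(a * b for a, b in zip(u, v))) for v in basis]
            for u in basis]
    det = Fraction(1)
    for col in range(r):
        pivot = next((row for row in range(col, r) if gram[row][col] != 0), None)
        if pivot is None:
            return 0  # the vectors are linearly dependent
        if pivot != col:
            gram[col], gram[pivot] = gram[pivot], gram[col]
            det = -det
        det *= gram[col][col]
        for row in range(col + 1, r):
            factor = gram[row][col] / gram[col][col]
            gram[row] = [x - factor * y for x, y in zip(gram[row], gram[col])]
    return int(det)  # integer Gram matrix, so the determinant is an integer


def volume(basis):
    """|Lambda| = sqrt(det(tM M))."""
    return sqrt(gram_determinant(basis))
```

For instance the bases {(1, −1, 0), (0, 1, −1)} and {(1, −1, 0), (1, 0, −1)} span the same submodule of Z³, and both give Gram determinant 3, in line with the remark that the volume does not depend on the chosen basis.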

Theorem 6.3 (Pöschel). Let Λ be a K-submodule of Z n of rank r, with r ∈ {0, . . .,n−<br />

1}. Assume that ε ≥ 0 and K ≥ 1 satisfy<br />

ε|Λ| 2 K 2(n−r)



Lemma 6.4. Let Λ be a K-submodule of Z n of rank 1. Assume that ε ≥ 0 and K ≥ 1<br />

satisfy<br />

εK 2n



Lemma 6.6. Let I be a closed interval of length l > 0 contained in [−1, 1]. Then there<br />
exists a rational number p/q ∈ I ∩ Q satisfying<br />
$$|q| + |p| < (4\sqrt{2})\, l^{-\frac{1}{2}}.$$<br />
The exponent in $l^{-\frac{1}{2}}$ comes from the use of Dirichlet’s theorem on the approximation<br />
of real numbers by rational ones, but this result is not necessary, as in the sequel a trivial<br />
bound of order $l^{-1}$ would be enough.<br />

Proof. Let us write I = [x − l/2, x + l/2] for some x ∈ [−1, 1], and let q be the smallest<br />
integer larger than $\sqrt{2}\, l^{-\frac{1}{2}}$, that is<br />
$$\sqrt{2}\, l^{-\frac{1}{2}} \le q < \sqrt{2}\, l^{-\frac{1}{2}} + 1.$$<br />
By Dirichlet’s theorem there exists an integer p ∈ Z such that<br />
$$|x - p/q| < q^{-2}.$$<br />
Since $\sqrt{2}\, l^{-\frac{1}{2}} \le q$, we have $q^{-2} \le l/2$, and so<br />
$$|x - p/q| < l/2$$<br />
which means that p/q ∈ I. Moreover, as I ⊂ [−1, 1], then |p| ≤ q and<br />
$$|q| + |p| \le 2q.$$<br />
Recall that $q < \sqrt{2}\, l^{-\frac{1}{2}} + 1$, but since I ⊆ [−1, 1], we have l ≤ 2, so that $1 \le \sqrt{2}\, l^{-\frac{1}{2}}$ and<br />
therefore<br />
$$q < (2\sqrt{2})\, l^{-\frac{1}{2}}.$$<br />
This gives<br />
$$|q| + |p| < (4\sqrt{2})\, l^{-\frac{1}{2}}$$<br />
which concludes the proof.<br />
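The remark after Lemma 6.6 notes that a trivial bound of order l⁻¹ would be enough for the sequel. The sketch below implements only that trivial bound, not the l^{-1/2} estimate: it takes q as the smallest integer with q ≥ 1/l, so that consecutive multiples of 1/q are at most l apart and one of them falls in the interval. The function name and the use of exact fractions for the endpoints are assumptions made for illustration.

```python
from fractions import Fraction
from math import ceil


def rational_in_interval(a, b):
    """Return (p/q, p, q) with a <= p/q <= b, for -1 <= a < b <= 1.

    With l = b - a and q = ceil(1/l), the integer p = ceil(a*q) satisfies
    a <= p/q < a + 1/q <= b.  Since p/q lies in [-1, 1] we have |p| <= q,
    hence |p| + |q| <= 2q < 2/l + 2, a bound of order 1/l.
    The pair (p, q) is not necessarily in lowest terms.
    """
    a, b = Fraction(a), Fraction(b)
    if not (-1 <= a < b <= 1):
        raise ValueError("need -1 <= a < b <= 1")
    length = b - a
    q = ceil(1 / length)
    p = ceil(a * q)
    return Fraction(p, q), p, q
```

For the interval [1/100, 1/50] this returns p/q = 1/100 with q = 100, which also shows that for intervals close to zero the denominator is genuinely of order l⁻¹.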

The following result is our main lemma. It essentially says that a (long) drifting<br />

orbit has to cross a simple resonance, since all other orbits are stable on the interval of<br />

time over which they are defined.<br />

Lemma 6.7. Consider ε ≥ 0 and K ≥ 1 such that<br />

K −2



Once again, the exponent in $K^{-2}$ comes from Lemma 6.6 and hence from Dirichlet’s<br />
theorem, but a bound of order $K^{-1}$ would be sufficient for the final result. Let us also<br />
add that if τ is the maximal time of existence of the solution within the initial domain<br />
$\mathbb{T}^n \times B$, then our stability estimate easily ensures that τ = +∞, which in turn implies<br />
stability for all time (see the proof of Theorem 6.1).<br />

Proof. We will make conditions (47) explicit and prove that<br />
$$|I(t) - I_0| < 32\, C^{-1} K^{-2}, \quad 0 \le t < \tau,$$<br />
when<br />
$$K^{-2} < (32)^{-1} C \rho_0, \quad \varepsilon K^2 < 16,$$<br />
where $\rho_0$ and C are the stable constants of Lemma 6.5.<br />

We will argue by contradiction, so we assume that there exists a time $\tilde{t}$ for which<br />
$$|I(\tilde{t}) - I_0| \ge 32\, C^{-1} K^{-2}.$$<br />
Consider the curve<br />
$$\sigma(t) = (I(t), |\omega(t)|^{-1}) \in B \times \mathbb{R}^+$$<br />
and let $\rho = 32\, C^{-1} K^{-2}$. Then<br />
$$t^* = \inf\{t \in [0, \tau[ \;|\; \sigma(t) \notin B(\sigma(0), \rho)\}$$<br />
is well-defined, as the above set contains $\tilde{t}$. Now $\rho < \rho_0$, so we can apply Lemma 6.5: the<br />
restriction of $\Psi_h$ to the open ball $B(\sigma(0), \rho)$ is a diffeomorphism whose image contains<br />
the closed ball $B(\Psi_h(\sigma(0)), 32 K^{-2})$. Considering a slightly larger ball over which $\Psi_h$<br />
remains a diffeomorphism, this easily implies that<br />
$$\Psi_h(\sigma(t^*)) \notin B(\Psi_h(\sigma(0)), 32 K^{-2}),$$<br />
that is,<br />
$$\Big| \big(h(I(t^*)), |\omega(t^*)|^{-1}\omega(t^*)\big) - \big(h(I_0), |\omega(0)|^{-1}\omega(0)\big) \Big| \ge 32 K^{-2}.$$<br />
Using the conservation of energy, one has<br />
$$|h(I(t^*)) - h(I_0)| < 2\varepsilon < 32 K^{-2}$$<br />
so that necessarily<br />
$$\Big| |\omega(t^*)|^{-1}\omega(t^*) - |\omega(0)|^{-1}\omega(0) \Big| \ge 32 K^{-2}.$$<br />

Therefore there exists an index i ∈ {1, …, n} such that<br />
$$\left| \frac{\omega_i(t^*)}{|\omega(t^*)|} - \frac{\omega_i(0)}{|\omega(0)|} \right| \ge 32 K^{-2}$$<br />
and this estimate means that the image of the interval $[0, t^*]$ under the continuous<br />
function<br />
$$t \longmapsto \frac{\omega_i(t)}{|\omega(t)|} \in [-1, 1]$$<br />
contains a non trivial interval I of length $l = 32 K^{-2}$. Now we can apply Lemma 6.6 to<br />
find a rational number p/q ∈ Q in reduced form and a time $t' \in [0, t^*]$ such that<br />
$$\frac{\omega_i(t')}{|\omega(t')|} = \frac{p}{q} \qquad (48)$$<br />
with<br />
$$|p| + |q| < 4\sqrt{2}\,(32 K^{-2})^{-\frac{1}{2}} = K. \qquad (49)$$<br />

But since $|\omega(t')| = |\omega_j(t')|$ for some j ∈ {1, …, n}, and replacing p with −p if $\omega_j(t')$ is<br />
negative, the equality (48) can be written as<br />
$$q\omega_i(t') - p\omega_j(t') = 0. \qquad (50)$$<br />
Now let us write $k' = qe_i - pe_j \in \mathbb{Z}^n$, then from (49) and (50) we have<br />
$$k'.\omega(t') = 0, \quad |k'|_1 < K,$$<br />
and since p and q are co-prime, the submodule generated by k′ is maximal, so we find<br />
that $\omega(t') \in R_K$. This gives the desired contradiction.<br />
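The last step of the proof, passing from the rational ratio (48) to the integer vector k′, can be sketched as follows. For illustration the frequency vector is assumed to have rational entries so that the arithmetic is exact; the function name `resonance_vector` is hypothetical, and the degenerate case where i is itself the index realising the supremum norm is excluded.

```python
from fractions import Fraction


def resonance_vector(omega, i):
    """Build k' = q e_i - p e_j with k' . omega = 0.

    j is an index with |omega_j| = |omega| (supremum norm), and p/q is the
    reduced form of omega_i / |omega|, with p replaced by -p if omega_j < 0.
    """
    j = max(range(len(omega)), key=lambda idx: abs(omega[idx]))
    if i == j:
        raise ValueError("i must differ from the index realising the sup norm")
    sup_norm = abs(omega[j])
    if sup_norm == 0:
        raise ValueError("omega must be non zero")
    ratio = Fraction(omega[i]) / sup_norm  # Fraction keeps p/q in lowest terms
    p, q = ratio.numerator, ratio.denominator
    if omega[j] < 0:
        p = -p
    k = [0] * len(omega)
    k[i] += q
    k[j] -= p
    return tuple(k)


if __name__ == "__main__":
    omega = (Fraction(3), Fraction(-4), Fraction(1))
    k = resonance_vector(omega, 0)
    print(k, sum(a * b for a, b in zip(k, omega)))
```

With ω = (3, −4, 1) and i = 0 the ratio is 3/4, the sign of p is flipped because ω₂ is negative, and k′ = (4, 3, 0), whose scalar product with ω vanishes and whose ℓ¹ norm is |p| + |q| = 7.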

6.3.0.4. We finally arrive at the proof of Theorem 6.1.<br />

Proof of Theorem 6.1. We choose K of the form<br />
$$K = K_0 \left(\frac{\varepsilon_0}{\varepsilon}\right)^{\gamma}$$<br />
with suitable stable constants $K_0$ and $\varepsilon_0$ so that conditions (46) and (47) are satisfied if<br />
$$0 \le \varepsilon \le \varepsilon_0, \quad 0 < \gamma \le (2n)^{-1}.$$<br />

With this threshold and these bounds on γ, both Lemma 6.4 and Lemma 6.7 can be<br />

applied. Let (θ0, I0) ∈ T n × B(0, R/2) and T be the maximal time of existence within<br />

T n × B(0, R) of the solution (θ(t), I(t)) starting at (θ0, I0). We have to distinguish two<br />

cases.<br />

In the first case, we assume that ω(t) ∈ NK for all t < T. Then we apply Lemma 6.7<br />

with τ = T to get<br />

|I(t) − I0| 0), is included in<br />

B(0, R). Therefore this solution is defined for all time, that is T = +∞, and so the<br />

previous estimate gives<br />

|I(t) − I0|



Again, taking $\varepsilon_0$ small enough we can ensure that $I(t^*) \in B(0, R/2)$, then we can apply<br />
Lemma 6.4 to the solution $I_{t^*}(t) = I(t + t^*)$, whose initial frequency belongs to $R_K$, to<br />

obtain<br />

Setting<br />

this gives<br />

|It∗(t) − It∗(0)|



6.4 The Gevrey case<br />

In this section, we will prove Theorem 6.2. In fact, it will be enough to have a version of<br />

Lemma 6.4 in the Gevrey case, as the geometric considerations of the previous section<br />

still apply with no changes.<br />

Here we shall use a result from [MS02], which follows the method introduced by<br />

Lochak ([Loc92]) from which the results of improved stability near resonances actually<br />

originate. In the latter approach, the notion of "order" of a resonance is more intrinsic,<br />

however it is also more difficult to compute.<br />

Let Λ be a submodule of $\mathbb{Z}^n$ of rank r, with r ∈ {1, …, n}, and choose a basis<br />
$\{k^1, \dots, k^r\} \subset \mathbb{Z}^n$ for Λ. We define the matrix L of size r × n with integer entries whose<br />
rows are given by the vectors $k^i = (k^i_1, \dots, k^i_n)$, 1 ≤ i ≤ r, that is<br />
$$L = \begin{pmatrix} k^1_1 & \cdots & k^1_n \\ \vdots & & \vdots \\ k^r_1 & \cdots & k^r_n \end{pmatrix} \in M_{r,n}(\mathbb{Z}).$$<br />
Then it is an elementary result of linear algebra that there exist integers $d_1, \dots, d_r \in \mathbb{Z}$,<br />
satisfying the divisibility conditions $d_1 | \dots | d_r$, such that L is equivalent to the diagonal<br />
matrix<br />
$$\Delta = \begin{pmatrix} d_1 & & 0 & 0 & \cdots & 0 \\ & \ddots & & \vdots & & \vdots \\ 0 & & d_r & 0 & \cdots & 0 \end{pmatrix} \in M_{r,n}(\mathbb{Z}).$$<br />
Therefore one can write<br />
$$L = B \Delta A \qquad (51)$$<br />
for some matrices A ∈ GL(n, Z) and B ∈ GL(r, Z).<br />

The numbers $d_i$ are called the invariant factors of the module, and for a maximal<br />
module one can show that these numbers are all equal to one. The above normal form<br />
result can be proved equivalently by elementary operations on rows and columns or<br />
using the structure of finitely generated modules over a principal ideal domain.<br />
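Maximality can be tested without computing the full normal form. The sketch below relies on the standard fact that the product of the invariant factors of a rank r integer matrix equals, up to sign, the gcd of its r × r minors; the submodule is then maximal exactly when this gcd is one. Function names are illustrative, and the Laplace expansion used for determinants is only meant for small matrices.

```python
from itertools import combinations
from math import gcd


def integer_det(rows):
    """Determinant of a small square integer matrix by Laplace expansion."""
    size = len(rows)
    if size == 0:
        return 1
    if size == 1:
        return rows[0][0]
    total = 0
    for col, entry in enumerate(rows[0]):
        if entry == 0:
            continue
        minor = [list(row[:col]) + list(row[col + 1:]) for row in rows[1:]]
        total += (-1) ** col * entry * integer_det(minor)
    return total


def product_of_invariant_factors(L):
    """gcd of all r x r minors of the r x n integer matrix L.

    For a matrix of rank r this equals |d_1 ... d_r|; it is 0 if the rows
    are linearly dependent.
    """
    r, n = len(L), len(L[0])
    g = 0
    for cols in combinations(range(n), r):
        g = gcd(g, integer_det([[row[c] for c in cols] for row in L]))
    return g


def is_maximal(L):
    """True if the rows of L generate a maximal submodule of Z^n."""
    return product_of_invariant_factors(L) == 1
```

For example the row (2, 0, 0) generates a submodule strictly contained in the one generated by (1, 0, 0), and the gcd of its 1 × 1 minors is 2, whereas the rows (1, −1, 0) and (0, 1, −1) have a 2 × 2 minor equal to 1 and therefore span a maximal submodule.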

One can easily check that ${}^tA$ sends the standard submodule (which is the one generated<br />
by the first r vectors of the canonical basis of $\mathbb{Z}^n$) to the submodule Λ. So quantitative<br />
information about the submodule is encoded in those matrices A ∈ GL(n, Z).<br />
Following Lochak, we define $c_\Lambda$ (resp. $c'_\Lambda$) as the minimal value of the norm $|A^{-1}|$<br />
(resp. of |A|) among all matrices A ∈ GL(n, Z) satisfying the relation (51) (it is easy<br />
to see that those constants depend only on Λ and not on the choice of such a matrix).<br />
In the space $M_n(\mathbb{Z})$ we may choose the norm | . | induced by the usual supremum norm<br />
for vectors, which is nothing but the maximum of the sums of the absolute values of the<br />
elements in each row.<br />

With these definitions, one can state the following stability result in the Gevrey<br />

class.<br />

Theorem 6.8 (Marco-Sauzin). Let Λ be a K-submodule of Z n of rank r, with r ∈<br />

{0, . . ., n − 1}. Assume that ε ≥ 0 satisfies<br />

εc 5(n−r)<br />

Λ



and H satisfies (C(α, L, M, m, ε)). Then for any solution (θ(t), I(t)) with I0 ∈ B(0, R/2)<br />

and d(ω(0), RΛ)



with the estimates<br />
$$|u| \le \frac{|y|}{d}, \quad |v| \le \frac{|x|}{d}.$$<br />
To see this, note that the existence of at least one solution $u_0, v_0$ for the above<br />
equation follows easily from the Euclidean division algorithm. Then obviously<br />
$$u = u_0 - k\,\frac{y}{d}, \quad v = v_0 + k\,\frac{x}{d},$$<br />
is also a solution for any k ∈ Z. Therefore choosing k properly we can find at least one<br />
solution u, v with<br />
$$|u| \le \frac{|y|}{d} - 1.$$<br />
For this solution one has<br />
$$|v||y| = |d - ux| \le d + |ux| \le d + \Big(\frac{|y|}{d} - 1\Big)|x| \le \frac{|y|}{d}|x| + d - |x| \le \frac{|y|}{d}|x|$$<br />
since d − |x| ≤ 0, so<br />
$$|v| \le \frac{|x|}{d}$$<br />
which proves our claim. Finally, if for instance y = 0, then d = x and one can obviously<br />
choose u = 1 and v = 0.<br />
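The argument above is constructive and can be sketched in a few lines: compute any Bézout pair with the extended Euclidean algorithm, then shift it by a multiple of (y/d, x/d) so that u lands in the range 0 ≤ u ≤ |y|/d − 1. The function names are hypothetical, and the shift is written with the sign convention u = u₀ − k·y/d, v = v₀ + k·x/d, which is the one that preserves ux + vy = d.

```python
def extended_gcd(x, y):
    """Return (d, u, v) with u*x + v*y == d and d = gcd(x, y) >= 0."""
    old_r, r = x, y
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_u, u = u, old_u - quotient * u
        old_v, v = v, old_v - quotient * v
    if old_r < 0:  # normalise the sign of the gcd
        old_r, old_u, old_v = -old_r, -old_u, -old_v
    return old_r, old_u, old_v


def bounded_bezout(x, y):
    """For non zero integers x, y return (d, u, v) with
    u*x + v*y == d, |u| <= |y|/d and |v| <= |x|/d."""
    if x == 0 or y == 0:
        raise ValueError("x and y must both be non zero")
    d, u0, v0 = extended_gcd(x, y)
    step_u = y // d  # exact divisions, since d divides x and y
    step_v = x // d
    modulus = abs(step_u)
    u = u0 % modulus           # 0 <= u <= |y|/d - 1
    k = (u0 - u) // step_u     # exact: u0 - u is a multiple of |y|/d
    v = v0 + k * step_v
    return d, u, v


if __name__ == "__main__":
    print(bounded_bezout(12, 18))
```

The bound on v is not enforced separately: as in the text, it follows from the bound on u through the identity vy = d − ux.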

6.4.0.3. Let K ≥ 1 be given. We can now state our induction hypothesis H(n) for<br />
n ≥ 1.<br />

H(n). Let $k = (k_1, \dots, k_n)$ be a vector in $\mathbb{Z}^n \setminus \{0\}$ with co-prime components such<br />
that $|k|_1 \le K$. Then there exists a matrix A ∈ GL(n, Z) with first row equal to k, which<br />
satisfies |A| ≤ K.<br />

The assertion H(1) is immediate since in this case k = (±1). Now for n ≥ 2,<br />
assume that H(n-1) holds true and consider $k = (k_1, \dots, k_n)$ in $\mathbb{Z}^n \setminus \{0\}$ with co-prime<br />
components and $|k|_1 \le K$.<br />

We may suppose that $k_* = (k_1, \dots, k_{n-1})$ is non zero (otherwise we consider $k_* =<br />
(k_2, \dots, k_n)$) and we set $d = \gcd(k_1, \dots, k_{n-1})$. So d ≥ 1, the integers $d^{-1}k_1, \dots, d^{-1}k_{n-1}$<br />
are co-prime, and<br />
$$|d^{-1}k_*|_1 \le \frac{K}{d}.$$<br />

By H(n-1) we can find a matrix<br />
$$\begin{pmatrix} d^{-1}k_1 & \cdots & d^{-1}k_{n-1} \\ l_{2,1} & \cdots & l_{2,n-1} \\ \vdots & & \vdots \\ l_{n-1,1} & \cdots & l_{n-1,n-1} \end{pmatrix} \in GL(n-1, \mathbb{Z}),$$<br />
such that<br />
$$\sum_{j=1}^{n-1} |l_{i,j}| \le K,$$<br />
for each i ∈ {2, …, n − 1}.<br />

Now since d and $k_n$ are co-prime, one can find integers u, v ∈ Z such that<br />
$$ud + vk_n = 1$$<br />
and therefore define a matrix<br />
$$A(u, v) = \begin{pmatrix} k_1 & \cdots & k_{n-1} & k_n \\ l_{2,1} & \cdots & l_{2,n-1} & 0 \\ \vdots & & \vdots & \vdots \\ l_{n-1,1} & \cdots & l_{n-1,n-1} & 0 \\ (-1)^{n-1}vd^{-1}k_1 & \cdots & (-1)^{n-1}vd^{-1}k_{n-1} & (-1)^{n-1}u \end{pmatrix}.$$<br />
Expanding the determinant along the last column easily proves that A(u, v) ∈ GL(n, Z).<br />

As for the estimates, first assume that $k_n = 0$. Then d = 1 and we may choose u = 1<br />
and v = 0, so obviously<br />
$$|A(1, 0)| \le K.$$<br />
If now $k_n$ is non zero, then by our previous remark we can choose $(u_*, v_*)$ so that<br />
$$|u_*| \le |k_n|, \quad |v_*| \le d,$$<br />
which proves that the $\ell^1$-norm of the last row is bounded by K, and therefore<br />
$|A(u_*, v_*)| \le K$. This ends the proof.<br />

With these estimates, one can deduce from Theorem 6.8 the following lemma.<br />

Lemma 6.9. Let Λ be a K-submodule of Zn of rank 1. Assume that ε ≥ 0 and K ≥ 1<br />

satisfy<br />

εK 5(n−1)2<br />



with suitable stable constants $K_0$ and $\varepsilon_0$ so that conditions (47) and (52) are satisfied if<br />
$$0 \le \varepsilon \le \varepsilon_0, \quad 0 < \gamma \le 5^{-1}(n-1)^{-2}.$$<br />
Then we can apply both Lemma 6.4 and Lemma 6.7 in the same way as in the proof<br />
of Theorem 6.2, and setting<br />
$$a_\gamma = \frac{1 - 5\gamma(n-1)^2}{2\alpha(n-1)}, \quad b_\gamma = \frac{1 - \gamma(n-1)(3n-1)}{2(n-1)},$$<br />

we find that all solutions (θ(t), I(t)) starting at (θ0, I0), with I0 ∈ B(0, R/2), satisfy<br />

Next, using our condition<br />

we have the bounds<br />

hence γ ≤ bγ so that<br />

To conclude, just set<br />

so that<br />

hence<br />

provided that<br />

from which we will only retain<br />

|I(t) − I0|




Part IV<br />

Results of instability<br />

Summary<br />

7 Optimal time of instability for a priori unstable Hamiltonian systems<br />
7.1 Introduction<br />
7.2 Main results<br />
7.3 Construction of the perturbation<br />
7.3.1 Integrable system<br />
7.3.2 Perturbed system<br />
7.4 Construction of a symbolic dynamic<br />
7.4.1 Symbolic dynamic<br />
7.4.2 Proof of Proposition 7.2<br />
7.5 Construction of a pseudo-orbit<br />
7.6 Proof of Theorem 7.2<br />
7.A Time-energy coordinates for the pendulum<br />

8 Time of instability for high-dimensional Hamiltonian systems<br />
8.1 Introduction<br />
8.2 Main result<br />
8.3 Proof of Theorem 8.1<br />
8.3.1 The mechanism<br />
8.3.2 Proof of Theorem 8.1<br />
8.A Gevrey functions<br />

References<br />


7 Optimal time of instability for a priori unstable Hamiltonian systems<br />

In this chapter, we consider an a priori unstable Hamiltonian system with three degrees<br />
of freedom, for which we construct a drifting solution with an optimal time of diffusion.<br />
Such a result has already been proved by Berti, Bolle and Biasco using variational<br />
arguments, and by Treschev with his separatrix map theory. Our approach is different:<br />
it is based on polysystems, which are a special type of symbolic dynamics corresponding<br />
to the random iteration of a family of maps.<br />

7.1 Introduction<br />

The theory of perturbations of Hamiltonian systems is essentially the study of near-integrable<br />
Hamiltonian systems, generated by functions of the form<br />
$$H(\theta, I) = h(I) + f(\theta, I), \quad (\theta, I) \in \mathbb{A}^n = \mathbb{T}^n \times \mathbb{R}^n,$$<br />
where H is sufficiently smooth and f sufficiently small.<br />

7.1.0.1. For f = 0, H = h is an integrable system and the situation is well-understood.<br />

In this case, the variables (θ, I) are called angle-action coordinates for h and the Hamiltonian<br />

depends only on the action variables. The phase space A n is trivially foliated by<br />

invariant Lagrangian tori parametrized by the action variables: indeed the equations of<br />

motion of H read<br />
$$\dot{\theta} = \nabla h(I), \quad \dot{I} = 0,$$<br />
so for each $I_0 \in \mathbb{R}^n$, the torus $T_0 = \mathbb{T}^n \times \{I_0\}$ is Lagrangian and invariant under the<br />
Hamiltonian flow. The latter is complete and restricts to a linear flow on $T_0$ with<br />
frequency $\omega_0 = \nabla h(I_0)$, that is<br />
$$\Phi^H_t(\theta_0, I_0) = (\theta_0 + t\omega_0 \ [\mathbb{Z}^n], I_0) \in T_0, \quad t \in \mathbb{R}.$$<br />

The action variables remain fixed for all times, and all solutions are quasi-periodic.<br />
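The explicit formula for the unperturbed flow translates directly into code. The following sketch takes the torus to be Rⁿ/Zⁿ, so angles are reduced modulo 1; the function names and the choice h(I) = |I|²/2 (Euclidean norm, so that ∇h(I) = I) are illustrative assumptions.

```python
from fractions import Fraction


def integrable_flow(theta0, I0, t, grad_h):
    """Time-t map of H = h(I): the angles rotate with frequency grad_h(I0)
    on the torus R^n / Z^n, and the actions stay fixed."""
    omega0 = grad_h(I0)
    theta_t = tuple((angle + t * freq) % 1 for angle, freq in zip(theta0, omega0))
    return theta_t, tuple(I0)


def grad_kinetic(I):
    # Assumed example: h(I) = |I|^2 / 2, hence grad h(I) = I.
    return tuple(I)


if __name__ == "__main__":
    theta0 = (Fraction(0), Fraction(1, 2))
    I0 = (Fraction(1, 3), Fraction(1, 4))
    for t in (1, 6, 12):
        print(t, integrable_flow(theta0, I0, t, grad_kinetic))
```

With rational frequencies, as in the usage lines, the orbit is periodic; a frequency vector with rationally independent components would instead give a dense quasi-periodic orbit on the torus.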

7.1.0.2. Now if we let f be non zero but small, that is<br />

|f| = ε 0, which is called the diffusion time.



7.1.0.3. However, showing that such drifting solutions exist in some large class of near-integrable<br />
Hamiltonian systems is a very difficult task. Indeed, on the one hand KAM<br />
theory ([Kol54], [Mos62] and [Arn63a]) gives, under some non-degeneracy condition on<br />
h and provided ε is small enough, the existence of a set of positive measure of quasi-periodic<br />
solutions. These solutions are perpetually stable, in the sense that the variation<br />
of their action components is of order at most $\sqrt{\varepsilon}$ for all time. Moreover, for n = 2<br />
and if the integrable Hamiltonian is isoenergetically non degenerate ([Arn63a]), one can<br />
even show that all solutions are stable for all time. On the other hand, for n ≥ 3 we<br />
know from Nekhoroshev theory ([Nek77], [Nek79]) that all solutions are stable, not for<br />
all time, but for an interval of time which is exponentially long with respect to $\varepsilon^{-1}$,<br />
provided that h meets some quantitative transversality condition and the perturbation<br />
is sufficiently small.<br />

But even though the existence of drifting orbits for near-integrable Hamiltonian systems<br />

seems to be quite exceptional, Arnold conjectured that this topological instability<br />

is in fact a "typical" phenomenon (see [Arn63a] or [AKN06]). We refer to [Loc99] for a<br />

very lucid and enlightening discussion on Arnold mechanism (and on Arnold diffusion in<br />

general). Under some "generic" conditions and for n = 3, Mather ([Mat04]) announced<br />

that this conjecture holds if the unperturbed Hamiltonian is convex (the convexity is<br />

required by the use of his variational methods). This is so far the best result in this<br />

direction; however, his proof is highly technical and still incomplete. Another connected<br />
question, which is even harder, is to find orbits that densely fill the energy level (this<br />
is the so-called quasi-ergodic hypothesis): for instance, in [Her98] it is asked whether<br />
there exists a Hamiltonian system, $C^\infty$-smooth and for r ≥ 2, $C^r$-close to the integrable<br />
Hamiltonian $h(I) = \frac{1}{2}|I|^2$, which has a dense orbit on an energy level (progress towards<br />
this question has been made recently in [KLS10] and [KZZ09]).<br />

7.1.0.4. Yet there is another case, which is much simpler and hence more studied<br />

and understood, in which the unperturbed Hamiltonian is completely integrable (in<br />

the sense of symplectic geometry) but possesses some "hyperbolicity". The prototype<br />

for such a system is given by an uncoupled product of rotators with a pendulum, as<br />

in Arnold’s example. This case is usually referred to as a priori unstable (following<br />

the terminology introduced in [CG94]), in contrast with the a priori stable case where<br />

the unperturbed system is integrable in angle-action coordinates and therefore has no<br />

hyperbolic feature (one can also say that a priori stable systems are "fully elliptic").<br />

The interest in studying such a priori unstable systems is in fact twofold. First, one can<br />

use the original strategy of Arnold to find unstable orbits, by constructing and following<br />

a "transition chain" made of sets with suitable hyperbolic properties (or minimizing<br />

properties when using variational arguments). The second interest lies in the fact that,<br />

due to normal form theory, the study of a priori stable systems in a neighbourhood<br />

of a simple resonance reduces to the a priori unstable case (see [LMS03] for example).<br />

Results on the topological instability of a priori unstable systems under fairly general<br />

conditions can be found in [DdlLS06] and [Tre04] by means of geometrical methods<br />

and in [CY04], [Ber08] and [CY09] by variational methods. However there are many<br />

difficulties in using these results to tackle Arnold’s conjecture which concerns a priori<br />

stable systems.<br />

7.1.0.5. Another feature of a priori unstable systems is that one avoids all exponentially<br />

small phenomena which are typical for analytic (or Gevrey) a priori stable



systems, such as the lower bound on the diffusion time imposed by Nekhoroshev’s theorem.
Indeed, if $\mu$ denotes the small parameter in the a priori unstable case, then it was
realized by Lochak that the time of diffusion should be polynomial, and then Bernard
([Ber96]), adapting Bessi’s work ([Bes96]), showed that one can obtain an upper bound
on the time of diffusion which is of order $\mu^{-2}$. In [Loc99], it was conjectured that the
optimal time of diffusion should be $\mu^{-1}\ln\mu^{-1}$, and this was proved by Berti, Bolle and
Biasco [BBB03], still using Bessi’s ideas, and also by Treschev ([Tre04]) and Cresson
and Guillet ([CG03]), using different geometric methods.

The goal of this chapter is to give yet another method to construct an example<br />

with this optimal time of diffusion. Our approach is dynamical: it uses the notion of<br />

polysystem introduced by Moeckel ([Moe02]) in the context of Arnold diffusion, and<br />

which corresponds to the random iteration of a family of maps. More precisely, our<br />

example uses an explicit construction of a polysystem, which is similar to the abstract<br />

mechanism introduced by Marco in [Mar08].<br />

7.2 Main results<br />

7.2.0.1. For $n \geq 1$, let us recall that a function $f \in C^\infty(\mathbb{A}^n)$ is α-Gevrey, for $\alpha \geq 1$, if
for any compact subset $K \subseteq \mathbb{A}^n$ there exist two positive constants $A_K$, $B_K$ such that
$$|\partial^k f|_{C^0(K)} \leq A_K B_K^{|k|} (k!)^\alpha, \quad k \in \mathbb{N}^{2n},$$
with the standard multi-index notation. We shall denote by $G^\alpha(\mathbb{A}^n)$ the space of such
functions. For $\alpha = 1$, these are exactly the analytic functions, but for $\alpha > 1$, the space
$G^\alpha(\mathbb{A}^n)$ contains non-zero compactly supported functions (we refer to [MS02], Appendix
A, for more details). Our perturbation will be α-Gevrey for $\alpha > 1$.

Throughout this chapter, we will consider a Hamiltonian system $h$ with two degrees of freedom
defined by
$$h(\theta_1, \theta_2, I_1, I_2) = \frac{1}{2}(I_1^2 + I_2^2) + \cos 2\pi\theta_1, \quad (\theta_1, \theta_2, I_1, I_2) \in \mathbb{A}^2,$$
which is the direct product of a pendulum $P(\theta_1, I_1) = \frac{1}{2}I_1^2 + \cos 2\pi\theta_1$ on the first factor
and a standard rotator $S(\theta_2, I_2) = \frac{1}{2}I_2^2$ on the second factor. It is the simplest example
of an a priori unstable integrable Hamiltonian system.
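As a purely numerical illustration (not part of the argument), the time-one map of the flow of $h$ can be approximated by integrating Hamilton's equations. The sketch below uses a classical fourth-order Runge-Kutta scheme; the function names, the step count and the choice of integrator are assumptions made for this example only.

```python
import math

def h(theta1, theta2, I1, I2):
    """Unperturbed Hamiltonian: pendulum on the first factor, rotator on the second."""
    return 0.5 * (I1 ** 2 + I2 ** 2) + math.cos(2 * math.pi * theta1)

def vector_field(state):
    """Hamilton's equations: theta' = dh/dI and I' = -dh/dtheta."""
    theta1, theta2, I1, I2 = state
    return (I1, I2, 2 * math.pi * math.sin(2 * math.pi * theta1), 0.0)

def rk4_step(state, dt):
    """One classical Runge-Kutta step of size dt."""
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = vector_field(state)
    k2 = vector_field(shifted(state, k1, dt / 2))
    k3 = vector_field(shifted(state, k2, dt / 2))
    k4 = vector_field(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def time_one_map(state, steps=2000):
    """Numerical approximation of the time-one map (angles kept in the universal cover)."""
    dt = 1.0 / steps
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state
```

Along a computed orbit the energy $h$ should be preserved up to the integration error, the action $I_2$ stays constant and the angle $\theta_2$ advances by $I_2$, which reflects the product structure pendulum times rotator.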

Below we shall state two versions of our theorem, one for Hamiltonian diffeomorphisms
(Theorem 7.2) and one for Hamiltonian flows (Theorem 7.1). For the discrete
case, our unperturbed diffeomorphism is the time-one map of the Hamiltonian flow generated
by $h$, $\Phi^h : \mathbb{A}^2 \to \mathbb{A}^2$. In the continuous case, our unperturbed system is the a
priori unstable Hamiltonian with three degrees of freedom defined by
$$\hat{h}(\theta, I) = h(\theta_1, \theta_2, I_1, I_2) + I_3 = \frac{1}{2}(I_1^2 + I_2^2) + I_3 + \cos 2\pi\theta_1.$$

7.2.0.2. Let us state our main result.



Theorem 7.1. For $\alpha > 1$, there exist positive constants $C$, $\mu_0$ and a function $f \in G^\alpha(\mathbb{A}^3)$ such that if $0 < \mu \leq \mu_0$, the Hamiltonian system
$$H(\theta, I) = \hat{h}(\theta, I) + \mu f(\theta, I), \quad (\theta, I) \in \mathbb{A}^3,$$
has an orbit $(\theta(t), I(t))_{t \in \mathbb{R}}$ such that
$$\lim_{t \to \pm\infty} I_2(t) = \pm\infty,$$
with the estimates
$$|I_2(\tau) - I_2(0)| \geq 1, \quad \tau \leq C\mu^{-1}\ln\mu^{-1}.$$

Our unstable orbit is bi-asymptotic to infinity, and the estimates show that the
time of diffusion is of order $\mu^{-1}\ln\mu^{-1}$. As we have already said, similar examples have
been constructed by means of very different methods, in [BBB03], [Tre04] and [CG03].
Moreover, in [BBB03] the authors proved that for such a system the following stability
estimates hold true in the analytic case: given any $\rho > 0$, there exist positive constants
$\mu_0'$ and $c$ such that for $\mu \leq \mu_0'$,
$$|I(t) - I(0)| \leq \rho, \quad |t| \leq c\mu^{-1}\ln\mu^{-1}.$$
Therefore this time of diffusion is optimal within the analytic category, but it should
also hold with less regularity, for instance in the Gevrey case ([MS02]) but also in
the $C^k$ case for $k$ large enough (see the fifth chapter), as the time of stability is only
polynomial, and in fact almost linear. However, in [MS02] and in the fifth chapter,
only the quasi-convex case has been studied, and to obtain a stability result similar to
[BBB03] one also needs the more general steep case.

Let us also add that even though our perturbation is only Gevrey regular, using the<br />

techniques developed in [LM05] one can obtain Theorem 7.1 in the analytic case, but<br />

with considerably more work.<br />

7.2.0.3. In fact we shall not prove Theorem 7.1 directly, but the following equivalent
version in terms of diffeomorphisms. Given a function $H$ on $\mathbb{A}^n$, we shall
denote by $\Phi^H_t : \mathbb{A}^n \to \mathbb{A}^n$ the time-$t$ map of its Hamiltonian flow and by $\Phi^H = \Phi^H_1$ the
time-one map.

Theorem 7.2. For $\alpha > 1$, there exist positive constants $C$, $\mu_0$ and a function $f \in G^\alpha(\mathbb{A}^2)$ such that if $0 < \mu \leq \mu_0$, the diffeomorphism
$$\Phi^h \circ \Phi^{\mu f} : \mathbb{A}^2 \to \mathbb{A}^2$$
has an orbit $(\theta^k, I^k)_{k \in \mathbb{Z}}$ such that
$$\lim_{k \to \pm\infty} I_2^k = \pm\infty,$$
with the estimates
$$|I_2^N - I_2^0| \geq 1, \quad N \leq C\mu^{-1}\ln\mu^{-1}.$$



Theorem 7.1 follows easily from Theorem 7.2 by a classical suspension argument (see<br />

[MS02] for a simple method in the Gevrey case using generating functions), so we shall<br />

not repeat the details. Of course, the constants $\mu_0$, $C$ and the function $f$ are not the
same in both theorems, but we have kept the same notation for simplicity. Moreover, in
the sequel we will not give explicit values for these constants; in fact, sometimes it will
be more convenient to use asymptotic notation: given $u(\mu)$ and $v(\mu)$ defined for $\mu \geq 0$,
we shall write $u(\mu) = O(v(\mu))$ if there exist positive constants $\mu'$ and $c'$ independent of
$\mu$ such that the inequality $u(\mu) \leq c' v(\mu)$ holds true for $0 \leq \mu \leq \mu'$.

7.2.0.4. The plan of this chapter is the following. In section 7.3, we will describe<br />

the perturbation. We will show that the perturbed system has an invariant normally<br />

hyperbolic manifold, the stable and unstable manifolds of which intersect transversely<br />

along a homoclinic annulus.<br />

Then, this will be used in section 7.4 to show the following generalisation of the
Birkhoff-Smale theorem to the normally hyperbolic case: near the homoclinic manifold,
there exists an invariant set on which a suitable iterate of the system is conjugate to a
symbolic dynamic, more precisely to a skew-product on the annulus $\mathbb{A}$ over a Bernoulli
shift. Let us give a precise definition.

Given an alphabet $A \subseteq \mathbb{N}$, we let $\Sigma_A = A^{\mathbb{Z}}$ be the Cantor set of bi-infinite sequences
of elements in $A$. We will denote by $\sigma = \sigma_A$ the left Bernoulli shift on $\Sigma_A$, that is if
$\bar{n} = (n_k)_{k \in \mathbb{Z}}$ belongs to $\Sigma_A$, then $\sigma(\bar{n}) = (n'_k)_{k \in \mathbb{Z}}$ is defined by
$$n'_k = n_{k+1}, \quad k \in \mathbb{Z}.$$

Definition 7.3. Let $M$ be a manifold. A skew-product on $M$ over $\sigma$ is a map $G : \Sigma_A \times M \to \Sigma_A \times M$ of the form
$$G(\bar{n}, x) = (\sigma(\bar{n}), F_{\bar{n}}(x)), \quad x \in M,$$
where $F_{\bar{n}} : M \to M$ is an arbitrary map, for $\bar{n} \in \Sigma_A$.

However, we are not able to prove the existence of our orbit by directly using this symbolic
dynamic. Hence in section 7.5, we will first construct this orbit for a simplified
model called a polysystem, which appears as a random iteration of "standard" maps of
the annulus, and which is close to our skew-product. Here is the definition.

Definition 7.4. A polysystem on $M$ is a skew-product over a Bernoulli shift of the form
$$F_{\bar{n}}(x) = f_{n_0}(x), \quad \bar{n} \in \Sigma_A, \ x \in M,$$
with $f_n : M \to M$ for $n \in A$.
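Definitions 7.3 and 7.4 can be made concrete with a small sketch. Here a bi-infinite sequence is modelled as a Python function $k \mapsto n_k$; this representation and all the names below are assumptions chosen for illustration, not notation from the text.

```python
from typing import Callable, Dict, Tuple

Sequence = Callable[[int], int]  # a bi-infinite sequence, given as k -> n_k

def shift(seq: Sequence) -> Sequence:
    """Left Bernoulli shift: sigma(n)_k = n_{k+1}."""
    return lambda k: seq(k + 1)

def skew_product(fibre_map: Callable[[Sequence, float], float]):
    """Build G(n, x) = (sigma(n), F_n(x)) from a family of fibre maps F_n."""
    def G(seq: Sequence, x: float) -> Tuple[Sequence, float]:
        return shift(seq), fibre_map(seq, x)
    return G

def polysystem(maps: Dict[int, Callable[[float], float]]):
    """Polysystem: the fibre map F_n depends only on the symbol n_0."""
    return skew_product(lambda seq, x: maps[seq(0)](x))
```

For instance, with the two-letter alphabet {2, 3}, the maps `{2: lambda x: x + 2, 3: lambda x: 3 * x}` and the alternating sequence `lambda k: 2 if k % 2 == 0 else 3`, one step applies the map labelled by $n_0 = 2$ and the next step the map labelled by $n_1 = 3$, which is the "random iteration" described above.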

Then in section 7.6, the orbit for the polysystem will be considered as a pseudo-orbit<br />

for the skew-product and we will conclude using shadowing arguments. Finally we have<br />

gathered in an appendix some estimates on the so-called time-energy coordinates for<br />

the simple pendulum that are used in section 7.4.<br />

7.3 Construction of the perturbation<br />

This section is devoted to a description of our system, which is similar to the one<br />

introduced in [Mar05]. We will first consider the integrable case, and then explain the
construction of the perturbation.


Figure 1: Integrable diffeomorphism $F_0$ (the pendulum map $\Phi^P$ in the $(\theta_1, I_1)$ plane, with fixed point $O$ and separatrix $S$, times the twist map $\Phi^S$ in the $(\theta_2, I_2)$ plane).

7.3.1 Integrable system

Our integrable diffeomorphism $F_0 : \mathbb{A}^2 \to \mathbb{A}^2$ is the time-one map of the Hamiltonian
flow generated by $h$, that is
$$F_0 = \Phi^h = \Phi^{\frac{1}{2}(I_1^2 + I_2^2) + \cos 2\pi\theta_1},$$
which is the product of the pendulum map $\Phi^P = \Phi^{\frac{1}{2}I_1^2 + \cos 2\pi\theta_1}$ and the integrable twist
map $\Phi^S = \Phi^{\frac{1}{2}I_2^2}$ (see figure 1).

It is an "a priori unstable" map in the sense that it possesses an invariant normally
hyperbolic annulus. Indeed, let $O = (0, 0)$ be the hyperbolic fixed point of the pendulum
map $\Phi^P$. Its stable and unstable manifolds obviously coincide, and if $S$ is the upper part
of the separatrix, then
$$W^\pm(O, \Phi^P) = S = \{(\theta_1, I_1) \in \mathbb{A} \mid I_1 = 2\sin\pi\theta_1\}.$$
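The formula for $S$ can be checked directly: the fixed point $O$ has pendulum energy $P(0, 0) = \cos 0 = 1$, and since $\cos 2\pi\theta_1 = 1 - 2\sin^2\pi\theta_1$, every point with $I_1 = 2\sin\pi\theta_1$ satisfies $P = 2\sin^2\pi\theta_1 + 1 - 2\sin^2\pi\theta_1 = 1$. The short sketch below (function names are illustrative) confirms this numerically on a grid.

```python
import math

def pendulum_energy(theta1, I1):
    """P(theta1, I1) = I1^2 / 2 + cos(2 pi theta1)."""
    return 0.5 * I1 ** 2 + math.cos(2 * math.pi * theta1)

def separatrix(theta1):
    """Upper branch of the separatrix: I1 = 2 sin(pi theta1)."""
    return 2 * math.sin(math.pi * theta1)

def max_energy_defect(samples=1000):
    """Largest deviation of P from the energy of O along the curve."""
    return max(abs(pendulum_energy(k / samples, separatrix(k / samples)) - 1.0)
               for k in range(samples + 1))
```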

Due to the product structure of the map $F_0$, it is easy to see that the annulus $\mathcal{A} = O \times \mathbb{A}$
is invariant by $F_0$, and one can check that it is symplectic for the canonical structure of
$\mathbb{A}^2$. But the most important feature is that this annulus is normally hyperbolic, more
precisely it is $r$-normally hyperbolic for any $r \in \mathbb{N}$, in the sense of [HPS77]. To see this,
just decompose the tangent bundle of $\mathbb{A}^2$ along $\mathcal{A}$ as
$$T_{\mathcal{A}}\mathbb{A}^2 = T\mathcal{A} \oplus E^s \oplus E^u,$$
where $E^s$ (resp. $E^u$) is the one-dimensional contracting (resp. expanding) direction
associated to the hyperbolic fixed point. Then note that the restriction of $F_0$ to $\mathcal{A}$
coincides with the integrable twist map $\Phi^S$, hence there is no exponential contraction or expansion
in the tangent direction $T\mathcal{A}$.

The 3-dimensional stable and unstable manifolds of $\mathcal{A}$ also coincide, and are given
by the product
$$W^\pm(\mathcal{A}, F_0) = S \times \mathbb{A}.$$




7.3.2 Perturbed system<br />

Now we will describe the geometric properties of our perturbed system. It will be of the
form
$$F_\mu = \Phi^{\mu f} \circ F_0,$$
where $\mu > 0$ is the small parameter, and $f : \mathbb{A}^2 \to \mathbb{R}$ a function of the form
$$f(\theta_1, \theta_2, I_1, I_2) = \chi(\theta_1) f_1(\theta_1, I_1) f_2(\theta_2)$$
to be defined precisely below.

7.3.2.1. First $\chi : \mathbb{T} \to \mathbb{R}$ will be a bump function. Of course such a function can be
chosen to be α-Gevrey for $\alpha > 1$ but not analytic. More precisely, let us choose $\bar{\theta} \in \mathbb{T}$
such that
$$(\bar{\theta}, I) \in S \implies \Phi^P(\bar{\theta}, I) = (1 - \bar{\theta}, I).$$
Since the separatrix $S$ is symmetric about the section $\{\theta_1 = 1/2\}$, $\bar{\theta}$ is well-defined.
Then choose any $\tilde{\theta} \in \,]\bar{\theta}, 1/2[$, and a function $\chi \in G^\alpha(\mathbb{T})$, $\alpha > 1$, such that
$$\chi(\theta) = \begin{cases} 1 & \text{if } \theta \in [\tilde{\theta}, 1 - \tilde{\theta}], \\ 0 & \text{if } \theta \notin [\bar{\theta}, 1 - \bar{\theta}]. \end{cases}$$
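To make this choice concrete, here is a hedged sketch. On the separatrix one has $\dot\theta_1 = I_1 = 2\sin\pi\theta_1$, which integrates to $\tan(\pi\theta_1/2) = e^{2\pi t}$ when $\theta_1(0) = 1/2$; taking $t = -1/2$ gives the closed form $\bar\theta = \frac{2}{\pi}\arctan(e^{-\pi})$. This closed form is our own computation from the equations above, not a formula stated in the text. The bump function is built from $x \mapsto e^{-1/x}$, a classical non-analytic function of Gevrey order 2; that the resulting $\chi$ is Gevrey is assumed here rather than proved, and this construction does not cover every $\alpha > 1$. The value of $\tilde\theta$ is an arbitrary placeholder.

```python
import math

def theta_bar():
    """Angle on the separatrix sent to its mirror image by the time-one map."""
    return (2 / math.pi) * math.atan(math.exp(-math.pi))

def flat(x):
    """exp(-1/x) for x > 0 and 0 otherwise: smooth, flat at 0, not analytic."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def smooth_step(x):
    """Smooth step: 0 for x <= 0, 1 for x >= 1, strictly between otherwise."""
    return flat(x) / (flat(x) + flat(1.0 - x))

def make_chi(t_bar, t_tilde):
    """Bump on the circle: 1 on [t_tilde, 1 - t_tilde], 0 outside [t_bar, 1 - t_bar]."""
    def chi(theta):
        theta = theta % 1.0
        d = min(theta, 1.0 - theta)  # chi is symmetric about theta = 1/2
        return smooth_step((d - t_bar) / (t_tilde - t_bar))
    return chi
```

With these definitions `make_chi(theta_bar(), 0.25)` vanishes near $\theta = 0$, equals 1 on $[0.25, 0.75]$, and interpolates smoothly in between.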

Since χ is identically zero at 0, the perturbation Φ µf is the identity on the normally<br />

hyperbolic annulus A, and therefore the latter remains invariant and normally hyperbolic<br />

for the map Fµ. Moreover, as χ vanishes in a neighbourhood of 0, some pieces of the<br />

stable and unstable manifolds for Fµ will coincide with the ones of F0.<br />

7.3.2.2. Now to define the function $f_1 : \mathbb{A} \to \mathbb{R}$, we will use time-energy coordinates
on the first factor, so we introduce the symplectic diffeomorphism
$$\Psi : E \times \mathbb{A} \to E_* \times \mathbb{A}, \quad (\theta_1, I_1, \theta_2, I_2) \mapsto (\tau, e, \theta_2, I_2),$$
where
$$E = \{(\theta_1, I_1) \in \mathbb{A} \mid 0 < \theta_1 < 1, \ I_1 > 0\}$$
and
$$E_* = \{(\tau, e) \in \mathbb{R}^2 \mid e > -2, \ |\tau| < \tfrac{1}{2}T(e)\}. \tag{53}$$
We refer to Appendix 7.A for more details on those coordinates. The function $f_1$ is
simply defined by
$$f_1(\tau, e) = 1 - \tfrac{1}{2}\tau^2, \quad (\tau, e) \in E_*,$$
and it will create a transverse intersection between the stable and unstable manifolds
$W^+(\mathcal{A}, F_\mu)$ and $W^-(\mathcal{A}, F_\mu)$.

7.3.2.3. Finally the function $f_2$ will be defined by
$$f_2(\theta_2) = -\pi^{-1}(2 + \sin 2\pi\theta_2), \quad \theta_2 \in \mathbb{T}.$$



The fact that $f_2$ is nowhere zero on $\mathbb{T}$ (for any $\theta_2 \in \mathbb{T}$, $|f_2(\theta_2)| \geq \pi^{-1}$) will imply that
$W^+(\mathcal{A}, F_\mu)$ and $W^-(\mathcal{A}, F_\mu)$ intersect transversely along an annulus, and the explicit
form of $f_2$ will be responsible for the non-trivial dynamics along this homoclinic annulus.

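7.3.2.4. Let us define the domains
$$D = ([\bar{\theta}, 1 - \bar{\theta}] \times \mathbb{R}^+_*) \times \mathbb{A} \subseteq \mathbb{A}^2, \quad D_* = \Psi(D) \subseteq E_* \times \mathbb{A},$$
on which the perturbation (either in the original coordinates or in time-energy coordinates)
is non-zero, and
$$\tilde{D} = ([\tilde{\theta}, 1 - \tilde{\theta}] \times \mathbb{R}^+_*) \times \mathbb{A} \subseteq \mathbb{A}^2, \quad \tilde{D}_* = \Psi(\tilde{D}) \subseteq E_* \times \mathbb{A}.$$
For $(\tau, e, \theta_2, I_2) \in \tilde{D}_*$, our perturbation can be written explicitly as
$$\begin{aligned} \Phi^{\mu f}(\tau, e, \theta_2, I_2) &= (\tau, \ e - \mu f_1'(\tau) f_2(\theta_2), \ \theta_2, \ I_2 - \mu f_1(\tau) f_2'(\theta_2)) \\ &= (\tau, \ e + \mu\tau f_2(\theta_2), \ \theta_2, \ I_2 + 2\mu f_1(\tau)\cos(2\pi\theta_2)), \end{aligned} \tag{54}$$
and otherwise the diffeomorphism $\Phi^{\mu f}$ is the identity.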
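The second line of (54) follows from $f_1'(\tau) = -\tau$ and $f_2'(\theta_2) = -2\cos 2\pi\theta_2$. The following sketch implements the explicit map in time-energy coordinates, in the region where $\chi = 1$; the function names are illustrative, and the cut-off $\chi$ is deliberately left out.

```python
import math

def f1(tau):
    """f1(tau) = 1 - tau^2 / 2, so that f1'(tau) = -tau."""
    return 1.0 - 0.5 * tau ** 2

def f2(theta2):
    """f2(theta2) = -(2 + sin(2 pi theta2)) / pi, so that f2' = -2 cos(2 pi theta2)."""
    return -(2.0 + math.sin(2 * math.pi * theta2)) / math.pi

def perturbation(mu, tau, e, theta2, I2):
    """Second line of formula (54): tau and theta2 are fixed, e and I2 are kicked."""
    return (tau,
            e + mu * tau * f2(theta2),
            theta2,
            I2 + 2 * mu * f1(tau) * math.cos(2 * math.pi * theta2))
```

In particular a point with $e = 0$ is sent to a point with $e = \mu\tau f_2(\theta_2)$, which vanishes only for $\tau = 0$ because $f_2$ takes values in $[-3/\pi, -1/\pi]$.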

7.3.2.5. In the sequel we shall need the following property.

Proposition 7.1. The immersed manifolds $W^+(\mathcal{A}, F_\mu)$ and $W^-(\mathcal{A}, F_\mu)$ intersect transversely
along the annulus
$$I_\mu = \{(\tau, e, \theta_2, I_2) \in E_* \times \mathbb{A} \mid \tau = e = 0\}.$$
Let us remark that this annulus is also given by
$$I_\mu = \{(\theta_1, I_1, \theta_2, I_2) \in \mathbb{A}^2 \mid \theta_1 = 1/2, \ I_1 = 2\}$$
in the original coordinates.

Proof. By definition of $\tilde{D}$, one easily sees that the sets
$$W^+(\mathcal{A}, F_0) \cap F_0(\tilde{D}), \quad W^-(\mathcal{A}, F_0) \cap F_0^{-1}(\tilde{D}),$$
are disjoint from $D$. Since $F_\mu$ and $F_0$ coincide outside $D$, we can define pieces of stable
and unstable manifolds by
$$W^+_{\bar{\theta}}(\mathcal{A}, F_\mu) = W^+(\mathcal{A}, F_0) \cap F_0(\tilde{D}) \subseteq W^+(\mathcal{A}, F_\mu)$$
and
$$W^-_{\bar{\theta}}(\mathcal{A}, F_\mu) = W^-(\mathcal{A}, F_0) \cap F_0^{-1}(\tilde{D}) \subseteq W^-(\mathcal{A}, F_\mu).$$
Therefore
$$F_\mu^{-1}(W^+_{\bar{\theta}}(\mathcal{A}, F_\mu)) \subseteq W^+(\mathcal{A}, F_\mu), \quad F_\mu(W^-_{\bar{\theta}}(\mathcal{A}, F_\mu)) \subseteq W^-(\mathcal{A}, F_\mu). \tag{55}$$
Then on the one hand,
$$\begin{aligned} F_\mu^{-1}(W^+_{\bar{\theta}}(\mathcal{A}, F_\mu)) &= F_0^{-1} \circ \Phi^{-\mu f}(W^+_{\bar{\theta}}(\mathcal{A}, F_\mu)) \\ &= F_0^{-1}(W^+_{\bar{\theta}}(\mathcal{A}, F_\mu)) \\ &= W^+(\mathcal{A}, F_0) \cap \tilde{D}, \end{aligned}$$
and on the other hand, since $F_\mu = \Phi^{\mu f} \circ F_0$,
$$\begin{aligned} F_\mu(W^-_{\bar{\theta}}(\mathcal{A}, F_\mu)) &= \Phi^{\mu f} \circ F_0(W^-_{\bar{\theta}}(\mathcal{A}, F_\mu)) \\ &= \Phi^{\mu f}(W^-(\mathcal{A}, F_0) \cap \tilde{D}). \end{aligned}$$
Now in time-energy coordinates, one simply has
$$W^\pm(\mathcal{A}, F_0) = \{(\tau, e, \theta_2, I_2) \in E_* \times \mathbb{A} \mid e = 0\},$$
and therefore
$$W^+(\mathcal{A}, F_0) \cap \tilde{D} = \{(\tau, e, \theta_2, I_2) \in \tilde{D}_* \mid e = 0\},$$
while, using the expression of the perturbation (54),
$$\Phi^{\mu f}(W^-(\mathcal{A}, F_0) \cap \tilde{D}) = \{(\tau, e, \theta_2, I_2) \in \tilde{D}_* \mid e = \mu\tau f_2(\theta_2)\}.$$
Since $f_2$ is nowhere zero, one easily sees that the manifolds $F_\mu^{-1}(W^+_{\bar{\theta}}(\mathcal{A}, F_\mu))$ and
$F_\mu(W^-_{\bar{\theta}}(\mathcal{A}, F_\mu))$ intersect transversely along the annulus
$$I_\mu = \{(\tau, e, \theta_2, I_2) \in \tilde{D}_* \mid \tau = e = 0\},$$
and the conclusion follows by (55).

7.4 Construction of a symbolic dynamic<br />

7.4.0.1. In this section, we will take advantage of the fact that our diffeomorphism<br />

Fµ : A 2 → A 2<br />

possesses an invariant normally hyperbolic annulus A, whose stable and unstable manifolds<br />

intersect transversely along a homoclinic annulus Iµ.<br />

In the case where the normally hyperbolic manifold is a point, under such a transverse<br />

homoclinic intersection it is well-known that chaotic dynamics arise: the system has an<br />

invariant Cantor set on which a suitable iterate is conjugated to a shift map (this is the<br />

Horseshoe theorem, due to Birkhoff, Smale and Alexeiev). In our more general situation,<br />

the symbolic dynamic is more complicated.<br />

7.4.0.2. Recall from Definition 7.3 that a skew-product (over $\sigma$) is completely defined
by the family of maps $F_{\bar{n}} : M \to M$, $\bar{n} \in \Sigma_A$, and the skew-product will be denoted by
$[[F_{\bar{n}}]]_{\bar{n} \in \Sigma_A}$. Then one can easily see that a sequence $(n_k, x_k)_{k \in \mathbb{Z}}$ in $A \times M$ gives rise to
an orbit $(\sigma^k(\bar{n}), x_k)_{k \in \mathbb{Z}}$ in $\Sigma_A \times M$, where $\bar{n} = (n_k)_{k \in \mathbb{Z}}$, for the skew-product $[[F_{\bar{n}}]]_{\bar{n} \in \Sigma_A}$
if and only if
$$F_{\sigma^k(\bar{n})}(x_k) = x_{k+1}, \quad k \in \mathbb{Z}.$$

In this section, we will need our alphabet $A$ to contain integers of order $\ln\mu^{-1}$, but
for subsequent arguments, we will also need it to contain integers $n \in \mathbb{N}$ as large as
$\mu^{-\frac{1}{2}}\ln\mu^{-1}$, so we may already fix
$$A = A_\mu = \{[\ln\mu^{-1}], \ldots, 2[\mu^{-\frac{1}{2}}\ln\mu^{-1}]\}.$$



For simplicity, we shall get rid of the subscript µ.<br />
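To get a feeling for the size of this alphabet, the sketch below lists its endpoints for a given $\mu$. It assumes that $[\,\cdot\,]$ denotes the integer part and that the upper endpoint is $2[\mu^{-1/2}\ln\mu^{-1}]$; the function name is illustrative.

```python
import math

def alphabet(mu):
    """Integers from [ln(1/mu)] up to 2 [mu^(-1/2) ln(1/mu)], both included."""
    log_inv = math.log(1.0 / mu)
    lowest = math.floor(log_inv)
    highest = 2 * math.floor(mu ** -0.5 * log_inv)
    return range(lowest, highest + 1)
```

For $\mu = 10^{-4}$ one has $\ln\mu^{-1} \approx 9.21$ and $\mu^{-1/2} = 100$, so the alphabet runs from 9 to 1842: the shortest admissible return time is logarithmic in $\mu^{-1}$, while the longest is slightly larger than $\mu^{-1/2}$.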

7.4.0.3. Let $O_\mu \subseteq D_*$ be a neighbourhood (in time-energy coordinates) of the homoclinic
annulus $I_\mu$. For $x \in O_\mu$, let us set
$$n^x_0 = \inf\{n \in \mathbb{N}^* \mid F^n_\mu(x) \in O_\mu\} \in \mathbb{N}^* \cup \{+\infty\}$$
and assuming $n^x_0 < +\infty$, we can define inductively
$$n^x_k = \inf\left\{n \in \mathbb{N}^* \mid F^n_\mu\left(F^{n^x_{k-1}}_\mu(x)\right) \in O_\mu\right\} \in \mathbb{N}^* \cup \{+\infty\},$$
for $k \geq 1$ provided $n^x_{k-1} < +\infty$. Similarly we can define
$$n^x_{-1} = \inf\{n \in \mathbb{N}^* \mid F^{-n}_\mu(x) \in O_\mu\} \in \mathbb{N}^* \cup \{+\infty\},$$
and by induction
$$n^x_{-k} = \inf\left\{n \in \mathbb{N}^* \mid F^{-n}_\mu\left(F^{-n^x_{-k+1}}_\mu(x)\right) \in O_\mu\right\} \in \mathbb{N}^* \cup \{+\infty\}$$
for $k \geq 2$, provided $n^x_{-k+1} < +\infty$.

We will show below that there exists $\Lambda_\mu \subseteq O_\mu$ such that for any $x \in \Lambda_\mu$, the
doubly infinite sequence $\bar{n}^x = (n^x_k)_{k \in \mathbb{Z}}$ is a well-defined element of $\Sigma_A$. In fact, $\Lambda_\mu$ is
homeomorphic to a Cantor set of annuli, and we will be interested in the dynamics
restricted to this set. For that, following Moser ([Mos73]) we define the transversal map
$$\tilde{F}_\mu(x) = F^{n^x_0}_\mu(x), \quad x \in O_\mu, \ n^x_0 < +\infty,$$
which is the first return map to the neighbourhood $O_\mu$, and if the sequence $\bar{n}^x$ is well-defined,
then so is $\tilde{F}^n_\mu(x)$ for $n \in \mathbb{Z}$.
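The forward half of the sequence $\bar{n}^x$ can be read as the list of successive first-return times to $O_\mu$ along the orbit of $x$. The sketch below computes such a list for an arbitrary map and an arbitrary set; the toy example in the usage note is ours and has nothing to do with $F_\mu$, and the iteration cap standing in for $+\infty$ is an assumption of the sketch.

```python
def return_times(F, in_O, x, count, max_iter=10_000):
    """Successive first-return times of x to the set O under iteration of F.

    Stops early (returning the times found so far) if some return takes
    more than max_iter iterations, which plays the role of +infinity.
    """
    times = []
    for _ in range(count):
        n = 0
        while True:
            x = F(x)          # one more iterate of the map
            n += 1
            if in_O(x):       # back in O: n is the return time
                break
            if n >= max_iter:
                return times
        times.append(n)
    return times
```

For example, for the rotation $x \mapsto x + 3 \bmod 10$ on $\{0, \ldots, 9\}$ and $O = \{0, 1\}$, the point 0 first returns to $O$ after 7 steps (landing on 1) and then after 3 more steps (landing on 0), and this pattern repeats.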

7.4.0.4. In the sequel, we will consider the discrete topology on the alphabet A, and<br />

the sets A N and ΣA = A Z will be endowed with the product topology, for which they are<br />

compact and metrizable. The goal of this section is to prove the following proposition.<br />

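Proposition 7.2. For $\mu$ small enough, there exist a neighbourhood $O_\mu$, a set $\Lambda_\mu \subseteq O_\mu$
invariant by the transversal map $\tilde{F}_\mu$, a homeomorphism
$$\Upsilon_\mu : \Sigma_A \times \mathbb{A} \to \Lambda_\mu, \quad (\bar{n}, (\theta, I)) \mapsto (\psi_{\bar{n}}(\theta, I), (\theta, I)),$$
with $\psi_{\bar{n}} : \mathbb{A} \to E_*$, where $E_*$ is defined in (53), and a family of maps
$$F_{\bar{n}}(\theta, I) = (\theta + n_0 I, \ I + 2\mu f_1(\phi_{\bar{n}}(\theta, I))\cos 2\pi(\theta + n_0 I)), \quad \bar{n} \in \Sigma_A, \ (\theta, I) \in \mathbb{A},$$
such that for $x = (\tau_x, e_x, \theta_x, I_x) \in \Lambda_\mu$, one has $\bar{n}^x \in \Sigma_A$ and
$$\tilde{F}_\mu \circ \Upsilon_\mu(\bar{n}^x, \theta_x, I_x) = \Upsilon_\mu(\sigma(\bar{n}^x), F_{\bar{n}^x}(\theta_x, I_x)).$$
Moreover the map $\phi_{\bar{n}} : \mathbb{A} \to \mathbb{R}$ is Lipschitz and satisfies
$$|\phi_{\bar{n}}|_{C^0(\mathbb{A})} = O\left(\mu^{2\pi - 1}\right), \quad \mathrm{Lip}(\phi_{\bar{n}}) = O\left(\mu^{2\pi - \frac{1}{2}}\ln\mu^{-1}\right). \tag{56}$$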



The statement of the above proposition seems complicated, but in fact it simply
means that the restriction of $\tilde{F}_\mu$ to the invariant set $\Lambda_\mu$ is conjugated to the skew-product
on $\mathbb{A}$ given by the maps
$$F_{\bar{n}}(\theta, I) = (\theta + n_0 I, \ I + 2\mu f_1(\phi_{\bar{n}}(\theta, I))\cos 2\pi(\theta + n_0 I)), \quad \bar{n} \in \Sigma_A, \ (\theta, I) \in \mathbb{A}.$$
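Each fibre map is a small kick composed with $n_0$ iterates of the integrable twist. The sketch below evaluates one such map with $\phi_{\bar{n}}$ passed in as a plain number; setting it to zero, which is the default here, is a simplifying assumption motivated by the smallness of $\phi_{\bar{n}}$ and yields a map depending on $n_0$ only. The function names are illustrative.

```python
import math

def f1(tau):
    """f1(tau) = 1 - tau^2 / 2."""
    return 1.0 - 0.5 * tau ** 2

def fibre_map(mu, n0, theta, I, phi=0.0):
    """One step (theta, I) -> (theta + n0 I, I + 2 mu f1(phi) cos 2 pi (theta + n0 I)).

    phi stands for the value of phi_n(theta, I); phi = 0 is a simplification.
    """
    theta_new = (theta + n0 * I) % 1.0
    I_new = I + 2 * mu * f1(phi) * math.cos(2 * math.pi * theta_new)
    return theta_new, I_new
```

With $\phi = 0$ the action gains at most $2\mu$ per step, and gains exactly $2\mu$ when $\theta + n_0 I$ is an integer, so a drift of order one in $I$ needs at least of order $\mu^{-1}$ such steps.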

The proof of this result is long and technical. We will first prove an abstract result<br />

in section 7.4.1, which is contained in Proposition 7.4, and then we will apply this<br />

proposition to our example in section 7.4.2.<br />

7.4.1 Symbolic dynamic<br />

7.4.1.1. Here we will use the framework developed by Chaperon ([Cha04], [Cha08]),<br />

which was designed to obtain rather general invariant manifold theorems, in particular<br />

in the normally hyperbolic case.<br />

Consider a complete metric space $X$ and a complete subspace $Y$ of a metric space $F$.
We endow the product space $X \times F$ with the product metric, that is for $(x, y), (x', y') \in X \times F$ we define
$$d((x, y), (x', y')) = \sup\{d(x, x'), d(y, y')\}.$$
Let us set $Z = X \times Y$. Then we consider a Lipschitz map
$$h = (f, g) : Z \to X \times F,$$
and we make the following two assumptions.

(H1) There exists a constant $\rho_0^{-1} > 0$ such that for all $x \in X$, $y, y' \in Y$,
$$d(g(x, y), g(x, y')) \geq \rho_0^{-1} d(y, y').$$
Hence, for all $x \in X$, the map
$$g_x : Y \to g_x(Y)$$
is a bijection.

(H2) For all $x \in X$, we have $Y \subseteq g_x(Y)$. Hence the map
$$g_x^{-1} : Y \to Y$$
is well-defined, and if
$$G : Z \to Y$$
is defined by $G(x, y) = g_x^{-1}(y)$, then $G$ is Lipschitz.

We will say that the map $h$ satisfies hypothesis (H) if it satisfies both hypotheses
(H1) and (H2). Under these assumptions, we will consider three positive constants $\nu_0$, $\sigma_0$
and $\kappa_0$ where $\nu_0$ is the Lipschitz constant of $G$ with respect to $Y$, that is
$$d(G(x, y), G(x, y')) \leq \nu_0 d(y, y'), \quad \forall x \in X, \ \forall y, y' \in Y,$$



which from (H1) is smaller than $\rho_0$, $\sigma_0$ is the Lipschitz constant of $G$ with respect to $X$,
that is
$$d(G(x, y), G(x', y)) \leq \sigma_0 d(x, x'), \quad \forall x, x' \in X, \ \forall y \in Y,$$
and $\kappa_0$ the Lipschitz constant of $f$. Let us now formulate another hypothesis.

(L) Assume that $\sigma_0 + \nu_0\max\{\kappa_0, 1\} < 1$. Hence for any $x \in X$, $y \in Y$ the maps
$$G(\cdot, y) : X \to Y, \quad G(x, \cdot) : Y \to Y,$$
are contractions.

The following result is due to Chaperon ([Cha04]).<br />

Theorem 7.5 (Chaperon). With the previous notations, assume that $Y$ is bounded and
the map
$$h : Z \to X \times F$$
satisfies hypotheses (H) and (L). Then the set
$$\Lambda = \bigcap_{n \in \mathbb{N}} h^{-n}(Z)$$
is the graph of a contracting map $\Phi : X \to Y$, with
$$\mathrm{Lip}(\Phi) \leq \sigma_0(1 - \nu_0\kappa_0)^{-1} < 1.$$
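The final inequality is a direct consequence of (L): since $\nu_0\kappa_0 \leq \nu_0\max\{\kappa_0, 1\} < 1 - \sigma_0$, one has $1 - \nu_0\kappa_0 > \sigma_0$, hence $\sigma_0(1 - \nu_0\kappa_0)^{-1} < 1$. A small helper (illustrative name, not from the text) makes the bookkeeping explicit.

```python
def chaperon_lipschitz_bound(sigma0, nu0, kappa0):
    """Return sigma0 / (1 - nu0 * kappa0) after checking hypothesis (L)."""
    if sigma0 + nu0 * max(kappa0, 1.0) >= 1.0:
        raise ValueError("hypothesis (L) fails: sigma0 + nu0 * max(kappa0, 1) >= 1")
    return sigma0 / (1.0 - nu0 * kappa0)
```

For example, $\sigma_0 = 0.2$, $\nu_0 = 0.3$, $\kappa_0 = 2$ satisfy (L) because $0.2 + 0.6 < 1$, and the bound on $\mathrm{Lip}(\Phi)$ is $0.2 / 0.4 = 0.5$.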

7.4.1.2. The above theorem is concerned with the iteration of a single map h, but<br />

with this formalism we obviously have an analogous result for the iteration of a family<br />

of maps. More precisely, consider a family of maps<br />

$$h_n = (f_n, g_n) : Z \to X \times F, \quad n \in A,$$
where $A$ is the given alphabet. We will assume that each map satisfies hypothesis (H)
and a uniform version of hypothesis (L), that is:

(L') $\sup_{n \in A}\{\sigma_n + \nu_n\max\{\kappa_n, 1\}\} < 1$.

Then one can state the following proposition.

Proposition 7.3. With the previous notations, assume that $Y$ is bounded and that for
any $n \in A$ the maps
$$h_n : Z \to X \times F$$
satisfy hypotheses (H) and (L'). Then for any sequence $\bar{n}_+ = (n_k)_{k \in \mathbb{N}} \in A^{\mathbb{N}}$, the set
$$\Lambda^+(\bar{n}_+) = \bigcap_{k \in \mathbb{N}} (h_{n_k} \circ \cdots \circ h_{n_0})^{-1}(Z)$$
is the graph of a contracting map $\Phi_{\bar{n}_+} : X \to Y$, with
$$\mathrm{Lip}(\Phi_{\bar{n}_+}) \leq \sup_{n \in A}\{\sigma_n(1 - \nu_n\kappa_n)^{-1}\} < 1.$$
Moreover, if we endow the space of continuous functions $C(X, Y)$ with the topology of
pointwise convergence, then the map
$$\bar{n}_+ \in A^{\mathbb{N}} \mapsto \Phi_{\bar{n}_+} \in C(X, Y)$$
is continuous.



The proof is the same as in Theorem 7.5, with some obvious modifications.<br />

7.4.1.3. Now we consider two metric spaces $F^+$ and $F^-$, and two complete subspaces
$X \subseteq F^-$ and $Y \subseteq F^+$. Let $V$ be another complete metric space, and
$$Z = X \times Y \times V.$$
One has to think of $V$ as a central direction and $F^-$ (resp. $F^+$) as a contracting (resp.
expanding) direction. We consider two families of maps $(h^+_n)_{n \in A}$ and $(h^-_n)_{n \in A}$ of the
form
$$h^+_n : Z \to X \times F^+ \times V, \quad h^-_n : Z \to F^- \times Y \times V,$$
and we decompose them as $h^\pm_n = (f^\pm_n, g^\pm_n)$, where
$$f^+_n : Z \to X \times V, \quad f^-_n : Z \to Y \times V$$
and
$$g^+_n : Z \to F^+, \quad g^-_n : Z \to F^-.$$
The hypotheses (H) and (L') for $h^\pm_n$ refer to these decompositions. Let us also denote
by
$$F^+_n : Z \to V$$
the second component of the map $f^+_n$, and consider sets $Z^+_n \subseteq Z$ and $Z^-_n \subseteq Z$ such that
$$Z \cap (h^+_n)^{-1}(Z) \subseteq Z^+_n, \quad Z \cap (h^-_n)^{-1}(Z) \subseteq Z^-_n,$$
for any $n \in A$.

7.4.1.4. The aim of this section is to prove the following result.<br />

Proposition 7.4. With the previous notations, assume that $X$, $Y$ are bounded, $V$ is
locally compact and that for any $n \in A$ the maps
$$h^+_n : Z \to X \times F^+ \times V, \quad h^-_n : Z \to F^- \times Y \times V$$
satisfy hypotheses (H) and (L'). Let us also assume that for any $n \in A$,

(i) the maps $(h^\pm_n)_{|Z^\pm_n}$ are invertible and
$$\left((h^+_n)_{|Z^+_n}\right)^{-1} = (h^-_n)_{|Z^+_n}, \quad \left((h^-_n)_{|Z^-_n}\right)^{-1} = (h^+_n)_{|Z^-_n};$$

(ii) the sets $Z^+_n$ (resp. $Z^-_n$) are pairwise disjoint.

Then for any sequence $\bar{n} \in \Sigma_A$, the map $h^+_{n_0}$ has an invariant set $\Lambda$ and there exists a
homeomorphism
$$\Upsilon : \Sigma_A \times V \to \Lambda, \quad (\bar{n}, v) \mapsto (\Phi_{\bar{n}}(v), v),$$
which conjugates $h^+_{n_0|\Lambda}$ to the skew-product on $V$ given by
$$F_{\bar{n}}(v) = F^+_{n_0}(\Phi_{\bar{n}}(v), v), \quad \bar{n} \in \Sigma_A, \ v \in V,$$
where $\Phi_{\bar{n}} : V \to X \times Y$ is a contracting map, with
$$\mathrm{Lip}(\Phi_{\bar{n}}) \leq \sup_{n \in A}\left\{\sigma^+_n(1 - \nu^+_n\kappa^+_n)^{-1}, \ \sigma^-_n(1 - \nu^-_n\kappa^-_n)^{-1}\right\} < 1.$$



As a consequence, the functions $F_{\bar{n}} : V \to V$, for $\bar{n} \in \Sigma_A$, are also defined by the
following equation
$$h^+_{n_0}(\Phi_{\bar{n}}(v), v) = (\Phi_{\sigma(\bar{n})}(F_{\bar{n}}(v)), F_{\bar{n}}(v)), \quad v \in V. \tag{57}$$
This remark will be useful later on to obtain the estimates (56) in Proposition 7.2.

Proof. For $\bar{n} \in \Sigma_A$, let us define
$$\bar{n}_+ = (n_k)_{k \in \mathbb{N}} \in A^{\mathbb{N}}, \quad \bar{n}_- = (n_{-k})_{k \in \mathbb{N}^*} \in A^{\mathbb{N}^*}.$$
Since each family of maps $(h^+_n)_{n \in A}$ and $(h^-_n)_{n \in A}$ satisfies hypotheses (H) and (L'), we
can apply Proposition 7.3 so both sets
$$\Lambda^+(\bar{n}_+) = \bigcap_{k \in \mathbb{N}} (h^+_{n_k} \circ \cdots \circ h^+_{n_0})^{-1}(Z)$$
and
$$\Lambda^-(\bar{n}_-) = \bigcap_{k \in \mathbb{N}^*} (h^-_{n_{-k}} \circ \cdots \circ h^-_{n_{-1}})^{-1}(Z)$$
are graphs of contracting maps
$$\Phi^+_{\bar{n}_+} : X \times V \to Y, \quad \Phi^-_{\bar{n}_-} : Y \times V \to X,$$
that is
$$\Lambda^+(\bar{n}_+) = \{(x, \Phi^+_{\bar{n}_+}(x, v), v) \mid x \in X, \ v \in V\}$$
and
$$\Lambda^-(\bar{n}_-) = \{(\Phi^-_{\bar{n}_-}(y, v), y, v) \mid y \in Y, \ v \in V\}.$$
Moreover, one has
$$\mathrm{Lip}(\Phi^+_{\bar{n}_+}) \leq \sup_{n \in A}\{\sigma^+_n(1 - \nu^+_n\kappa^+_n)^{-1}\} < 1, \tag{58}$$
$$\mathrm{Lip}(\Phi^-_{\bar{n}_-}) \leq \sup_{n \in A}\{\sigma^-_n(1 - \nu^-_n\kappa^-_n)^{-1}\} < 1. \tag{59}$$
Therefore for each $v \in V$, the maps
$$\Phi^+_{\bar{n}_+, v} = \Phi^+_{\bar{n}_+}(\cdot, v) : X \to Y, \quad \Phi^-_{\bar{n}_-, v} = \Phi^-_{\bar{n}_-}(\cdot, v) : Y \to X,$$
are also contracting, and so are the maps
$$\Phi^-_{\bar{n}_-, v} \circ \Phi^+_{\bar{n}_+, v} : X \to X, \quad \Phi^+_{\bar{n}_+, v} \circ \Phi^-_{\bar{n}_-, v} : Y \to Y.$$
Since $X$ and $Y$ are complete, these maps have fixed points $x(\bar{n}, v) \in X$ and $y(\bar{n}, v) \in Y$
from the contraction principle, and by uniqueness they satisfy
$$\Phi^+_{\bar{n}_+}(x(\bar{n}, v), v) = y(\bar{n}, v), \quad \Phi^-_{\bar{n}_-}(y(\bar{n}, v), v) = x(\bar{n}, v).$$
Now let us define
$$\Lambda(\bar{n}) = \Lambda^+(\bar{n}_+) \cap \Lambda^-(\bar{n}_-).$$
Then by the previous relation this set is non-empty since it is the graph of the contraction
$$\Phi_{\bar{n}} : V \to X \times Y, \quad v \mapsto (x(\bar{n}, v), y(\bar{n}, v)),$$



and, from (58) and (59),
$$\mathrm{Lip}(\Phi_{\bar{n}}) \leq \sup_{n \in A}\left\{\sigma^+_n(1 - \nu^+_n\kappa^+_n)^{-1}, \ \sigma^-_n(1 - \nu^-_n\kappa^-_n)^{-1}\right\} < 1.$$
Moreover, as the maps
$$\bar{n}_+ \in A^{\mathbb{N}} \mapsto \Phi^+_{\bar{n}_+} \in C(X \times V, Y), \quad \bar{n}_- \in A^{\mathbb{N}^*} \mapsto \Phi^-_{\bar{n}_-} \in C(Y \times V, X),$$
are continuous, from the contraction principle one also has the continuity of the map
$$\bar{n} \in \Sigma_A \mapsto \Phi_{\bar{n}}(v) \in X \times Y.$$
Now set
$$\Lambda = \bigcup_{\bar{n} \in \Sigma_A} \Lambda(\bar{n}).$$
The fact that this set is invariant under $h^+_{n_0}$ will follow from our condition (i). Indeed,
if $z \in \Lambda$, then $z \in \Lambda(\bar{n})$ for some $\bar{n} \in \Sigma_A$ and so by definition
$$z \in \bigcap_{k \in \mathbb{N}} (h^+_{n_k} \circ \cdots \circ h^+_{n_0})^{-1}(Z), \quad z \in \bigcap_{k \in \mathbb{N}^*} (h^-_{n_{-k}} \circ \cdots \circ h^-_{n_{-1}})^{-1}(Z).$$
From the first relation we get
$$h^+_{n_0}(z) \in \bigcap_{k \in \mathbb{N}^*} (h^+_{n_k} \circ \cdots \circ h^+_{n_1})^{-1}(Z) = \Lambda^+(\sigma(\bar{n})_+), \tag{60}$$
and since by hypothesis $\left((h^+_{n_0})_{|Z^+_{n_0}}\right)^{-1} = (h^-_{n_0})_{|Z^+_{n_0}}$ and $\Lambda(\bar{n}) \subseteq Z^+_{n_0}$, we get from the second relation
$$h^+_{n_0}(z) \in \bigcap_{k \in \mathbb{N}} (h^-_{n_{-k}} \circ \cdots \circ h^-_{n_0})^{-1}(Z) = \Lambda^-(\sigma(\bar{n})_-). \tag{61}$$
Now (60) and (61) mean exactly that $h^+_{n_0}(z) \in \Lambda(\sigma(\bar{n}))$, so $\Lambda$ is positively invariant
under $h^+_{n_0}$. In fact, a completely similar argument using condition (i) shows that
$$h^+_{n_0}(\Lambda(\bar{n})) = \Lambda(\sigma(\bar{n})),$$
and hence $\Lambda$ is totally invariant under $h^+_{n_0}$. More generally, one obtains
$$h^\pm_{n_{k-1}} \circ \cdots \circ h^\pm_{n_0}(\Lambda(\bar{n})) = \Lambda(\sigma^{\pm k}(\bar{n})), \quad k \in \mathbb{Z}.$$
Next let us prove that as a consequence of (ii) the union
$$\Lambda = \bigcup_{\bar{n} \in \Sigma_A} \Lambda(\bar{n})$$
is disjoint. Let $\bar{n} \neq \bar{n}'$, so there exists $l \in \mathbb{Z}$ such that $n_l \neq n'_l$. First suppose that $l = 0$,
then on the one hand
$$\Lambda(\bar{n}) \subseteq \Lambda^+(\bar{n}_+) \subseteq Z^+_{n_0}$$



and on the other hand
$$\Lambda(\bar{n}') \subseteq \Lambda^+(\bar{n}'_+) \subseteq Z^+_{n'_0}.$$
As $Z^+_{n_0}$ and $Z^+_{n'_0}$ are disjoint by hypothesis, then so are $\Lambda(\bar{n})$ and $\Lambda(\bar{n}')$. Now if $l \geq 1$,
we can assume without loss of generality that $n_k = n'_k$ for $0 \leq k \leq l - 1$, and as before
$$h^+_{n_{l-1}} \circ \cdots \circ h^+_{n_0}(\Lambda(\bar{n})) = \Lambda(\sigma^l(\bar{n})) \subseteq \Lambda^+(\sigma^l(\bar{n})_+) \subseteq Z^+_{n_l}$$
and
$$h^+_{n_{l-1}} \circ \cdots \circ h^+_{n_0}(\Lambda(\bar{n}')) = \Lambda(\sigma^l(\bar{n}')) \subseteq \Lambda^+(\sigma^l(\bar{n}')_+) \subseteq Z^+_{n'_l},$$
so $\Lambda(\bar{n})$ and $\Lambda(\bar{n}')$ have to be disjoint. Finally, the case $l \leq -1$ is completely similar
using the hypothesis that the sets $Z^-_n$ are pairwise disjoint for $n \in A$.

To conclude, as $\Lambda$ is a disjoint union, every point $z \in \Lambda$ can be uniquely written as
$z = (\Phi_{\bar{n}}(v), v)$ for $v \in V$ and $\bar{n} \in \Sigma_A$, so the map
$$\Upsilon : \Sigma_A \times V \to \Lambda, \quad (\bar{n}, v) \mapsto (\Phi_{\bar{n}}(v), v)$$
is a well-defined continuous bijection, and as $V$ is locally compact, one can check that
it is a homeomorphism with respect to the product topology on $\Sigma_A \times V$. Then for
$z = (\Phi_{\bar{n}}(v), v) \in \Lambda(\bar{n})$, as $h^+_{n_0}(z) \in \Lambda(\sigma(\bar{n}))$ we have from the definition of $F^+_n$,
$$\begin{aligned} h^+_{n_0}(z) &= \left(\Phi_{\sigma(\bar{n})}(F^+_{n_0}(z)), \ F^+_{n_0}(z)\right) \\ &= \left(\Phi_{\sigma(\bar{n})}(F^+_{n_0}(\Phi_{\bar{n}}(v), v)), \ F^+_{n_0}(\Phi_{\bar{n}}(v), v)\right). \end{aligned}$$
The last equality exactly means that the map
$$h^+_{n_0} : \Lambda \to \Lambda$$
is conjugated by $\Upsilon$ to the skew-product
$$G : \Sigma_A \times V \to \Sigma_A \times V, \quad (\bar{n}, v) \mapsto (\sigma(\bar{n}), F(\bar{n}, v)),$$
where $F(\bar{n}, v) = F^+_{n_0}(\Phi_{\bar{n}}(v), v)$. This ends the proof.
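The key step of the proof, finding the point of $\Lambda(\bar{n})$ above a given $v$ as a fixed point of the composition of two contractions, can be illustrated numerically in dimension one. In the sketch below `phi_plus` and `phi_minus` stand for the maps $\Phi^+_{\bar{n}_+, v}$ and $\Phi^-_{\bar{n}_-, v}$; the affine examples in the usage note are arbitrary choices, not maps coming from the construction.

```python
def coupled_fixed_point(phi_plus, phi_minus, x0=0.0, tol=1e-14, max_iter=1000):
    """Iterate x -> phi_minus(phi_plus(x)) and return the pair (x, y = phi_plus(x)).

    When both maps are contractions, the iteration converges to the unique
    pair satisfying phi_plus(x) = y and phi_minus(y) = x.
    """
    x = x0
    for _ in range(max_iter):
        x_next = phi_minus(phi_plus(x))
        converged = abs(x_next - x) <= tol
        x = x_next
        if converged:
            break
    return x, phi_plus(x)
```

With `phi_plus = lambda x: 0.5 * x + 1` and `phi_minus = lambda y: 0.25 * y - 1`, the iteration converges to $x = -6/7$, $y = 4/7$, which indeed satisfy both relations.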

7.4.2 Proof of Proposition 7.2<br />

This section is entirely devoted to the proof of Proposition 7.2, which will be done in several
steps. We will have to use notations and estimates on time-energy coordinates, contained
in Appendix 7.A, and this will require choosing $\mu$ sufficiently small. Moreover, we shall
use various coordinates but we shall keep the same notation for the diffeomorphisms
expressed in different coordinates.

Step 1. Straightening of the invariant manifolds.<br />

Using time-energy coordinates on the first factor, our homoclinic annulus is given by
$$I_\mu = \{(\tau, e, \theta_2, I_2) \in \tilde{D}_* \mid \tau = e = 0\}$$
and in a neighbourhood of it, the stable manifold of $F_\mu$ is given by
$$\{(\tau, e, \theta_2, I_2) \in \tilde{D}_* \mid e = 0\}$$
while the unstable manifold is
$$\{(\tau, e, \theta_2, I_2) \in \tilde{D}_* \mid e = \mu f_2(\theta_2)\tau\}.$$
Now we introduce the change of coordinates
$$\Theta : E_* \times \mathbb{A} \to E'_* \times \mathbb{A}, \quad (\tau, e, \theta_2, I_2) \mapsto (\tau', e', \theta_2, I_2),$$
where
$$\tau' = \tau - (\mu f_2(\theta_2))^{-1}e, \quad e' = (\mu f_2(\theta_2))^{-1}e.$$
Its inverse is given by
$$\Theta^{-1} : E'_* \times \mathbb{A} \to E_* \times \mathbb{A}, \quad (\tau', e', \theta_2, I_2) \mapsto (\tau' + e', \ \mu f_2(\theta_2)e', \ \theta_2, \ I_2).$$
It follows that in these new coordinates, denoting $\tilde{D}'_* = \Theta(\tilde{D}_*)$, the stable manifold
$$\{(\tau', e', \theta_2, I_2) \in \tilde{D}'_* \mid e' = 0\}$$
and the unstable manifold
$$\{(\tau', e', \theta_2, I_2) \in \tilde{D}'_* \mid \tau' = 0\}$$
are straightened out.

Our goal is to use Proposition 7.4, so we will explain how to choose the domain Z.<br />

Our central direction V = A will be the annulus given by the coordinates (θ2, I2), our<br />

contracting and expanding directions will be one-dimensional, so X ⊆ F − = R and<br />

Y ⊆ F + = R. We will choose<br />

$$X = [-\tau'_0, \tau'_0], \qquad Y = [0, e'_0],$$

with $\tau'_0 = c_1\mu^{2\pi-1}$, $e'_0 = c_2\mu^{2\pi-1}$, and $c_1 > 2\pi$, $c_2 > 2\pi$. Eventually

$$Z = [-\tau'_0, \tau'_0] \times [0, e'_0] \times A.$$

For $(\theta_2, I_2) \in A$, let us also define the section

$$Z(\theta_2, I_2) = [-\tau'_0, \tau'_0] \times [0, e'_0] \times \{(\theta_2, I_2)\},$$

where all the constructions will take place (see figure 2).

This domain Z is located above the homoclinic annulus Iµ, and for µ small enough<br />

it is contained in the domain D ′ ∗ where the manifolds are straightened.
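The straightening of Step 1 can be checked numerically. The sketch below is only an illustration: the function `f2` is a hypothetical stand-in (any positive function with $f_2 \geq \pi^{-1}$ would do), and the numerical values of `mu`, `theta2`, `tau`, `e` are arbitrary assumptions, not values from the text.

```python
import math

def f2(theta2):
    # Hypothetical positive function with f2 >= 1/pi (an assumption for illustration).
    return (2.0 + math.cos(2.0 * math.pi * theta2)) / math.pi

def theta_map(tau, e, theta2, mu):
    """Theta: (tau, e) -> (tau', e') at fixed theta2."""
    e_prime = e / (mu * f2(theta2))
    return tau - e_prime, e_prime

def theta_inverse(tau_p, e_p, theta2, mu):
    """Theta^{-1}: (tau', e') -> (tau, e) at fixed theta2."""
    return tau_p + e_p, mu * f2(theta2) * e_p

mu, theta2 = 1e-3, 0.3      # arbitrary test values
tau, e = 0.01, 2e-5

# Round trip: Theta^{-1} o Theta should be the identity.
back = theta_inverse(*theta_map(tau, e, theta2, mu), theta2, mu)

# A point of the stable manifold {e = 0} should land on {e' = 0}.
stable_image = theta_map(tau, 0.0, theta2, mu)

# A point of the unstable manifold {e = mu f2(theta2) tau} should land on {tau' = 0}.
unstable_image = theta_map(tau, mu * f2(theta2) * tau, theta2, mu)
```

Up to rounding, `back` reproduces `(tau, e)`, the second component of `stable_image` vanishes, and the first component of `unstable_image` vanishes.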


168 Optimal time of instability for a priori unstable Hamiltonian systems<br />

Figure 2: Section Z(θ2, I2) of the domain Z (axes τ and e; the line e = µf2(θ2)τ is drawn).

For $n \in A = \{[\ln\mu^{-1}], \dots, 2[\mu^{-1/2}\ln\mu^{-1}]\}$, the point $a_n = (0, e_n) \in D_*$, or $a'_n = (-e'_n, e'_n) \in D'_*$, is by definition $n$-periodic for the pendulum map $\Phi^P$. We want the annulus $\{a'_n\} \times A$, which is $n$-periodic for the unperturbed map $F_0$, to be included in our domain $Z$, and this requires that $e'_n < e'_0$ and $-\tau'_0 < -e'_n$ for any $n \in A$. First note that

$$e_0 = \mu f_2(\theta_2) e'_0 = c_2 f_2(\theta_2)\mu^{2\pi},$$

and as $f_2(\theta_2) \geq \pi^{-1}$, $c_2 f_2(\theta_2) > 2$ so $e_0 > 2\mu^{2\pi}$. Then using (73) and the fact that $A = \{[\ln\mu^{-1}], \dots, 2[\mu^{-1/2}\ln\mu^{-1}]\}$, for $\mu$ small enough (and hence $n$ large enough) one obtains

$$\sup_{n \in A}\{e_n\} < 2\mu^{2\pi},$$

and therefore $e_n < e_0$ is satisfied for any $n \in A$, which gives $e'_n < e'_0$. Similarly, we obtain

$$-\tau'_0 < -2\pi\mu^{2\pi-1} < -e'_n.$$

Step 3. Construction of the domains $Z^+_n$.

For $n \in A$, the domains $Z^+_n \cap Z(\theta_2, I_2)$ will be rectangles (see figure 3, they are parallelograms in the coordinates $(\tau, e)$), with vertices

$$A^1_n = \left(-\tau'_0,\ \frac{e_n - \delta^e_n}{\mu f_2(\theta_2)},\ \theta_2,\ I_2\right), \qquad A^2_n = \left(-\tau'_0,\ \frac{e_n + \delta^e_n}{\mu f_2(\theta_2)},\ \theta_2,\ I_2\right),$$

$$A^3_n = \left(\tau'_0,\ \frac{e_n + \delta^e_n}{\mu f_2(\theta_2)},\ \theta_2,\ I_2\right), \qquad A^4_n = \left(\tau'_0,\ \frac{e_n - \delta^e_n}{\mu f_2(\theta_2)},\ \theta_2,\ I_2\right),$$

where the parameter $\delta^e_n$ is defined by

$$\delta^e_n = \frac{c_3\mu^{2\pi-1}}{|T'_n|}, \qquad n \in A,$$

for some constant $c_3 > \max\{c_1 + c_2,\ c_1 + 2\pi\}$, and $T'_n = T'(e_n)$ (see Appendix 7.A and (74)). Hence the domains $Z^+_n$ are small neighbourhoods of the annuli $\{a'_n\} \times A$, for $a'_n = (-e'_n, e'_n) \in D'_*$.

Figure 3: Section $Z^+_n(\theta_2, I_2)$ of the domain $Z^+_n$.

For $\mu$ small enough, one can easily check from the definition of $\delta^e_n$ and the mean value theorem that

$$n + c_3\mu^{2\pi-1} \leq T(e_n - \delta^e_n) \leq n + 2c_3\mu^{2\pi-1} \tag{62}$$

and

$$n - 2c_3\mu^{2\pi-1} \leq T(e_n + \delta^e_n) \leq n - c_3\mu^{2\pi-1}. \tag{63}$$

Let us show that these domains $Z^+_n$ are pairwise disjoint. By construction it is enough to prove that $e'(A^1_n) > e'(A^2_{n+1})$. Choosing $\mu$ small so that

$$n + 2c_3\mu^{2\pi-1} < n + 1 - 2c_3\mu^{2\pi-1},$$

from (62) and (63) we get

$$T(e_n - \delta^e_n) < T(e_{n+1} + \delta^e_{n+1}).$$

But as the period function $T$ is decreasing, this gives

$$e_n - \delta^e_n > e_{n+1} + \delta^e_{n+1},$$

which implies that $e'(A^1_n) > e'(A^2_{n+1})$.

Let us also remark that by definition of our domain $D'_*$, one can ensure that for $\mu$ small enough (and so $n - T(e)$ is small enough)

$$F^k_0(Z^+_n) \cap D'_* = \emptyset, \qquad 0 < k < n. \tag{64}$$

Step 4. Construction of the domains $Z^-_n$.

The construction is similar (see figure 4), namely $Z^-_n \cap Z(\theta_2, I_2)$ is a rectangle with vertices

$$B^1_n = \left(-\frac{e_n}{\mu f_2(\theta_2)} - \delta^\tau_n,\ e'_0,\ \theta_2,\ I_2\right), \qquad B^2_n = \left(-\frac{e_n}{\mu f_2(\theta_2)} + \delta^\tau_n,\ e'_0,\ \theta_2,\ I_2\right),$$

$$B^3_n = \left(-\frac{e_n}{\mu f_2(\theta_2)} + \delta^\tau_n,\ 0,\ \theta_2,\ I_2\right), \qquad B^4_n = \left(-\frac{e_n}{\mu f_2(\theta_2)} - \delta^\tau_n,\ 0,\ \theta_2,\ I_2\right),$$

where $\delta^\tau_n$ is defined by

$$\delta^\tau_n = \frac{c_4\mu^{2\pi-2}}{|T'_n|},$$

for a constant $c_4 > \pi\max\{c_1 + 4\pi,\ c_1 + c_2\}$, and $T'_n = T'(e_n)$ (see Appendix 7.A and (74)). As in the previous step, one can check that $\tau'(B^1_{n+1}) > \tau'(B^2_n)$ so the domains $Z^-_n$ are pairwise disjoint.

Figure 4: Section $Z^-_n(\theta_2, I_2)$ of the domain $Z^-_n$.

Step 5. Expressions of the maps $F^{\pm n}_\mu$ restricted to $Z^\pm_n$.

For any $(\tau, e, \theta_2, I_2) \in \Theta^{-1}(Z^+_n)$, one has the following explicit expression for the unperturbed map

$$F^n_0(\tau, e, \theta_2, I_2) = (\tau + n - T(e),\ e,\ \theta_2 + nI_2,\ I_2).$$

In those coordinates, our perturbation is given by

$$\Phi^{\mu f}(\tau, e, \theta_2, I_2) = (\tau,\ e + \mu\tau f_2(\theta_2),\ \theta_2,\ I_2 - \mu f_1(\tau) f_2'(\theta_2)).$$

Then, from (64) we know that $F^n_\mu = \Phi^{\mu f} \circ F^n_0$ when restricted to $\Theta^{-1}(Z^+_n)$, so

$$F^n_\mu(\tau, e, \theta_2, I_2) = \big(\tau + n - T(e),\ e + \mu(\tau + n - T(e)) f_2(\theta_2 + nI_2),\ \theta_2 + nI_2,\ I_2 - \mu f_1(\tau + n - T(e)) f_2'(\theta_2 + nI_2)\big).$$

Using the expression for $\Theta^{-1}$, for $(\tau', e', \theta_2, I_2) \in Z^+_n$ we compute

$$F^n_\mu(\tau', e', \theta_2, I_2) = \big(-(f_2(\theta_2 + nI_2))^{-1} f_2(\theta_2) e',$$
$$\qquad (1 + (f_2(\theta_2 + nI_2))^{-1} f_2(\theta_2)) e' + \tau' + n - T(\mu f_2(\theta_2) e'),$$
$$\qquad \theta_2 + nI_2,$$
$$\qquad I_2 - \mu f_1(\tau' + e' + n - T(\mu f_2(\theta_2) e')) f_2'(\theta_2 + nI_2)\big).$$

If we fix $(\theta_2, I_2) \in A$, then the first two components of this map are linear with respect to $\tau'$ and $e'$, therefore the image of the parallelogram $Z^+_n$ by the map $F^n_\mu$ is still a parallelogram (in a different section), with vertices $\tilde A^i_n = F^n_\mu(A^i_n)$ for $i = 1, 2, 3, 4$ which can be explicitly computed.

Figure 5: Position of $Z^+_n$ and $F^n_\mu(Z^+_n)$.

Similarly, for $(\tau', e', \theta_2, I_2) \in Z^-_n$ we compute

$$F^{-n}_\mu(\tau', e', \theta_2, I_2) = \big((1 + (f_2(\bar\theta_2))^{-1} f_2(\theta_2))\tau' + e' - n + T(-\mu f_2(\theta_2)\tau'),$$
$$\qquad (f_2(\bar\theta_2))^{-1} f_2(\theta_2)\tau',$$
$$\qquad \theta_2 + nI_2 - n\mu f_1(\tau' + e') f_2'(\theta_2),$$
$$\qquad I_2 + \mu f_1(\tau' + e') f_2'(\theta_2)\big),$$

so the image of $Z^-_n$ by the map $F^{-n}_\mu$ is a parallelogram with vertices $\tilde B^i_n = F^{-n}_\mu(B^i_n)$ for $i = 1, 2, 3, 4$.
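The expression of $F^n_\mu$ in the coordinates $(\tau', e')$ is obtained by conjugating $\Phi^{\mu f} \circ F^n_0$ by $\Theta$, and this can be verified numerically. The following sketch compares the direct composition with the closed formula. The functions `f1`, `f2` and `period` are hypothetical stand-ins chosen only for illustration (the actual $f_1$, $f_2$ and $T$ are defined elsewhere in the text), and all numerical values are arbitrary.

```python
import math

def f1(tau):            # hypothetical profile (assumption)
    return math.exp(-tau * tau)

def f2(theta):          # hypothetical, with f2 >= 1/pi (assumption)
    return (2.0 + math.cos(2.0 * math.pi * theta)) / math.pi

def df2(theta):         # derivative of the hypothetical f2
    return -2.0 * math.sin(2.0 * math.pi * theta)

def period(e):          # hypothetical model of the period function T (assumption)
    return -math.log(e) / (2.0 * math.pi)

def theta_map(tau, e, th, I, mu):
    ep = e / (mu * f2(th))
    return tau - ep, ep, th, I

def theta_inv(tp, ep, th, I, mu):
    return tp + ep, mu * f2(th) * ep, th, I

def F0_n(tau, e, th, I, n):
    # n-th iterate of the unperturbed map in time-energy coordinates
    return tau + n - period(e), e, th + n * I, I

def kick(tau, e, th, I, mu):
    # the perturbation Phi^{mu f}
    return tau, e + mu * tau * f2(th), th, I - mu * f1(tau) * df2(th)

def composed(tp, ep, th, I, n, mu):
    # Theta o Phi^{mu f} o F_0^n o Theta^{-1}
    return theta_map(*kick(*F0_n(*theta_inv(tp, ep, th, I, mu), n), mu), mu)

def closed_form(tp, ep, th, I, n, mu):
    # the formula for F_mu^n in the coordinates (tau', e', theta_2, I_2)
    th_new = th + n * I
    ratio = f2(th) / f2(th_new)
    shift = n - period(mu * f2(th) * ep)
    return (-ratio * ep,
            (1.0 + ratio) * ep + tp + shift,
            th_new,
            I - mu * f1(tp + ep + shift) * df2(th_new))
```

Evaluating `composed` and `closed_form` at the same point should give the same four components up to rounding; note that $\Theta$ is applied at the new angle $\theta_2 + nI_2$, which is where the ratio $f_2(\theta_2)/f_2(\theta_2 + nI_2)$ comes from.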

Step 6. Relative position of $Z^\pm_n$ and $F^{\pm n}_\mu(Z^\pm_n)$.

Here we will prove that figures 5 and 6 make sense, that is, we will show that the horizontal (resp. vertical) edges of $F^n_\mu(Z^+_n)$ (resp. $F^{-n}_\mu(Z^-_n)$) are not contained in $Z$. The upper horizontal edge of $Z^+_n$ is the segment joining $A^2_n$ to $A^3_n$, therefore the upper horizontal edge of $F^n_\mu(Z^+_n)$ is the segment joining $\tilde A^2_n$ to $\tilde A^3_n$. We can compute

$$e'(\tilde A^2_n) = \left(\frac{1}{f_2(\theta_2 + nI_2)} + \frac{1}{f_2(\theta_2)}\right)\mu^{-1} e_n - \tau'_0 + n - T(e_n + \delta^e_n)$$

and

$$e'(\tilde A^3_n) = \left(\frac{1}{f_2(\theta_2 + nI_2)} + \frac{1}{f_2(\theta_2)}\right)\mu^{-1} e_n + \tau'_0 + n - T(e_n + \delta^e_n).$$

From (63) we have $T(e_n + \delta^e_n) \leq n - c_3\mu^{2\pi-1}$, and as $\tau'_0 = c_1\mu^{2\pi-1}$ and $c_3 > c_1 + c_2$ this gives

$$e'(\tilde A^2_n) > -\tau'_0 + n - T(e_n + \delta^e_n) \geq (c_3 - c_1)\mu^{2\pi-1} > c_2\mu^{2\pi-1} = e'_0.$$

We also have

$$e'(\tilde A^3_n) = e'(\tilde A^2_n) + 2\tau'_0 > e'_0,$$

and so the upper horizontal edge of $F^n_\mu(Z^+_n)$ is not contained in $Z$.

Figure 6: Position of $Z^-_n$ and $F^{-n}_\mu(Z^-_n)$.

For the lower horizontal edge of $F^n_\mu(Z^+_n)$, which is the segment joining $\tilde A^4_n$ to $\tilde A^1_n$, one has to prove that $e'(\tilde A^4_n) < 0$ and $e'(\tilde A^1_n) < 0$. We compute

$$e'(\tilde A^4_n) = \left(\frac{1}{f_2(\theta_2 + nI_2)} + \frac{1}{f_2(\theta_2)}\right)\mu^{-1} e_n + \tau'_0 + n - T(e_n - \delta^e_n)$$
$$\leq 2\pi\mu^{-1} e_n + (c_1 - 2c_3)\mu^{2\pi-1}$$
$$\leq (c_1 - 2c_3 + 2\pi)\mu^{2\pi-1},$$

and as $2c_3 > c_3 > c_1 + 2\pi$, this gives $e'(\tilde A^4_n) < 0$. We also have

$$e'(\tilde A^1_n) = e'(\tilde A^4_n) - 2\tau'_0 < 0,$$

and so the lower horizontal edge of $F^n_\mu(Z^+_n)$ is not contained in $Z$.

Similarly, one can check that the vertical edges of $F^{-n}_\mu(Z^-_n)$ are not contained in $Z$, and this follows from the choice of the constant $c_4$.

Step 7. Definition of the maps $h^\pm_n$.

Our maps $h^+_n$ and $h^-_n$ will be suitable extensions of the maps $F^n_\mu$ and $F^{-n}_\mu$. First we set

$$(h^+_n)_{|Z^+_n} = (F^n_\mu)_{|Z^+_n}, \qquad (h^-_n)_{|Z^-_n} = (F^{-n}_\mu)_{|Z^-_n},$$

and we want to define Lipschitz extensions of $h^\pm_n$ to $Z$ in order to have

$$Z \cap (h^\pm_n)^{-1}(Z) \subseteq Z^\pm_n.$$

Let us begin with the maps $h^+_n$. Take $z = (\tau', e', \theta_2, I_2) \in Z \setminus Z^+_n$, then we can find a unique $\tilde z = (\tau', e'(\tilde z), \theta_2, I_2) \in Z^+_n$: indeed, either

$$\frac{e_n + \delta^e_n}{\mu f_2(\theta_2)} < e' < e'_0$$

in which case we choose

$$e'(\tilde z) = \frac{e_n + \delta^e_n}{\mu f_2(\theta_2)},$$

or

$$0 < e' < \frac{e_n - \delta^e_n}{\mu f_2(\theta_2)}$$

and then we choose

$$e'(\tilde z) = \frac{e_n - \delta^e_n}{\mu f_2(\theta_2)}.$$

Then we set

$$h^+_n(z) = F^n_\mu(\tilde z) + (0,\ \alpha^+_n(e'(z) - e'(\tilde z)),\ 0,\ 0),$$

for some positive constant $\alpha^+_n$ yet to be chosen. Hence the map

$$h^+_n : Z \longrightarrow [-\tau'_0, \tau'_0] \times F^+ \times A$$

is a well-defined Lipschitz extension of $F^n_\mu$ and, using the form of the extension, one can check that $Z \cap (h^+_n)^{-1}(Z) \subseteq Z^+_n$.

For the maps $h^-_n$, this is completely analogous. For $z = (\tau', e', \theta_2, I_2) \in Z \setminus Z^-_n$, we can find a unique $\tilde z = (\tau'(\tilde z), e', \theta_2, I_2) \in Z^-_n$ where

$$\tau'(\tilde z) = -\frac{e_n}{\mu f_2(\theta_2)} + \delta^\tau_n$$

if

$$-\frac{e_n}{\mu f_2(\theta_2)} + \delta^\tau_n < \tau' < \tau'_0,$$

or

$$\tau'(\tilde z) = -\frac{e_n}{\mu f_2(\theta_2)} - \delta^\tau_n$$

if

$$-\tau'_0 < \tau' < -\frac{e_n}{\mu f_2(\theta_2)} - \delta^\tau_n.$$

Then we define

$$h^-_n(z) = F^{-n}_\mu(\tilde z) + (\alpha^-_n(\tau'(z) - \tau'(\tilde z)),\ 0,\ 0,\ 0)$$

for some positive constant $\alpha^-_n$. The map

$$h^-_n : Z \longrightarrow F^- \times [0, e'_0] \times A$$

is a well-defined Lipschitz extension of $F^{-n}_\mu$, with $Z \cap (h^-_n)^{-1}(Z) \subseteq Z^-_n$.
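The extension used in Step 7 (project onto the strip, apply the map, then add a linear term in the distance to the strip) can be illustrated in one dimension. The sketch below is a toy model under explicit assumptions: the affine map `F`, the interval $[0.4, 0.6]$ playing the role of $Z^+_n$, the box $[0, 1]$ playing the role of $Y$, and the value of `alpha` are all hypothetical.

```python
def extend(F, lo, hi, alpha):
    """Extend F from [lo, hi] to the real line: project onto [lo, hi], then add a linear term."""
    def h(x):
        x_tilde = min(max(x, lo), hi)      # nearest point of [lo, hi]
        return F(x_tilde) + alpha * (x - x_tilde)
    return h

# Toy data (assumptions): an expanding affine map on [0.4, 0.6] inside the box [0, 1].
F = lambda x: 10.0 * (x - 0.5) + 0.5       # sends [0.4, 0.6] onto [-0.5, 1.5]
h = extend(F, 0.4, 0.6, alpha=10.0)

# Points of the box whose image under the extension falls back into the box.
grid = [k / 1000 for k in range(1001)]
preimage_of_box = [x for x in grid if 0.0 <= h(x) <= 1.0]
```

In this toy model the points of the box that return to the box all lie in $[0.4, 0.6]$, which is the one-dimensional analogue of $Z \cap (h^+_n)^{-1}(Z) \subseteq Z^+_n$: a positive slope `alpha` keeps pushing points outside the strip further away from the box.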

Step 8. Verification of Hypotheses (H) and (L').

Now let us show that the maps $h^+_n$ and $h^-_n$ satisfy hypotheses (H) and (L'). Recall that

$$X = [-\tau'_0, \tau'_0] \subseteq F^- = \mathbb{R}, \qquad Y = [0, e'_0] \subseteq F^+ = \mathbb{R},$$

and let us write $h^+_n = (f^+_n, g^+_n)$ where

$$f^+_n : X \times Y \times A \longrightarrow X \times A$$

and

$$g^+_n : X \times Y \times A \longrightarrow F^+.$$

For $x \in X \times A$, consider the partial map

$$g^+_{n,x} : Y \longrightarrow F^+.$$

By step 6 (see figure 5), the images under $(h^+_n)_{|Z^+_n} = (F^n_\mu)_{|Z^+_n}$ of the horizontal edges of $Z^+_n$ do not belong to $Z$, and this implies that

$$Y = [0, e'_0] \subseteq g^+_{n,x}(Y \cap Z^+_n) \subseteq g^+_{n,x}(Y),$$

so the hypothesis (H2) is satisfied.

Now it remains to show hypotheses (H1) and (L'). This follows from the choice of $\alpha^+_n$ and lengthy calculations of the various partial derivatives of $F^n_\mu$, using both the explicit expression obtained in step 5 and the estimates of Appendix 7.A. We find the following: if we define

$$\alpha^+_n = \pi^{-1}(\mu|T'_n|),$$

then (H1) is satisfied with

$$\rho^+_n = O(\mu|T'_n|),$$

and we estimate the Lipschitz constants

$$\nu^+_n = O\big(\mu^{-1}|T'_n|^{-1}\big), \qquad \sigma^+_n = O\big(\mu^{2\pi-1}\big), \qquad \kappa^+_n = O\big(\mu^{2\pi-1}|T'_n|\big),$$

so (L') is satisfied for $\mu$ small enough as

$$\sigma^+_n + \nu^+_n\max\{1, \kappa^+_n\} = \sigma^+_n + \nu^+_n\kappa^+_n = O\big(\mu^{2\pi-1}\big).$$

The situation for $h^-_n$ is of course similar.

Step 9. Conclusions.

All the hypotheses of Proposition 7.4 are satisfied, so we can apply it to any sequence $\bar n \in \Sigma_A$: there exist a set $\Lambda \subseteq Z$, invariant by $h^+_{n_0}$, and a homeomorphism

$$\Upsilon : \Sigma_A \times A \longrightarrow \Lambda, \qquad (\bar n, (\theta, I)) \longmapsto (\Phi_{\bar n}(\theta, I), (\theta, I))$$

which conjugates $h^+_{n_0|\Lambda}$ to the skew-product on $A$ given by

$$F_{\bar n}(\theta, I) = F^+_{n_0}(\Phi_{\bar n}(\theta, I), \theta, I), \qquad (\theta, I) \in A,$$

where $\Phi_{\bar n} : A \to X \times Y$ is a contracting map, with

$$\mathrm{Lip}(\Phi_{\bar n}) \leq \sup_{n \in A}\big\{\sigma^+_n(1 - \nu^+_n\kappa^+_n)^{-1},\ \sigma^-_n(1 - \nu^-_n\kappa^-_n)^{-1}\big\} = O\big(\mu^{2\pi-1}\big). \tag{65}$$

Recall also from (57) that

$$h^+_{n_0}(\Phi_{\bar n}(\theta, I), \theta, I) = \big(\Phi_{\sigma(\bar n)}(F_{\bar n}(\theta, I)),\ F_{\bar n}(\theta, I)\big), \qquad (\theta, I) \in A. \tag{66}$$

So we let

$$O_\mu = \Theta^{-1}(Z), \qquad \Lambda_\mu = \Theta^{-1}(\Lambda) \subseteq O_\mu, \qquad \Upsilon_\mu = \Theta^{-1} \circ \Upsilon.$$

For $x \in \Lambda_\mu$,

$$h^+_{n^x_0}(x) = F^{n^x_0}_\mu(x) = \tilde F_\mu(x),$$

where $\tilde F_\mu$ is the transversal map of $F_\mu$ associated to $O_\mu$. Then by definition of $\Lambda_\mu$, the sequence $\bar n^x$ is a well-defined element of $\Sigma_A$. Setting $x = (\tau_x, e_x, \theta_x, I_x) \in \Lambda_\mu$, the above conjugacy gives

$$\tilde F_\mu \circ \Upsilon_\mu(\bar n^x, \theta_x, I_x) = \Upsilon_\mu(\sigma(\bar n^x), F_{\bar n^x}(\theta_x, I_x)).$$

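Now it remains to explain the form of $F_{\bar n}$ and the estimates (56). We can write

$$\Upsilon_\mu : \Sigma_A \times A \longrightarrow \Lambda_\mu$$

as

$$\Upsilon_\mu(\bar n, (\theta, I)) = (\psi_{\bar n}(\theta, I), (\theta, I)) = (\tau_{\bar n}(\theta, I),\ e_{\bar n}(\theta, I),\ \theta,\ I),$$

where $\tau_{\bar n} : A \to \mathbb{R}$ and $e_{\bar n} : A \to \mathbb{R}$ are easily deduced from

$$\Phi_{\bar n} = (\Phi^1_{\bar n}, \Phi^2_{\bar n}) : A \longrightarrow X \times Y$$

and from $\Theta^{-1}$. Now by definition of $X$ and $Y$,

$$|\Phi^1_{\bar n}|_{C^0(A)} = O\big(\mu^{2\pi-1}\big), \qquad |\Phi^2_{\bar n}|_{C^0(A)} = O\big(\mu^{2\pi-1}\big),$$

and this gives

$$|\tau_{\bar n}|_{C^0(A)} = O\big(\mu^{2\pi-1}\big), \qquad |e_{\bar n}|_{C^0(A)} = O\big(\mu^{2\pi}\big).$$

Moreover using (65) these maps are Lipschitz and

$$\mathrm{Lip}(\tau_{\bar n}) = O\big(\mu^{2\pi-1}\big), \qquad \mathrm{Lip}(e_{\bar n}) = O\big(\mu^{2\pi-1}\big).$$

Now from (66) we can write

$$F_{\bar n}(\theta, I) = (\theta + n_0 I,\ I + 2\mu f_1(\phi_{\bar n}(\theta, I))\cos 2\pi(\theta + n_0 I)), \qquad \bar n \in \Sigma_A,\ (\theta, I) \in A,$$

where the Lipschitz function $\phi_{\bar n} : A \to \mathbb{R}$ satisfies the equation

$$\phi_{\bar n}(\theta, I) = \tau_{\sigma(\bar n)}(\theta + n_0 I,\ I + 2\mu f_1(\phi_{\bar n}(\theta, I))\cos 2\pi(\theta + n_0 I)). \tag{67}$$

The estimate

$$|\phi_{\bar n}|_{C^0(A)} = O\big(\mu^{2\pi-1}\big)$$

is obvious, but for the Lipschitz constant this is more subtle. Let us define

$$g_{\bar n}(\theta, I) = (\theta + n_0 I,\ I + 2\mu f_1(\phi_{\bar n}(\theta, I))\cos 2\pi(\theta + n_0 I)).$$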

Then on the one hand, using the definition of f1 we can estimate<br />

but on the other hand, from (67)<br />

so<br />

and therefore<br />

This concludes the proof.<br />

Lip(g¯n) ≤ sup{1 + µ −1<br />

2 ln µ −1 , Lip(φ¯n)},<br />

Lip(φ¯n) ≤ Lip(τ¯n)Lip(g¯n) < Lip(g¯n),<br />

<br />

Lip(g¯n) = O µ −1<br />

2 ln µ −1<br />

<br />

<br />

Lip(φ¯n) = O µ 2π−1 2 ln µ −1<br />

<br />

.



7.5 Construction of a pseudo-orbit<br />

7.5.0.1. In this section, we will restrict our study to a special type of skew-product over a Bernoulli shift, which is called a "polysystem" (see [Mar08]; it is also known as an Iterated Function System).

Recall from definition 7.4 that a polysystem is a skew-product (over $\sigma$) such that for any $\bar n = (n_k)_{k \in \mathbb{Z}} \in \Sigma_A$, one has $F_{\bar n} = f_{n_0}$. So a polysystem does not depend on the whole sequence $\bar n \in \Sigma_A$ but only on its first component $n_0 \in A$, hence instead of being defined by a family of maps indexed by $\Sigma_A$, it is defined by a family of maps indexed by $A$. Then one can easily see that a sequence $(n_k, x_k)_{k \in \mathbb{Z}}$ in $A \times M$ gives rise to an orbit $(\sigma^k(\bar n), x_k)_{k \in \mathbb{Z}}$ in $\Sigma_A \times M$, where $\bar n = (n_k)_{k \in \mathbb{Z}}$, for the polysystem $[[f_n]]_{n \in A}$ if and only if

$$f_{n_k}(x_k) = x_{k+1}, \qquad k \in \mathbb{Z},$$

and its projection onto $M$ corresponds to the iteration of the maps $(f_n)_{n \in A}$ in the order prescribed by the sequence $\bar n$.

7.5.0.2. In the previous section, we showed that the dynamics of the first return map of our diffeomorphism $F_\mu$ in a neighbourhood of the homoclinic annulus is conjugated to the skew-product map

$$F_{\bar n}(\theta, I) = (\theta + n_0 I,\ I + 2\mu f_1(\phi_{\bar n}(\theta, I))\cos 2\pi(\theta + n_0 I)), \qquad \bar n \in \Sigma_A,\ (\theta, I) \in A.$$

This family of maps depends on the whole sequence $\bar n \in \Sigma_A$, but as $\phi_{\bar n}$ is small, $f_1(\phi_{\bar n}(\theta, I))$ is close to one, and so $F_{\bar n}$ is close to

$$f_{n_0}(\theta, I) = (\theta + n_0 I,\ I + 2\mu\cos 2\pi(\theta + n_0 I)), \qquad n_0 \in A,\ (\theta, I) \in A.$$

These "standard" maps can be seen as perturbations of iterates of the integrable twist map $T(\theta, I) = (\theta + I, I)$: more precisely we can write $f_n = V \circ T^n$ where

$$V(\theta, I) = (\theta,\ I + 2\mu\cos 2\pi\theta)$$

is a "vertical" map which is close to the identity if $\mu$ is small.

We will first construct an orbit for the polysystem $[[f_n]]_{n \in A}$ defined by the maps $f_n$, and in the next section, we will use it as a pseudo-orbit for the skew-product $[[F_{\bar n}]]_{\bar n \in \Sigma_A}$ defined by the maps $F_{\bar n}$.
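The decomposition $f_n = V \circ T^n$ and the notion of an orbit of the polysystem (iterate the maps in the order prescribed by a word of integers) can be sketched in a few lines. This is only an illustration: the function names and the finite word `ns` are assumptions introduced here, and the angle is reduced modulo one to represent the circle.

```python
import math

def twist(theta, I, n=1):
    """n-th iterate of the integrable twist map T(theta, I) = (theta + I, I)."""
    return (theta + n * I) % 1.0, I

def vertical(theta, I, mu):
    """Vertical map V(theta, I) = (theta, I + 2 mu cos(2 pi theta))."""
    return theta, I + 2.0 * mu * math.cos(2.0 * math.pi * theta)

def f(n, theta, I, mu):
    """Standard-type map f_n = V o T^n."""
    return vertical(*twist(theta, I, n), mu)

def polysystem_orbit(ns, theta0, I0, mu):
    """Iterate the maps f_{n_k} in the order prescribed by the finite word ns."""
    orbit = [(theta0, I0)]
    for n in ns:
        orbit.append(f(n, *orbit[-1], mu))
    return orbit

# Example: two steps with the word (3, 5), starting from (0, 0).
orbit = polysystem_orbit([3, 5], 0.0, 0.0, 0.01)
```

Each step changes $I$ by $2\mu\cos 2\pi(\theta_k + n_k I_k)$, so the drift in the action is controlled by where the twist $T^{n_k}$ sends the angle; this is the mechanism exploited in the rest of the section.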

7.5.0.3. The goal of this section is to prove the following proposition.<br />

Proposition 7.5. There exists a positive constant $C$ such that for $\mu$ small enough, there exists an orbit $(\sigma^k(\bar n), x_k)_{k \in \mathbb{Z}}$ in $\Sigma_A \times A$ for the polysystem $[[f_n]]_{n \in A}$ defined by

$$f_n(\theta, I) = (\theta + nI,\ I + 2\mu\cos 2\pi(\theta + nI)), \qquad n \in A,\ (\theta, I) \in A,$$

such that

$$\lim_{k \to \pm\infty} I_k = \pm\infty,$$

and the estimates

$$|I_N - I_0| \geq 2, \qquad \sum_{k=0}^{N} n_k \leq C\mu^{-1}\ln\mu^{-1},$$

hold true.


7.5 - Construction of a pseudo-orbit 177<br />

Figure 7: Interval Jδ (projection of BK onto T)

We will see that this proposition holds only if we can choose the integers $n_k$, $k \in \mathbb{Z}$, as large as $\mu^{-1/2}\ln\mu^{-1}$, so this explains the choice of

$$A = \{[\ln\mu^{-1}], \dots, 2[\mu^{-1/2}\ln\mu^{-1}]\}.$$

The upper bound on the sum of the integers nk needed to produce a drift of order one<br />

is basically the "time of diffusion" in this context.<br />

In the sequel, we shall only explain the construction of the positive sequence<br />

(nk, θk, Ik)k≥0, since, of course, the construction of the negative sequence (nk, θk, Ik)k≤−1<br />

is completely similar.<br />

7.5.0.4. Let us now describe the construction of our orbit. Throughout this section we will need two real numbers $0 < K < K' < \sqrt{2}/2$. First, we fix $K$ such that $0 < K < \sqrt{2}/2$ and we define the non-empty domain

$$B_K = \{(\theta, I) \in A \mid \cos 2\pi\theta \geq K,\ \sin 2\pi\theta \leq -K\}.$$

We shall only need the first condition $\cos 2\pi\theta \geq K$ in this section; the other condition $\sin 2\pi\theta \leq -K$ will be used in the next section.

One can also write

$$B_K = \left\{(\theta, I) \in A \ \Big|\ -\frac{\arccos K}{2\pi} \leq \theta \leq -\frac{\arcsin K}{2\pi} \quad [\mathbb{Z}]\right\},$$

that is $B_K = J_\delta \times \mathbb{R}$ (see figure 7), where $J_\delta$ is an interval of $\mathbb{T}$ of length

$$\delta = \frac{\arccos K}{2\pi} - \frac{\arcsin K}{2\pi}.$$

Now $K$ being fixed, we shall also consider $K' \in\ ]K, \sqrt{2}/2[$ such that $B_{K'} = J_{\delta/2} \times \mathbb{R}$, where $J_{\delta/2}$ is an interval of length $\delta/2$. Now recall that we have $f_n = V \circ T^n$ with

$$V(\theta, I) = (\theta,\ I + 2\mu\cos 2\pi\theta), \qquad T(\theta, I) = (\theta + I,\ I), \qquad (\theta, I) \in A,$$

and by definition, the map $V$ leaves the set $B_K$ invariant and produces in $B_K$ a drift in the $I$-direction which is at least equal to $2\mu K$ by the condition $\cos 2\pi\theta \geq K$. Therefore, to prove the first part of our proposition, we will construct a sequence of points $(\theta_k, I_k)_{k \in \mathbb{Z}}$ in $A$ for which we can find a sequence of integers $(n_k)_{k \in \mathbb{Z}}$ in $A$ such that

$$T^{n_k}(\theta_k, I_k) \in B_K. \tag{68}$$

Indeed, in this case $(\theta_{k+1}, I_{k+1}) = f_{n_k}(\theta_k, I_k) = V \circ T^{n_k}(\theta_k, I_k)$ satisfies

$$I_{k+1} - I_k \geq 2\mu K.$$

Then to prove our second part, we shall need estimates on these integers $n_k$, $k \in \mathbb{Z}$. In fact the relation (68) can be written as

$$R^{n_k}_{I_k}(\theta_k) = \theta_k + n_k I_k \in J_\delta,$$

where $R_I$ is the rotation on the circle $\mathbb{T}$ of angle $I$. Most of the time, that is if either $I_k$ is irrational or if $I_k = p/q$ with $|q| \geq \delta^{-1}$, this will be realized, and the integers $n_k$, $k \in \mathbb{Z}$, can be estimated. This is an easy consequence of the "ergodization" theorem recalled below, which we shall use crucially in our construction.
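The two descriptions of $B_K$ (by the conditions on $\cos 2\pi\theta$ and $\sin 2\pi\theta$, and as the interval $J_\delta$) can be compared numerically. The sketch below is an illustration with hypothetical function names; it only uses the formulas displayed above, for some $0 < K < \sqrt{2}/2$.

```python
import math

def in_BK_trig(theta, K):
    """Membership in B_K through the conditions cos >= K and sin <= -K."""
    x = 2.0 * math.pi * theta
    return math.cos(x) >= K and math.sin(x) <= -K

def in_BK_interval(theta, K):
    """Membership in B_K through the interval J_delta, angles taken modulo 1."""
    lo = -math.acos(K) / (2.0 * math.pi)
    hi = -math.asin(K) / (2.0 * math.pi)
    t = theta % 1.0            # representative in [0, 1)
    if t > 0.5:
        t -= 1.0               # representative in (-1/2, 1/2]
    return lo <= t <= hi

def delta(K):
    """Length of the interval J_delta."""
    return (math.acos(K) - math.asin(K)) / (2.0 * math.pi)
```

For $K = 0.3$, for instance, the two membership tests agree on a fine grid of angles, and `delta(K)` is the length of the corresponding interval; as $K$ tends to $\sqrt{2}/2$ this length tends to zero.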

7.5.0.5. For $I \in \mathbb{R}$, consider the rotation $R_I(\theta) = \theta + I$ defined on the circle $\mathbb{T}$. Given $0 < \delta < 1$, let $\mathcal{J}_\delta$ be the set of intervals of $\mathbb{T}$ of length $\delta$. We define the $\delta$-ergodization time $N(I, \delta) \leq +\infty$ by

$$N(I, \delta) = \inf\{n \in \mathbb{N} \mid \{\theta, \dots, R^n_I(\theta)\} \cap J \neq \emptyset,\ \forall\theta \in \mathbb{T},\ \forall J \in \mathcal{J}_\delta\},$$

or equivalently

$$N(I, \delta) = \inf\Big\{n \in \mathbb{N} \ \Big|\ \bigcup_{0 \leq k \leq n} R^{-k}_I(J) = \mathbb{T},\ \forall J \in \mathcal{J}_\delta\Big\}.$$

One can easily see that $N(I, \delta) < +\infty$ except if $I$ is a rational number with a denominator smaller than $\delta^{-1}$, but it is more difficult to prove that when it is defined, this number is essentially given by the inverse of the distance to these "bad" rationals. This is the content of the theorem below, which is due to Berti, Biasco and Bolle ([BBB03]).

Theorem 7.6 (Berti-Bolle-Biasco). There exists a positive constant $M$ such that if

$$R_\delta = \{p/q \in \mathbb{Q} \mid |q| \leq M\delta^{-1}\},$$

then for $I \in \mathbb{R} \setminus R_\delta$,

$$N(I, \delta/2) \leq d(I, R_\delta)^{-1}.$$

This is a consequence of Theorem 4.2 in [BBB03] (see also the estimate (5.3) in [BBB03]), where the above statement is proved in both the continuous and the multidimensional cases. Of course one can give an explicit value for the numerical constant $M$ but this will not be useful.

In fact, most of the time we shall use this result in the following form.

Lemma 7.7. Let $I \in \mathbb{R} \setminus R_\delta$, $\theta \in \mathbb{T}$ and $J \subseteq \mathbb{T}$ any interval of length $\delta/2$. Then for any $m \in \mathbb{N}$, one can find an integer

$$n \in [m,\ m + d(I, R_\delta)^{-1}]$$

such that $\theta + nI \in J$.
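For small examples the ergodization time can be computed by brute force, which may help to see why rationals with small denominators are "bad". The sketch below is a naive illustration, not the method of [BBB03]: it assumes closed intervals, so that every interval of length $\delta$ meets the orbit segment exactly when the largest gap between consecutive orbit points is at most $\delta$, and it uses exact rational arithmetic. The function names and the cut-off `n_max` are assumptions.

```python
from fractions import Fraction

def max_gap(points):
    """Largest gap between consecutive points of a finite subset of the circle R/Z."""
    pts = sorted(set(p % 1 for p in points))
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(pts[0] + 1 - pts[-1])          # gap wrapping around the circle
    return max(gaps)

def ergodization_time(I, delta, n_max=10_000):
    """Smallest n such that {0, I, ..., nI} meets every closed interval of length delta.

    Returns None if no such n <= n_max is found (for instance when I is a rational
    whose denominator is smaller than 1/delta).
    """
    for n in range(n_max + 1):
        if max_gap([k * I for k in range(n + 1)]) <= delta:
            return n
    return None
```

With $I = 1/3$ the orbit is the set $\{0, 1/3, 2/3\}$, so intervals of length $1/2$ are all reached after two iterations, while intervals of length $1/4$ are never all reached: the denominator $3$ is smaller than $\delta^{-1} = 4$.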



7.5.0.6. In order to construct our orbit, now we know that we have to consider the distance to this set of "bad rationals" $R_\delta$.

So given $\mu > 0$, we first define the domain of "fast drift"

$$D_F(\mu) = \left\{(\theta, I) \in A \ \Big|\ d(I, R_\delta) \geq \frac{1}{\ln\mu^{-1}}\right\},$$

the domain of "slow drift"

$$D_S(\mu) = \left\{(\theta, I) \in A \ \Big|\ \frac{\mu^{1/2}}{\ln\mu^{-1}} < d(I, R_\delta) < \frac{1}{\ln\mu^{-1}}\right\},$$

and finally the domain of "resonances"

$$D_R(\mu) = \left\{(\theta, I) \in A \ \Big|\ d(I, R_\delta) \leq \frac{\mu^{1/2}}{\ln\mu^{-1}}\right\}.$$

Obviously we have

$$A = D_F(\mu) \cup D_S(\mu) \cup D_R(\mu).$$

But in the sequel, it will be more convenient to further decompose these sets. Indeed, for $p/q \in R_\delta$, if we define

$$D_S(\mu, p/q) = \left\{(\theta, I) \in A \ \Big|\ \frac{\mu^{1/2}}{\ln\mu^{-1}} < |I - p/q| < \frac{1}{\ln\mu^{-1}}\right\}$$

and

$$D_R(\mu, p/q) = \left\{(\theta, I) \in A \ \Big|\ |I - p/q| \leq \frac{\mu^{1/2}}{\ln\mu^{-1}}\right\},$$

then, for $\mu$ small enough, one has

$$D_S(\mu) = \bigcup_{p/q \in R_\delta} D_S(\mu, p/q), \qquad D_R(\mu) = \bigcup_{p/q \in R_\delta} D_R(\mu, p/q),$$

and

$$A = D_F(\mu) \cup \Big(\bigcup_{p/q \in R_\delta} D_S(\mu, p/q)\Big) \cup \Big(\bigcup_{p/q \in R_\delta} D_R(\mu, p/q)\Big). \tag{69}$$

From now on we will assume that $\mu$ is sufficiently small, with respect to $\delta$ and $M$ which are fixed, so the above decomposition (69) is in fact a partition of $A$ (since the set $R_\delta$ is discrete) and moreover the following properties hold true: for any $(\theta, I) \in D_S(\mu)$, there exists a unique $p/q \in R_\delta$ such that $(\theta, I) \in D_S(\mu, p/q)$, and if

$$m = \inf\{n \in \mathbb{N}^* \mid f_n(\theta, I) \notin D_S(\mu, p/q)\},$$

then $m$ is well defined and necessarily the point $(\theta_m, I_m) = f_m(\theta, I)$ does not belong to $D_S(\mu)$ (either it is in $D_R(\mu)$ or $D_F(\mu)$, since the size of the jump, which is of order $\mu$, is much smaller than the width of the connected components of $D_R(\mu)$ or $D_F(\mu)$). We also ask the same for a point $(\theta, I) \in D_R(\mu)$: under the iteration of a map $f_n$, the first time it escapes the domain $D_R(\mu, p/q)$, it also escapes the domain $D_R(\mu)$ (in fact it enters into the domain $D_S(\mu)$). This situation is depicted in figure 8.

Figure 8: Domains $D_F(\mu)$, $D_S(\mu)$ and $D_R(\mu)$.
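The three domains only depend on the distance from $I$ to the set $R_\delta$, so a point can be classified with a few lines of code. In the sketch below, the integer `q_max` is a hypothetical stand-in for the bound $M\delta^{-1}$ on the denominators (the constant $M$ is not made explicit in the text), and the function names are assumptions.

```python
import math

def dist_to_bad_rationals(I, q_max):
    """Distance from I to the rationals p/q with 1 <= q <= q_max."""
    return min(abs(I - round(I * q) / q) for q in range(1, q_max + 1))

def classify(I, mu, q_max):
    """Return 'fast', 'slow' or 'resonant' according to the domain containing (theta, I)."""
    log_inv = math.log(1.0 / mu)
    d = dist_to_bad_rationals(I, q_max)
    if d >= 1.0 / log_inv:
        return "fast"
    if d > math.sqrt(mu) / log_inv:
        return "slow"
    return "resonant"
```

For example, with `mu = 1e-4` and `q_max = 2`, the thresholds are roughly $0.109$ and $0.0011$, so an action at distance $0.25$ from the nearest bad rational is in the fast domain, one at distance $0.05$ is in the slow domain, and one at distance $0.0005$ is resonant.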

7.5.0.7. The construction of our orbit will be inductive, and we will start with a<br />

point (θ0, I0) ∈ DF(µ). Then we have the following easy application of the ergodization<br />

theorem.<br />

Lemma 7.8. Let $(\theta, I) \in D_F(\mu)$. There exists an integer

$$n \in \{[\ln\mu^{-1}], \dots, 2[\ln\mu^{-1}]\}$$

such that $T^n(\theta, I) \in B_{K'}$.

Proof. Recall that $T^n(\theta, I) \in B_{K'}$ if and only if $\theta + nI \in J_{\delta/2}$. Now by definition of $D_F(\mu)$,

$$d(I, R_\delta) \geq \frac{1}{\ln\mu^{-1}},$$

so the conclusion follows from Lemma 7.7 by choosing $m = [\ln\mu^{-1}]$.

7.5.0.8. As long as the orbit stays in $D_F(\mu)$, we can use the previous lemma. Then it will enter into the domain of slow drift, and this is where the ergodization theorem gives integers as large as $\mu^{-1/2}\ln\mu^{-1}$. However in the lemma below we will see how, after iterating a finite number of maps $f_{n_k}$, $n_k \in A$, with $\sum n_k$ of order $\mu^{-1}\ln\mu^{-1}$, one can actually cross through this domain of slow drift.

Lemma 7.9. Let $(\theta, I) \in D_S(\mu)$. There exist an index $j \in \mathbb{N}$ and integers $n_0, \dots, n_j$ satisfying:

(i) $n_k \in A$, for $0 \leq k \leq j$;

(ii) $f_{n_k} \circ \cdots \circ f_{n_0}(\theta, I) \in B_{K'}$, for $0 \leq k \leq j$;

(iii) $f_{n_j} \circ \cdots \circ f_{n_0}(\theta, I) \notin D_S(\mu)$.

Moreover, if $j \in \mathbb{N}^*$, then

(iv) $\sum_{k=0}^{j} n_k \leq (j + 1)\ln\mu^{-1} + (2K')^{-1}\mu^{-1}\ln\mu^{-1}$.

Let us point out that in the proof of Proposition 7.5 below, the above lemma will always be used with $j \geq 1$ so item (iv) will be available.

Proof. There exists a unique $p/q \in R_\delta$ such that $(\theta, I) \in D_S(\mu, p/q)$. By Lemma 7.7 (with $m = [\ln\mu^{-1}]$) and since

$$d(I, R_\delta)^{-1} = |I - p/q|^{-1} \leq \mu^{-1/2}\ln\mu^{-1},$$

we can find an integer $n_0 \in A$ such that $f_{n_0}(\theta, I) \in B_{K'}$. Now if $f_{n_0}(\theta, I) \notin D_S(\mu)$, then we can take $j = 0$ in the statement and assertions (i), (ii) and (iii) are proven.

Otherwise, we construct inductively $(n_k)_{1 \leq k \leq j}$ where $j \in \mathbb{N}^*$ is defined by

$$j = \inf\{k \in \mathbb{N}^* \mid f_{n_k} \circ \cdots \circ f_{n_0}(\theta, I) \notin D_S(\mu, p/q)\}.$$

Since our orbit always stays in $B_{K'}$, $j$ is obviously well-defined, and at each step we have used Lemma 7.7 so conditions (i) and (ii) are satisfied. Moreover by a previous remark we can also write

$$j = \inf\{k \in \mathbb{N}^* \mid f_{n_k} \circ \cdots \circ f_{n_0}(\theta, I) \notin D_S(\mu)\},$$

so

$$f_{n_j} \circ \cdots \circ f_{n_0}(\theta, I) \notin D_S(\mu),$$

and condition (iii) is also satisfied.

Finally, let us write $(\theta_0, I_0) = (\theta, I)$ and $(\theta_{k+1}, I_{k+1}) = f_{n_k} \circ \cdots \circ f_{n_0}(\theta_0, I_0)$, for $0 \leq k \leq j$. Since these points belong to $B_{K'}$ we have $I_{k+1} - I_k \geq 2\mu K'$ for $0 \leq k \leq j$ and hence

$$\sum_{k=0}^{j} n_k \leq \sum_{k=0}^{j} \ln\mu^{-1} + \sum_{k=0}^{j} \frac{1}{|I_k - p/q|} \leq (j + 1)\ln\mu^{-1} + (2K')^{-1}\mu^{-1}\sum_{k=0}^{j} \frac{I_{k+1} - I_k}{|I_k - p/q|}.$$

The second term on the right-hand side is a Riemann sum, hence for $\mu$ small enough it can be estimated by an integral, namely

$$\sum_{k=0}^{j} \frac{I_{k+1} - I_k}{|I_k - p/q|} \leq 2\int_{\frac{\mu^{1/2}}{\ln\mu^{-1}} \leq |I - p/q| \leq \frac{1}{\ln\mu^{-1}}} \frac{dI}{|I - p/q|} = 2\int_{\frac{\mu^{1/2}}{\ln\mu^{-1}}}^{\frac{1}{\ln\mu^{-1}}} \frac{dI}{I},$$

and as

$$2\int_{\frac{\mu^{1/2}}{\ln\mu^{-1}}}^{\frac{1}{\ln\mu^{-1}}} \frac{dI}{I} = 2\ln\mu^{-1/2} \leq \ln\mu^{-1},$$

this eventually gives

$$\sum_{k=0}^{j} n_k \leq (j + 1)\ln\mu^{-1} + (2K')^{-1}\mu^{-1}\ln\mu^{-1}.$$

This is exactly (iv), and so this ends the proof.
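The Riemann sum estimate at the end of the proof can be tested numerically. The sketch below is a toy check under stated assumptions: the distance $x = |I - p/q|$ is assumed to increase from the inner edge to the outer edge of the slow zone by the constant step $2\mu K'$ (the smallest jump allowed in $B_{K'}$), and the function name is hypothetical.

```python
import math

def crossing_sum(mu, K_prime):
    """Riemann sum of the terms (I_{k+1} - I_k) / |I_k - p/q| across the slow zone."""
    log_inv = math.log(1.0 / mu)
    a = math.sqrt(mu) / log_inv      # inner edge of the slow zone
    b = 1.0 / log_inv                # outer edge of the slow zone
    step = 2.0 * mu * K_prime        # constant jump in I (assumption)
    x, total = a, 0.0
    while x < b:
        total += step / x            # one term of the Riemann sum
        x += step
    return total, math.log(b / a), log_inv

total, integral, bound = crossing_sum(1e-4, 0.6)
```

Here `integral` is $\ln\mu^{-1/2}$, the value of $\int dI/I$ over the zone, and `bound` is $\ln\mu^{-1}$. Since $1/I$ is decreasing, the left Riemann sum `total` lies slightly above `integral`, and it stays below `bound`, in agreement with the factor $2$ used in the proof.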



7.5.0.9. Now that we have escaped the domain of slow drift, we are in the resonant domain. Here we cannot use any ergodization result. However our point belongs to $B_{K'}$, and in the lemma below we will show that by iterating $j$ times a map $f_n$, with $n$ of order $\ln\mu^{-1}$, it can cross the resonant domain while staying in the larger domain $B_K$.

Lemma 7.10. Let $(\theta, I) \in D_R(\mu) \cap B_{K'}$. If $\mu$ is small enough, there exist integers

$$n \in \{[\ln\mu^{-1}], \dots, 2[\ln\mu^{-1}]\}$$

and $j \in \mathbb{N}^*$ such that:

(i) $f^k_n(\theta, I) \in D_R(\mu) \cap B_K$, for $0 \leq k \leq j - 1$;

(ii) $f^j_n(\theta, I) \notin D_R(\mu)$.

Note that it is possible that the point $f^j_n(\theta, I)$ does not belong to $B_K$ either, but then we are back in the domain of slow drift and we can find an integer $n' \in A$ such that $T^{n'}(f^j_n(\theta, I)) \in B_{K'} \subseteq B_K$.

Proof. There exists a unique $p/q \in R_\delta$ such that $(\theta, I) \in D_R(\mu, p/q) \cap B_{K'}$. We choose $\mu$ small enough so

$$M\delta^{-1} \leq \ln\mu^{-1},$$

and as $q \leq M\delta^{-1}$, we can find an integer $d$ such that $n = dq$ satisfies

$$n \in \{[\ln\mu^{-1}], \dots, 2[\ln\mu^{-1}]\}.$$

We will also add a further smallness condition on $\mu$ by requiring that

$$\frac{2}{K\ln\mu^{-1}} < \frac{1}{2\pi}\max\{\arccos K - \arccos K',\ \arcsin K' - \arcsin K\}.$$

Let us define inductively

$$f^k_n(\theta, I) = (\theta_k, I_k), \qquad 0 \leq k \leq j,$$

where $j$ is defined as follows: $j = \inf\{j_1, j_2\}$ with

$$j_1 = \inf\{k \in \mathbb{N}^* \mid (\theta_k, I_k) \notin D_R(\mu, p/q)\}$$

and

$$j_2 = \left[\big(K\mu^{1/2}\ln\mu^{-1}\big)^{-1}\right] + 1.$$

It follows from the definition of $j$ that $(\theta_k, I_k) \in D_R(\mu)$, for $0 \leq k \leq j - 1$, so now let us show that $(\theta_k, I_k) \in B_K$.

In fact we will prove below by induction on $k$ that

$$-\frac{\arccos K'}{2\pi} - \frac{kn\mu^{1/2}}{\ln\mu^{-1}} \leq \theta_k \leq -\frac{\arcsin K'}{2\pi} + \frac{kn\mu^{1/2}}{\ln\mu^{-1}}, \qquad 0 \leq k \leq j - 1. \tag{70}$$

Since $n \leq 2\ln\mu^{-1}$ this implies

$$-\frac{\arccos K'}{2\pi} - 2k\mu^{1/2} \leq \theta_k \leq -\frac{\arcsin K'}{2\pi} + 2k\mu^{1/2},$$

and as $k \leq j_2$, then

$$-\frac{\arccos K'}{2\pi} - \frac{2}{K\ln\mu^{-1}} \leq \theta_k \leq -\frac{\arcsin K'}{2\pi} + \frac{2}{K\ln\mu^{-1}},$$

and therefore

$$-\frac{\arccos K}{2\pi} \leq \theta_k \leq -\frac{\arcsin K}{2\pi}.$$

This means that $\cos 2\pi\theta_k \geq K$ and $\sin 2\pi\theta_k \leq -K$, and therefore this gives $(\theta_k, I_k) \in B_K$ for $0 \leq k \leq j - 1$.

So now let us go through the induction, that is let us prove (70). This is obviously true for $k = 0$, so let us assume it is satisfied for some $0 \leq k \leq j - 2$. We have

$$\theta_{k+1} = \theta_k + nI_k,$$

but since $I_k \geq p/q - \mu^{1/2}/\ln\mu^{-1}$ and $n = dq$ then

$$\theta_{k+1} \geq \theta_k + dp - \frac{n\mu^{1/2}}{\ln\mu^{-1}} = \theta_k - \frac{n\mu^{1/2}}{\ln\mu^{-1}} \quad [\mathbb{Z}],$$

as $dp$ is an integer. Then, using the hypothesis of induction this gives

$$\theta_{k+1} \geq -\frac{\arccos K'}{2\pi} - \frac{kn\mu^{1/2}}{\ln\mu^{-1}} - \frac{n\mu^{1/2}}{\ln\mu^{-1}} \geq -\frac{\arccos K'}{2\pi} - \frac{(k + 1)n\mu^{1/2}}{\ln\mu^{-1}}.$$

Similarly using the fact that $I_k \leq p/q + \mu^{1/2}/\ln\mu^{-1}$ one obtains

$$\theta_{k+1} \leq -\frac{\arcsin K'}{2\pi} + \frac{(k + 1)n\mu^{1/2}}{\ln\mu^{-1}},$$

therefore (70) is proven and so is (i).

Now assume that $j = j_2 < j_1$, then since $(\theta_k, I_k) \in B_K$ for $0 \leq k \leq j - 1$,

$$I_{j_2} \geq I_0 + 2j_2 K\mu > I_0 + \frac{2\mu^{1/2}}{\ln\mu^{-1}} > p/q + \frac{\mu^{1/2}}{\ln\mu^{-1}},$$

which means that $j_2 \geq j_1$. This is absurd, therefore $j = j_1$, so $(\theta_j, I_j) \notin D_R(\mu, p/q)$ and this means that $(\theta_j, I_j) \notin D_R(\mu)$. This proves (ii).

7.5.0.10. Now we can finally conclude the proof of Proposition 7.5.<br />

Proof of Proposition 7.5. We choose any point (θ0, I0) ∈ DF(µ). Applying successively<br />

Lemma 7.8, Lemma 7.9, Lemma 7.10 and Lemma 7.9 once again, in this precise order,<br />

one obtains a positive sequence (nk, θk, Ik)k∈N ∈ A × A which gives a positive orbit for<br />

our polysystem, that is<br />

\[ f_{n_k}(\theta_k, I_k) = (\theta_{k+1}, I_{k+1}), \qquad k \in \mathbb{N}. \]

Note that since we have started in DF(µ), at each time Lemma 7.9 is applied with an<br />

integer j ≥ 1.


Now by construction, $T^{n_k}(\theta_k, I_k) \in B_K$ for any $k \in \mathbb{N}$, and since
\[ (\theta_{k+1}, I_{k+1}) = V \circ T^{n_k}(\theta_k, I_k), \]
by definition of $B_K$ one obtains
\[ I_{k+1} \ge I_k + 2\mu K. \]
This clearly shows that
\[ \lim_{k \to +\infty} I_k = +\infty, \]
which proves the first part of the statement.

Then as
\[ I_k - I_0 \ge 2k\mu K, \]
if we set $N = [(\mu K)^{-1}] + 1$,
\[ I_N - I_0 \ge 2. \]
It remains to estimate the sum of integers
\[ S = \sum_{k=0}^{N} n_k. \]
To do that, we will write
\[ \sigma_1 = \{k \in [0, N] \mid (\theta_k, I_k) \in D_F(\mu) \cup D_R(\mu)\} \]
and
\[ \sigma_2 = \{k \in [0, N] \mid (\theta_k, I_k) \in D_S(\mu)\}, \]
so
\[ S = \sum_{k \in \sigma_1} n_k + \sum_{k \in \sigma_2} n_k = S_1 + S_2. \]

For each $k \in \sigma_1$, we know that $n_k \le 2 \ln \mu^{-1}$ and hence
\[ S_1 \le 2 \sum_{k \in \sigma_1} \ln \mu^{-1} \le 2N \ln \mu^{-1} \le 4K^{-1} \mu^{-1} \ln \mu^{-1}. \]

To estimate S2, first observe that the set Rδ is discrete, so its intersection with the<br />

compact set {I0 ≤ I ≤ IN} is finite. But the latter set is included in {I0 ≤ I ≤ I0 + 3},<br />

so the constant<br />

\[ M = \mathrm{card}\{p/q \in R_\delta \mid I_0 \le p/q \le I_0 + 3\} \]
is independent of $\mu$. Then setting
\[ \sigma_2(p/q) = \{k \in [0, N] \mid (\theta_k, I_k) \in D_S(\mu, p/q)\}, \]
we obtain
\[ S_2 = \sum_{k \in \sigma_2} n_k \le M \sum_{k \in \sigma_2(p/q)} n_k. \]
Now each $\sigma_2(p/q)$ is not reduced to a point, so by Lemma 7.9
\[ \sum_{k \in \sigma_2(p/q)} n_k \le N \ln \mu^{-1} + (2K')^{-1} \mu^{-1} \ln \mu^{-1} \le 2K^{-1} \mu^{-1} \ln \mu^{-1} + (2K')^{-1} \mu^{-1} \ln \mu^{-1} \le \left( 2K^{-1} + (2K')^{-1} \right) \mu^{-1} \ln \mu^{-1}. \]
This finally gives
\[ S \le (4K^{-1}) \mu^{-1} \ln \mu^{-1} + M \left( 2K^{-1} + (2K')^{-1} \right) \mu^{-1} \ln \mu^{-1} \le C \mu^{-1} \ln \mu^{-1}, \]
with $C = 2 \sup\{4K^{-1}, M(2K^{-1} + (2K')^{-1})\}$, and this proves the proposition.

7.6 Proof of Theorem 7.2<br />

7.6.0.1. In this section we will prove Theorem 7.2, and in view of Proposition 7.2, this<br />

will follow easily from the next result.<br />

Proposition 7.6. There exists a positive constant $C$ such that for $\mu$ small enough, there exists an orbit $(\sigma^k(\bar n), x_k)_{k \in \mathbb{Z}} \in \Sigma_{\mathcal{A}} \times \mathbb{A}$ for the skew-product $[[F_{\bar n}]]_{n \in \mathcal{A}}$ defined by
\[ F_{\bar n}(\theta, I) = \left( \theta + n_0 I,\ I + 2\mu f_1(\varphi_{\bar n}(\theta, I)) \cos 2\pi(\theta + n_0 I) \right), \qquad \bar n \in \Sigma_{\mathcal{A}},\ (\theta, I) \in \mathbb{A}, \]
such that
\[ \lim_{k \to \pm\infty} I'_k = \pm\infty \]
and the estimates
\[ |I'_N - I'_0| \ge 1, \qquad \sum_{k=0}^{N} n_k \le C \mu^{-1} \ln \mu^{-1}, \]
hold true.

The strategy will be to consider the orbit constructed in Proposition 7.5 as a pseudo-orbit for the skew-product. So in order to find a true orbit nearby, we shall need some

hyperbolicity and this will be described below.<br />

7.6.0.2. Recall that<br />

BK = {(θ, I) ∈ A | cos 2πθ ≥ K, sin 2πθ ≤ −K}.<br />

The condition sin 2πθ ≤ −K will be used here to prove the following lemma.<br />

Lemma 7.11. Let $x \in \mathbb{A}$ be such that $T^n(x) \in B_K$, for $n \in \mathcal{A}$, and let $\tilde x \in \mathbb{R}^2$ be a lift of $x$. Then the eigenvalues $\lambda_\pm$ of $df_n(\tilde x)$ are real and for $\mu$ small enough, they satisfy
\[ \lambda_+ > 1 + 2\sqrt{n\pi\mu K} > 1, \qquad \lambda_- < 1 - 2\sqrt{n\pi\mu K} + 4n\pi\mu K < 1. \]
Moreover, if $e_\pm \in \mathbb{R}^2$ are eigenvectors associated to $\lambda_\pm$, and for $v \in \mathbb{R}^2$, $v = v_+ e_+ + v_- e_-$, then
\[ \tfrac{1}{2} |v| \le \sup\{|v_+|, |v_-|\} \le a|v|, \]
with $a = O\left( \mu^{-3/4} (\ln \mu^{-1})^{1/2} \right)$, and where $| \cdot |$ is the supremum norm on $\mathbb{R}^2$.



Proof. Let us write $x = (\theta, I) \in \mathbb{A}$ and $\tilde x = (\tilde\theta, I) \in \mathbb{R}^2$. If we define
\[ s = -2\pi\mu \sin 2\pi(\theta + nI), \]
then $s > 2\pi\mu K$ since $T^n(\theta, I) = (\theta + nI, I) \in B_K$. As
\[ f_n(\theta, I) = (\theta + nI,\ I + 2\mu \cos 2\pi(\theta + nI)), \qquad n \in \mathcal{A},\ (\theta, I) \in \mathbb{A}, \]
we have
\[ df_n(\tilde\theta, I) = \begin{pmatrix} 1 & n \\ 2s & 1 + 2ns \end{pmatrix} \in M_2(\mathbb{R}). \]
Then the eigenvalues are real and given by
\[ \lambda_\pm = 1 + ns \pm \sqrt{ns(2 + ns)}. \]
Therefore we easily obtain
\[ \lambda_+ > 1 + \sqrt{2ns} > 1 + 2\sqrt{n\pi\mu K}, \]
and then using the equality $\lambda_+ \lambda_- = 1$, for $\mu$ small enough one finds
\[ \lambda_- < 1 - 2\sqrt{n\pi\mu K} + 4n\pi\mu K. \]
This proves the first part of the statement.

Then if we define
\[ \alpha_\pm = n^{-1}(\lambda_\pm - 1) = s \pm n^{-1}\sqrt{ns(2 + ns)}, \]
one can easily check that the vectors
\[ e_\pm = \begin{pmatrix} 1 \\ \alpha_\pm \end{pmatrix} \in \mathbb{R}^2 \]
are eigenvectors associated to $\lambda_\pm$. Since $|e_+| = |e_-| = 1$, for $v \in \mathbb{R}^2$ if we write
\[ \|v\| = \sup\{|v_+|, |v_-|\}, \qquad v = v_+ e_+ + v_- e_-, \]
it is trivial that
\[ |v| \le |v_+| + |v_-| \le 2\|v\|. \]
As for the other inequality, note that
\[ |v| = \sup\{|v_+ + v_-|, |\alpha_+ v_+ + \alpha_- v_-|\}. \]
If we define
\[ r_\pm = \frac{\alpha_+ \alpha_- - 1 \pm (\alpha_+ - \alpha_-)}{1 - \alpha_+^2}, \]
then one can check that
\[ |\alpha_+ v_+ + \alpha_- v_-| \le |v_+ + v_-| \]
if and only if $v_- = 0$, or $(v_-)^{-1} v_+ \le r_-$, or $(v_-)^{-1} v_+ \ge r_+$. Hence
\[ |v| = \begin{cases} |v_+ + v_-| & \text{if } v_- = 0, \text{ or } (v_-)^{-1} v_+ \le r_-, \text{ or } (v_-)^{-1} v_+ \ge r_+, \\ |\alpha_+ v_+ + \alpha_- v_-| & \text{if } r_- \le (v_-)^{-1} v_+ \le r_+. \end{cases} \]


If we study all the cases, one finds
\[ \|v\| \le \sup\left\{ 1,\ |1 + r_+|^{-1},\ |1 + r_+^{-1}|^{-1},\ |1 + r_-|^{-1},\ |1 + r_-^{-1}|^{-1} \right\} |v|. \]
Then we can estimate
\[ r_\pm = -1 \pm \sqrt{2 n^{-1} s} + o\left( \sqrt{n^{-1} \mu} \right), \]
and using the fact that $n \in \mathcal{A}$, one finds
\[ \|v\| \le a|v|, \]
with $a = O\left( \mu^{-3/4} (\ln \mu^{-1})^{1/2} \right)$.
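The eigenvalue estimates of Lemma 7.11 can be checked numerically on a single matrix. The sketch below is only an illustration: the function names and the sample values of $n$, $\mu$, $K$ and of the sine are hypothetical choices, not taken from the text. It builds the matrix of $df_n$ from $s$, evaluates the closed form for $\lambda_\pm$, and compares with the two bounds of the lemma.

```python
import math

def jacobian(n, s):
    """Matrix of df_n at a lift, written with s = -2*pi*mu*sin(2*pi*(theta + n*I))."""
    return [[1.0, float(n)], [2.0 * s, 1.0 + 2.0 * n * s]]

def eigenvalues(n, s):
    """Closed form lambda_± = 1 + n*s ± sqrt(n*s*(2 + n*s)), valid for s > 0."""
    root = math.sqrt(n * s * (2.0 + n * s))
    return 1.0 + n * s + root, 1.0 + n * s - root

def check(n, mu, K, sin_value):
    # sin_value stands for sin(2*pi*(theta + n*I)), which is <= -K on B_K
    # (hypothetical sample value chosen by the caller).
    s = -2.0 * math.pi * mu * sin_value
    lam_plus, lam_minus = eigenvalues(n, s)
    (a, b), (c, d) = jacobian(n, s)
    bound = 2.0 * math.sqrt(n * math.pi * mu * K)
    return {
        "det": a * d - b * c,                          # should be 1
        "trace_gap": (a + d) - (lam_plus + lam_minus),  # should vanish
        "product": lam_plus * lam_minus,               # should be 1
        "expanding": lam_plus > 1.0 + bound,
        "contracting": lam_minus < 1.0 - bound + 4.0 * n * math.pi * mu * K,
    }
```

For instance, `check(10, 1e-4, 0.5, -0.8)` evaluates the bounds for one admissible configuration; the determinant and the product of the eigenvalues both come out equal to 1 up to rounding, as the identity $\lambda_+\lambda_- = 1$ requires.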

7.6.0.3. Let us now describe an abstract fixed point theorem that will be used to find<br />

an orbit close to our pseudo-orbit.<br />

Consider a Banach space (E, | . |) and T : E → E a continuous linear map. Recall<br />

that the spectrum Sp(T) of T is the set of complex numbers λ such that TC − λIdC is<br />

not an automorphism of EC, where EC and TC are the complexifications of E and T.<br />

Given two real numbers $\kappa_s, \kappa_u$ satisfying $0 < \kappa_s < 1 < \kappa_u$, we say that $T$ is $(\kappa_s, \kappa_u)$-hyperbolic if
\[ \mathrm{Sp}(T) \cap \{\kappa_s \le |z| \le \kappa_u\} = \emptyset. \]
In such a case, there exists a $T$-invariant decomposition
\[ E = E_s \oplus E_u, \qquad T(E_s) \subseteq E_s, \quad T(E_u) \subseteq E_u, \]
and a constant $c > 0$ such that
\[ |(T|_{E_s})^n| \le c\,\eta_s^n, \qquad |(T|_{E_u})^{-n}| \le c\,\eta_u^{-n}, \]
for any $n \in \mathbb{N}$, $\eta_s < \kappa_s$, $\eta_u > \kappa_u$ and where $| \cdot |$ is the induced norm on linear operators. In fact, one can always find a norm $\| \cdot \|$ on $E$ which is adapted to $T$ in the following sense: $\| \cdot \|$ is equivalent to $| \cdot |$ and satisfies

(i) $\|x_s + x_u\| = \sup\{\|x_s\|, \|x_u\|\}$, $x_s \in E_s$, $x_u \in E_u$;

(ii) $\|T|_{E_s}\| \le \kappa_s$, $\|(T|_{E_u})^{-1}\| \le \kappa_u^{-1}$.

The following theorem is proved in [Yoc95], section 2.1.

Theorem 7.12. Let $T : E \to E$ be $(\kappa_s, \kappa_u)$-hyperbolic and let $U : E \to E$ be a Lipschitz map such that
\[ \varepsilon = \mathrm{Lip}(U - T) < \varepsilon_0 = \inf\{1 - \kappa_s, 1 - \kappa_u^{-1}\}. \]
Then $U$ has a unique fixed point $p \in E$, and if $\| \cdot \|$ is a norm adapted to $T$,
\[ \|p\| < (\varepsilon_0 - \varepsilon)^{-1} \|U(0)\|. \]
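To see Theorem 7.12 at work, here is a toy sketch in $\mathbb{R}^2$ with the supremum norm. Everything in it is a hypothetical choice made for illustration: the diagonal map $T = \mathrm{diag}(\kappa_s, \kappa_u)$ with $\kappa_s = 1/2$, $\kappa_u = 2$, and the perturbation `delta`, which is 0.1-Lipschitz. The fixed point of $U = T + \Delta$ is found by iterating the stable equation forward and the unstable equation backward, which is one standard way to obtain a contraction; it is not claimed to be the argument of [Yoc95].

```python
import math

KAPPA_S, KAPPA_U = 0.5, 2.0  # hypothetical hyperbolicity constants

def delta(x):
    """Hypothetical perturbation U - T; Lipschitz constant 0.1 for the sup norm."""
    xs, xu = x
    return (0.1 * math.sin(xu) + 0.05, 0.1 * math.cos(xs))

def U(x):
    """Perturbed map U = T + delta, with T = diag(KAPPA_S, KAPPA_U)."""
    ds, du = delta(x)
    return (KAPPA_S * x[0] + ds, KAPPA_U * x[1] + du)

def sup_norm(x):
    return max(abs(x[0]), abs(x[1]))

def fixed_point(iterations=200):
    """Iterate xs <- kappa_s*xs + delta_s(x) and xu <- (xu - delta_u(x)) / kappa_u.

    A fixed point of this contraction is exactly a fixed point of U.
    """
    xs, xu = 0.0, 0.0
    for _ in range(iterations):
        ds, du = delta((xs, xu))
        xs, xu = KAPPA_S * xs + ds, (xu - du) / KAPPA_U
    return xs, xu
```

Here $\varepsilon_0 = \inf\{1 - \kappa_s, 1 - \kappa_u^{-1}\} = 0.5$ and $\varepsilon = 0.1$, so the theorem predicts a fixed point of norm less than $\|U(0)\| / 0.4$, which can be compared with `sup_norm(fixed_point())`.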

7.6.0.4. Now we can prove Proposition 7.6.



Proof of Proposition 7.6. Consider the orbit $(\sigma^k(\bar n), \theta_k, I_k)_{k \in \mathbb{Z}} \in \Sigma_{\mathcal{A}} \times \mathbb{A}$ given by Proposition 7.5. Then the proof is an immediate consequence of the following claim: there exists a sequence $(\theta'_k, I'_k)_{k \in \mathbb{Z}} \in \mathbb{A}$ such that $(\sigma^k(\bar n), \theta'_k, I'_k)_{k \in \mathbb{Z}} \in \Sigma_{\mathcal{A}} \times \mathbb{A}$ is an orbit for the skew-product and
\[ |I_k - I'_k| \le \mu^2, \qquad k \in \mathbb{Z}. \]

So let us construct this orbit.<br />

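Let $x_k = (\theta_k, I_k)_{k \in \mathbb{Z}} \in \mathbb{A}$, and $\tilde x_k = (\tilde\theta_k, I_k)_{k \in \mathbb{Z}}$ one of its lifts in $\mathbb{R}^2$. First we define a linear map
\[ T : (\mathbb{R}^2)^{\mathbb{Z}} \longrightarrow (\mathbb{R}^2)^{\mathbb{Z}}, \qquad v \longmapsto T(v) \]
by
\[ (T(v))_k = df_{n_{k-1}}(\tilde x_{k-1}).v_{k-1}, \qquad k \in \mathbb{Z}. \]
Using the supremum norm on $\mathbb{R}^2$ let us define
\[ E = \Big\{ v = (v_k)_{k \in \mathbb{Z}} \in (\mathbb{R}^2)^{\mathbb{Z}} \ \Big|\ \sup_{k \in \mathbb{Z}} |v_k| < \infty \Big\}. \]
Then $E$ is obviously a Banach space with the norm
\[ |v| = \sup_{k \in \mathbb{Z}} |v_k|, \qquad v \in E. \]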

Now recall that for any $\tilde x = (\tilde\theta, I) \in \mathbb{R}^2$, if $s = -2\pi\mu \sin 2\pi(\tilde\theta + nI)$, then
\[ df_n(\tilde x) = \begin{pmatrix} 1 & n \\ 2s & 1 + 2ns \end{pmatrix} \in M_2(\mathbb{R}). \]
Since the norm on $M_2(\mathbb{R})$ induced by the supremum norm on $\mathbb{R}^2$ is given by the maximum of the sums of the absolute values of the elements in each row, we obtain
\[ \sup_{k \in \mathbb{Z}} |df_{n_k}|_{C^0(\mathbb{R}^2)} = \sup_{k \in \mathbb{Z}} \{1 + n_k\} = 1 + \mu^{-1/2} \ln \mu^{-1} < \infty, \]
then $T(E) \subseteq E$ and therefore $T$ is continuous. Moreover, by construction the sequence $(x_k)_{k \in \mathbb{Z}}$ satisfies $T^{n_k}(x_k) \in B_K$, so we can apply Lemma 7.11 and each map
\[ T_{k-1} : v_{k-1} \longmapsto df_{n_{k-1}}(\tilde x_{k-1}).v_{k-1} \]
is a $(\kappa_s^{k-1}, \kappa_u^{k-1})$-hyperbolic linear map of $\mathbb{R}^2$, with
\[ \kappa_s^k = 1 - 2\sqrt{n_k \pi\mu K} + 4 n_k \pi\mu K < 1, \qquad k \in \mathbb{Z}, \]
and
\[ \kappa_u^k = 1 + 2\sqrt{n_k \pi\mu K} > 1, \qquad k \in \mathbb{Z}. \]
This implies that $T$ is $(\kappa_s, \kappa_u)$-hyperbolic, with
\[ \kappa_s = \sup_{k \in \mathbb{Z}} \{\kappa_s^k\}, \qquad \kappa_u = \inf_{k \in \mathbb{Z}} \{\kappa_u^k\}. \]
Since $n_k \in \mathcal{A}$, one finds
\[ \kappa_s = 1 - 2\sqrt{\pi K} (\ln \mu^{-1})^{1/2} \mu^{1/4} + 4\pi K \ln \mu^{-1} \mu^{1/2} < 1, \]


and
\[ \kappa_u = 1 + 2\sqrt{\pi K} (\ln \mu^{-1})^{1/2} \mu^{1/2} > 1. \]
Let us set
\[ \varepsilon_0 = \inf\{1 - \kappa_s, 1 - \kappa_u^{-1}\} = O\left( \mu^{1/2} (\ln \mu^{-1})^{1/2} \right). \]
Now if
\[ \|v\| = \sup_{k \in \mathbb{Z}} \|v_k\|_k, \]
where $\| \cdot \|_k$ is a norm in $\mathbb{R}^2$ adapted to $T_k$, $k \in \mathbb{Z}$, then $\| \cdot \|$ is adapted to $T$, and from Lemma 7.11
\[ \tfrac{1}{2} |v| \le \|v\| \le a|v|, \qquad v \in E, \]
with $a = O\left( \mu^{-3/4} (\ln \mu^{-1})^{1/2} \right)$.

Next we define
\[ U : (\mathbb{R}^2)^{\mathbb{Z}} \longrightarrow (\mathbb{R}^2)^{\mathbb{Z}}, \qquad v \longmapsto U(v) \]
by
\[ (U(v))_k = F_{\sigma^{k-1}(\bar n)}(\tilde x_{k-1} + v_{k-1}) - \tilde x_k, \qquad k \in \mathbb{Z}. \]
Note that if $v$ is a fixed point of $U$, then
\[ F_{\sigma^{k-1}(\bar n)}(\tilde x_{k-1} + v_{k-1}) = \tilde x_k + v_k, \]
so $(\sigma^k(\bar n), x'_k)_{k \in \mathbb{Z}} \in \Sigma_{\mathcal{A}} \times \mathbb{A}$, where $x'_k \in \mathbb{A}$ is the projection onto $\mathbb{A}$ of $\tilde x'_k = \tilde x_k + v_k \in \mathbb{R}^2$, is an orbit for the skew-product. To prove that $U$ has a fixed point, we will use Theorem 7.12 and for that we need to estimate the Lipschitz constant of $\Delta = U - T$ with respect to the adapted norm $\| \cdot \|$.

First if $v, v' \in E$ satisfy $|v| \le \mu^2$, $|v'| \le \mu^2$, then from the Taylor formula one can compute
\[ |\Delta(v) - \Delta(v')| \le (L + 2M\mu^2) |v - v'|, \]
with
\[ L = \sup_{k \in \mathbb{Z}} \mathrm{Lip}\left( f_{n_k} - F_{\sigma^k(\bar n)} \right), \qquad M = \sup_{k \in \mathbb{Z}} |d^2 f_{n_k}|_{C^0(\mathbb{R}^2)}. \]
Using the estimates (56) obtained in Proposition 7.2, one finds
\[ L = O\left( \mu^{4\pi - 3/2} \ln \mu^{-1} \right), \]
whereas
\[ M = O\left( (\ln \mu^{-1})^2 \right) \]
is obvious from the definition of the maps $(f_n)_{n \in \mathcal{A}}$.

However, far away from 0, the maps $U$ and $T$ are not necessarily close, so for $v \in (\mathbb{R}^2)^{\mathbb{Z}}$ we define a modified map $\widetilde{U}(v)$ by
\[ \widetilde{U}(v) = \begin{cases} U(v) & \text{if } |v| \le \mu^2, \\ U(\mu^2 |v|^{-1} v) + (1 - \mu^2 |v|^{-1}) T(v) & \text{if } |v| \ge \mu^2. \end{cases} \]
Then setting $\widetilde{\Delta} = \widetilde{U} - T$, for any $v, v' \in E$ one easily obtains
\[ |\widetilde{\Delta}(v) - \widetilde{\Delta}(v')| \le 2(L + 2M\mu^2) |v - v'|, \]
and this gives
\[ \|\widetilde{\Delta}(v) - \widetilde{\Delta}(v')\| \le 4a(L + 2M\mu^2) \|v - v'\|. \]
In particular, this shows $\widetilde{U}(E) \subseteq E$, and the Lipschitz constant of $\widetilde{U} - T$ with respect to the adapted norm $\| \cdot \|$ is
\[ \varepsilon = \mathrm{Lip}(\widetilde{U} - T) \le 4a(L + 2M\mu^2) = O\left( \mu^{5/4} (\ln \mu^{-1})^{5/2} \right). \]
As
\[ \varepsilon = O\left( \mu^{5/4} (\ln \mu^{-1})^{5/2} \right) < \varepsilon_0 = O\left( \mu^{1/2} (\ln \mu^{-1})^{1/2} \right) \]
we can finally apply Theorem 7.12: $\widetilde{U}$ has a unique fixed point $v' \in E$ such that
\[ \|v'\| \le (\varepsilon_0 - \varepsilon)^{-1} \|\widetilde{U}(0)\|, \]
and hence, since $\widetilde{U}(0) = U(0)$,
\[ |v'| \le 2a(\varepsilon_0 - \varepsilon)^{-1} |U(0)|. \]
Note that
\[ |U(0)| = \sup_{k \in \mathbb{Z}} \{|F_{\sigma^k(\bar n)}(\tilde x_k) - \tilde x_{k+1}|\} = \sup_{k \in \mathbb{Z}} \{|F_{\sigma^k(\bar n)}(\tilde x_k) - f_{n_k}(\tilde x_k)|\}. \]
Using the estimates (56) in Proposition 7.2,
\[ \sup_{k \in \mathbb{Z}} |F_{\sigma^k(\bar n)}(\tilde x_k) - f_{n_k}(\tilde x_k)| = O\left( \mu^{4\pi - 1} \right) \]
and as $2a(\varepsilon_0 - \varepsilon)^{-1} = O\left( \mu^{-5/4} \right)$, we have in particular
\[ |v'| \le \mu^2. \tag{71} \]
By definition of $\widetilde{U}$, this shows that $v'$ is in fact a fixed point of $U$, and therefore the sequence $(\sigma^k(\bar n), x'_k)_{k \in \mathbb{Z}} \in \Sigma_{\mathcal{A}} \times \mathbb{A}$, where $x'_k$ is the projection onto $\mathbb{A}$ of
\[ \tilde x'_k = \tilde x_k + v'_k = (\tilde\theta'_k, I'_k), \qquad k \in \mathbb{Z}, \]
is an orbit for the skew-product. If we set $x'_k = (\theta'_k, I'_k) \in \mathbb{A}$, then from (71) we obtain
\[ |I_k - I'_k| \le \mu^2, \qquad k \in \mathbb{Z}. \]
This concludes the proof.

7.6.0.5. Now we can finally prove Theorem 7.2.

Proof of Theorem 7.2. Let $(\sigma^k(\bar n), \theta'_k, I'_k)_{k \in \mathbb{Z}} \in \Sigma_{\mathcal{A}} \times \mathbb{A}$ be the orbit for the skew-product obtained in Proposition 7.6. Then, by Proposition 7.2,
\[ \Upsilon_\mu(n_k, \theta'_k, I'_k) = (\tau_k, e_k, \theta'_k, I'_k) \in E \times \mathbb{A}, \qquad k \in \mathbb{Z}, \]
and in the original coordinates
\[ \Psi^{-1}(\tau_k, e_k, \theta'_k, I'_k) = (\theta_1^k, I_1^k, \theta_2^k, I_2^k) \in \mathbb{A}^2, \qquad k \in \mathbb{Z}, \]
is an orbit for the transversal map $\tilde F_\mu$. By definition of the latter one has
\[ F_\mu^{n_k}(\theta_1^k, I_1^k, \theta_2^k, I_2^k) = (\theta_1^{k+1}, I_1^{k+1}, \theta_2^{k+1}, I_2^{k+1}), \qquad k \in \mathbb{Z}, \]
and since $I'_k = I_2^k$ for $k \in \mathbb{Z}$, the theorem follows.



7.A Time-energy coordinates for the pendulum<br />

In this appendix, we recall some elementary facts about the time-energy coordinates for<br />

the simple pendulum. We refer to [MS04], [Mar05] and [LM05] for more details.<br />

7.A.0.1. Consider the simple pendulum defined by the Hamiltonian
\[ P(\theta, I) = \tfrac{1}{2} I^2 + \cos 2\pi\theta, \]
and the open domain
\[ E = \{(\theta, I) \in \mathbb{A} \mid 0 < \theta < 1,\ I > 0\}. \]
We define the energy function
\[ e(\theta, I) = P(\theta, I) - 1 = \tfrac{1}{2} I^2 + \cos 2\pi\theta - 1 \]
and, using $\{\theta = 1/2\}$ as a reference section, the time function
\[ \tau(\theta, I) = \int_{1/2}^{\theta} \frac{d\vartheta}{\sqrt{2(e(\theta, I) - V(\vartheta))}} \]
where $V(\theta) = \cos 2\pi\theta - 1$. For positive energy $e > 0$, the period of motion is given by
\[ T(e) = \int_{-1/2}^{1/2} \frac{d\theta}{\sqrt{2(e - V(\theta))}} \]
and it is a decreasing function. Moreover, we have the equivalent
\[ T(e) \sim_0 -(2\pi)^{-1} \ln e. \tag{72} \]
For negative energy $e < 0$, if $\theta(e)$ is such that $\cos 2\pi\theta(e) = 1 + e$, then
\[ T(e) = \int_{-\theta(e)}^{\theta(e)} \frac{d\theta}{\sqrt{2(e - V(\theta))}}. \]
Therefore if we define
\[ E^* = \{(\tau, e) \in \mathbb{R}^2 \mid e > -2,\ |\tau| < \tfrac{1}{2} T(e)\} \]
we have a diffeomorphism
\[ \Psi : E \longrightarrow E^*, \qquad (\theta, I) \longmapsto (\tau, e) \]
and one can check that it is symplectic. Moreover, in those coordinates $(\tau, e)$ the flow of the pendulum is straightened out, that is
\[ \Phi^{tP}(\tau, e) = (\tau + t, e) \]
for $t$ small enough.
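The behaviour of the period stated in (72) can be explored numerically. The following is a rough sketch under stated assumptions: the function names are hypothetical, and the integral defining $T(e)$ for $e > 0$ is approximated with a plain midpoint rule, which is an arbitrary choice of quadrature. It returns $T(e)$ and the leading term $-(2\pi)^{-1} \ln e$ so that the two can be compared as $e$ decreases.

```python
import math

def period(e, samples=200_000):
    """Midpoint-rule approximation of T(e) = int_{-1/2}^{1/2} dtheta / sqrt(2(e - V(theta)))."""
    h = 1.0 / samples
    total = 0.0
    for i in range(samples):
        theta = -0.5 + (i + 0.5) * h
        v = math.cos(2.0 * math.pi * theta) - 1.0  # V(theta) = cos(2*pi*theta) - 1
        total += h / math.sqrt(2.0 * (e - v))
    return total

def leading_term(e):
    """Right-hand side of (72): -(2*pi)^{-1} * ln(e)."""
    return -math.log(e) / (2.0 * math.pi)
```

Since (72) is only an equivalence, the ratio `period(e) / leading_term(e)` is expected to approach 1 slowly (logarithmically) as `e` tends to 0, rather than to be close to 1 for moderate values of `e`.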



7.A.0.2. To conclude, for the proof of Proposition 7.2 in section 7.4.2 we shall need some estimates on the energy and the period of the periodic orbits (of positive energy) for the pendulum. For $n \in \mathbb{N}^*$, let $e_n$ be the energy of the $n$-periodic orbit for $\Phi^P$, that is $T(e_n) = n$. We have
\[ e_n \sim_{+\infty} \exp(-2\pi n). \tag{73} \]
Then if we define $T'_n = T'(e_n)$, $T''_n = T''(e_n)$, where $T'$ and $T''$ are the first and second derivatives of the period function, we can deduce from (72) and (73) that
\[ T'_n \sim_{+\infty} -(2\pi)^{-1} \exp(2\pi n) \tag{74} \]
and
\[ T''_n \sim_{+\infty} 2\pi (T'_n)^2 = (2\pi)^{-1} \exp(4\pi n). \tag{75} \]


8 Time of instability for high-dimensional Hamiltonian systems

In this chapter, we use a mechanism introduced by Herman, Marco and Sauzin to show<br />

that if a perturbation of a quasi-convex integrable Hamiltonian system is not too small<br />

with respect to the number of degrees of freedom, then the classical exponential stability<br />

estimates do not hold. Indeed, we construct an unstable solution whose drifting time<br />

is polynomial with respect to the inverse of the size of the perturbation. A different<br />

example was already given by Bourgain and Kaloshin, with a linear time of drift but<br />

with a perturbation which is larger than ours. As a consequence, we obtain a better<br />

upper bound on the threshold of validity of exponential stability estimates. This chapter<br />

follows [Bou10a].<br />

8.1 Introduction<br />

8.1.0.1. Consider a near-integrable Hamiltonian system of the form
\[ \begin{cases} H(\theta, I) = h(I) + f(\theta, I) \\ |f| < \varepsilon \end{cases} \]

with angle-action coordinates (θ, I) ∈ T n × R n , and where f is a small perturbation, of<br />

size ε, in some suitable topology defined by a norm | . |.<br />

If the system is analytic and h satisfies a generic condition, it is a remarkable result<br />

due to Nekhoroshev ([Nek77], [Nek79]) that the action variables are stable for an exponentially<br />

long interval of time with respect to the inverse of the size of the perturbation:<br />

one has<br />

\[ |I(t) - I_0| \le c_1 \varepsilon^b, \qquad |t| \le c_2 \exp(c_3 \varepsilon^{-a}), \]

for some positive constants c1, c2, c3, a, b and provided that the size of the perturbation<br />

ε is smaller than a threshold ε0. Of course, all these constants strongly depend on the<br />

number of degrees of freedom n, and when the latter goes to infinity, the threshold ε0<br />

and the exponent of stability a go to zero.<br />

More precisely, in the case where $h$ is quasi-convex and the system is analytic or even Gevrey, then we know that the exponent $a$ is of order $n^{-1}$ and this is essentially optimal (see [LN92], [Pös93], [MS02], [LM05], [Zha09] and the sixth chapter for more information on the optimality of the stability exponent).

8.1.0.2. This fact was used by Bourgain and Kaloshin in [BK05] to show that if the<br />

size of the perturbation is<br />

\[ \varepsilon_n \sim e^{-n}, \]

then there is no exponential stability: they constructed unstable solutions for which the<br />

time of drift is linear with respect to the inverse of the size of the perturbation, that is<br />

\[ |I(\tau_n) - I_0| \sim 1, \qquad \tau_n \sim \varepsilon_n^{-1}. \]

Their motivation was the implementation of stability estimates in the context of Hamiltonian partial differential equations, which requires understanding the relative dependence between the size of the perturbation and the number of degrees of freedom. Their result indicates that for infinite dimensional Hamiltonian systems, Nekhoroshev’s mechanism does not survive and that "fast diffusion" should prevail. Of course, in their example, one cannot simply take $n = \infty$ as the size of the perturbation $\varepsilon_n \sim e^{-n}$ goes to zero and the time of instability $\tau_n \sim e^{n}$ goes to infinity exponentially fast with respect to $n$. A more precise interpretation concerns the threshold of validity $\varepsilon_0$ in Nekhoroshev’s theorem: it has to satisfy

ε0


8.2 - Main result 195<br />

Let us recall that, given $\alpha \ge 1$ and $L > 0$, a function $f \in C^\infty(\mathbb{T}^n \times B)$ is $(\alpha, L)$-Gevrey if, using the standard multi-index notation,
\[ |f|_{\alpha, L} = \sum_{l \in \mathbb{N}^{2n}} L^{|l|\alpha} (l!)^{-\alpha} |\partial^l f|_{C^0(\mathbb{T}^n \times B)} < \infty. \]
The space of such functions, with the above norm, is a Banach space that we denote by $G^{\alpha, L}(\mathbb{T}^n \times B)$. One can see that analytic functions correspond exactly to $\alpha = 1$.

8.2.0.2. Now we can state our theorem.<br />

Theorem 8.1. Let $n \ge 3$, $R > 1$, $\alpha > 1$ and $L > 0$. Then there exist positive constants $c$, $\gamma$, $C$ and $n_0 \in \mathbb{N}^*$ depending only on $R$, $\alpha$ and $L$ such that for any $n \ge n_0$, the following holds: there exists a function $f_n \in G^{\alpha, L}(\mathbb{T}^n \times B)$ with $\varepsilon_n = |f_n|_{\alpha, L}$ satisfying
\[ e^{-2(n-2) \ln(4n \ln 2n)} \le \varepsilon_n \le c\, e^{-2(n-2) \ln(n \ln 2n)}, \]
such that the Hamiltonian system $H_n = h + f_n$ has an orbit $(\theta(t), I(t))$ for which the estimates
\[ |I(\tau_n) - I_0| \ge 1, \qquad \tau_n \le C \left( \frac{c}{\varepsilon_n} \right)^{n\gamma}, \]
hold true.

As we have already explained, this statement gives an upper bound on the threshold of applicability of Nekhoroshev’s estimates, which is an important issue when trying to use abstract stability results for "realistic" problems, for instance for the so-called planetary problem (see [Nie96]).

So let us consider the set of Gevrey quasi-convex integrable Hamiltonians $\mathcal{H} = \mathcal{H}(n, R, \alpha, L, M, m)$ defined as follows: $h \in \mathcal{H}$ if $h \in G^{\alpha, L}(B)$ and satisfies both
\[ \forall I \in B, \quad |\partial^k h(I)| \le M, \qquad 1 \le |k_1| + \cdots + |k_n| \le 3, \]
and
\[ \forall I \in B,\ \forall v \in \mathbb{R}^n, \quad \nabla h(I).v = 0 \Longrightarrow \nabla^2 h(I) v.v \ge m|v|^2. \]
From Nekhoroshev’s theorem (see [MS02] for a statement in Gevrey classes), we know that there exists a positive constant $\varepsilon_0(\mathcal{H}) = \varepsilon_0(n, R, \alpha, L, M, m)$ such that the following holds: for any $h \in \mathcal{H}$, there exist positive constants $c_1$, $c_2$, $c_3$, $a$ and $b$ such that if
\[ f \in G^{\alpha, L}(\mathbb{T}^n \times B), \qquad |f|_{\alpha, L} < \varepsilon_0(\mathcal{H}), \]
then any solution $(\theta(t), I(t))$ of the system $H = h + f$, with $I(0) \in B_{R/2}$, satisfies
\[ |I(t) - I_0| \le c_1 \varepsilon^b, \qquad |t| \le c_2 \exp(c_3 \varepsilon^{-a}). \]

Then we can state the following corollary of our Theorem 8.1.<br />

Corollary 8.2. With the previous notations, one has the upper bound
\[ \varepsilon_0(\mathcal{H}) < e^{-2(n-2) \ln(4n \ln 2n)}. \]



This improves the upper bound $\varepsilon_0(\mathcal{H}) < e^{-n}$ obtained in [BK05]. For concrete Hamiltonians like the planetary problem, the actual distance to the integrable system is slightly less than $10^{-3}$, hence the above corollary shows that Nekhoroshev’s estimates cannot be applied for $n > 3$.
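To get a feeling for the numbers in Corollary 8.2, the bound can simply be evaluated for small $n$. The snippet below is a minimal sketch; the function names are hypothetical and the value $10^{-3}$ is the rough size of the perturbation quoted above for the planetary problem.

```python
import math

def threshold_upper_bound(n):
    """Upper bound of Corollary 8.2: exp(-2 (n - 2) ln(4 n ln(2 n)))."""
    return math.exp(-2.0 * (n - 2) * math.log(4.0 * n * math.log(2.0 * n)))

def bourgain_kaloshin_bound(n):
    """Earlier upper bound exp(-n) quoted from [BK05]."""
    return math.exp(-n)

def first_excluded_dimension(perturbation_size=1e-3, n_max=50):
    """Smallest n >= 3 for which the upper bound drops below the given size."""
    for n in range(3, n_max + 1):
        if threshold_upper_bound(n) < perturbation_size:
            return n
    return None
```

With these definitions one can compare `threshold_upper_bound(n)` with `bourgain_kaloshin_bound(n)` for each `n`, and `first_excluded_dimension()` locates the first number of degrees of freedom at which a perturbation of size $10^{-3}$ exceeds the bound.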

8.2.0.3. Theorem 8.1 is easily obtained from the example in [MS02]: in fact, the construction is completely similar but one has to pay attention to the dependence with respect to $n$ of the various constants involved. Such a result can easily be obtained in the analytic category, using the more elaborate techniques developed in [LM05], but we will restrict to the Gevrey case in order to describe entirely this very simple mechanism of instability.

As the reader will see, we will use only rough estimates leading to the factor $n$ in the time of instability: this can be easily improved but we do not know if it is possible in our case (that is with a perturbation of size $e^{-n \ln(n \ln n)}$) to obtain a linear time of drift.

Finally, as in [MS02], we have restricted the perturbation to a compact subset of $\mathbb{A}^n = \mathbb{T}^n \times \mathbb{R}^n$ just in order to evaluate Gevrey norms. In fact, the Hamiltonian vector field generated by $H_n = h + f_n$ is complete and the unstable solution $(\theta(t), I(t))$ satisfies
\[ \lim_{t \to \pm\infty} I_1(t) = \pm\infty, \]
which means that it is bi-asymptotic to infinity.

8.2.0.4. In this text, we will have to deal with time-one maps associated to Hamiltonian flows. So given a function $H$, we will denote by $\Phi_t^H$ the time-$t$ map of its Hamiltonian flow and by $\Phi^H = \Phi_1^H$ the time-one map. We shall use the same notation for time-dependent functions $H$, that is $\Phi^H$ will be the time-one map of the Hamiltonian isotopy (the flow between $t = 0$ and $t = 1$) generated by $H$.

8.3 Proof of Theorem 8.1<br />

The proof of Theorem 8.1 is contained in section 8.3.2, but first in section 8.3.1 we recall<br />

the mechanism of instability presented in the paper [MS02] (see also [MS04]).<br />

This mechanism has two main features. The first one is that it deals with perturbation<br />

of integrable maps rather than perturbation of integrable flows, and then the<br />

latter is recovered by a suspension process. This point of view, which is only a technical matter, was already used for example in [Dou88] and offers more flexibility in the

construction. The second feature, which is the most important one, is that instead of<br />

trying to detect instability in a map close to integrable by means of the usual splitting<br />

estimates, we will start with a map having already "unstable" orbits and try to embed<br />

it in a near-integrable map. This will be realized through a "coupling lemma", which is<br />

really the heart of the mechanism.<br />

As we will see, the construction offers an easy and very efficient way of computing<br />

the drifting time of unstable solutions, therefore avoiding all the technicalities that are<br />

usually required for such a task.



8.3.1 The mechanism<br />

[Figure 9: Drifting point for the map $\psi_q$. The figure shows the circles $I = -q^{-1}$, $I = 0$ and $I = q^{-1}$, with the point moving from one circle to the next under $\psi_q$ and $\psi_q^{-1}$.]

8.3.1.1. Given a potential function $U : \mathbb{T} \to \mathbb{R}$, we consider the following family of maps $\psi_q : \mathbb{A} \to \mathbb{A}$ defined by
\[ \psi_q(\theta, I) = \left( \theta + qI,\ I - q^{-1} U'(\theta + qI) \right), \qquad (\theta, I) \in \mathbb{A}, \]
for $q \in \mathbb{N}^*$. If we require $U'(0) = -1$, for example if we choose
\[ U(\theta) = -(2\pi)^{-1} \sin 2\pi\theta, \qquad U'(\theta) = -\cos(2\pi\theta), \]
then it is easy to see that $\psi_q(0, 0) = (0, q^{-1})$ and by induction
\[ \psi_q^k(0, 0) = (0, kq^{-1}) \tag{76} \]
for any $k \in \mathbb{Z}$ (see figure 9). After $q$ iterations, the point $(0, 0)$ drifts from the circle $I = 0$ to the circle $I = 1$ and it is bi-asymptotic to infinity, in the sense that the sequence $\left( \psi_q^k(0, 0) \right)_{k \in \mathbb{Z}}$ is not contained in any semi-infinite annulus of $\mathbb{A}$. Clearly these maps are exact-symplectic, but obviously they have no invariant circles and so they cannot be "close to integrable". However, we will use the fact that they can be written as a composition of time-one maps,
\[ \psi_q = \Phi^{q^{-1} U} \circ \left( \Phi^{\frac{1}{2} I^2} \circ \cdots \circ \Phi^{\frac{1}{2} I^2} \right) = \Phi^{q^{-1} U} \circ \left( \Phi^{\frac{1}{2} I^2} \right)^q, \tag{77} \]
to embed $\psi_q$ in the $q$-th iterate of a near-integrable map of $\mathbb{A}^n$, for $n \ge 2$. To do so, we will use the following "coupling lemma", which is easy but very clever.

Lemma 8.3 (Herman-Marco-Sauzin). Let $m, m' \ge 1$, $F : \mathbb{A}^m \to \mathbb{A}^m$ and $G : \mathbb{A}^{m'} \to \mathbb{A}^{m'}$ two maps, and $f : \mathbb{A}^m \to \mathbb{R}$ and $g : \mathbb{A}^{m'} \to \mathbb{R}$ two Hamiltonian functions generating complete vector fields. Suppose there is a point $a \in \mathbb{A}^{m'}$ which is $q$-periodic for $G$ and such that the following "synchronisation" conditions hold:
\[ g(a) = 1, \quad dg(a) = 0, \quad g(G^k(a)) = 0, \quad dg(G^k(a)) = 0, \tag{S} \]
for $1 \le k \le q - 1$. Then the mapping
\[ \Psi = \Phi^{f \otimes g} \circ (F \times G) : \mathbb{A}^{m + m'} \longrightarrow \mathbb{A}^{m + m'} \]
is well-defined and for all $x \in \mathbb{A}^m$,
\[ \Psi^q(x, a) = (\Phi^f \circ F^q(x), a). \]
The product of functions acting on separate variables was denoted by $\otimes$, i.e.
\[ f \otimes g(x, x') = f(x) g(x'), \qquad x \in \mathbb{A}^m,\ x' \in \mathbb{A}^{m'}. \]

Let us give an elementary proof of this lemma since it is a crucial ingredient.<br />

Proof. First note that since the Hamiltonian vector fields $X_f$ and $X_g$ are complete, an easy calculation shows that for all $x \in \mathbb{A}^m$, $x' \in \mathbb{A}^{m'}$ and $t \in \mathbb{R}$, one has
\[ \Phi_t^{f \otimes g}(x, x') = \left( \Phi_t^{g(x') f}(x),\ \Phi_t^{f(x) g}(x') \right) \tag{78} \]
and therefore $X_{f \otimes g}$ is also complete. Using the above formula and condition (S), the points $(F^k(x), G^k(a))$, for $1 \le k \le q - 1$, are fixed by $\Phi^{f \otimes g}$ and hence
\[ \Psi^{q-1}(x, a) = (F^{q-1}(x), G^{q-1}(a)). \]
Since $a$ is $q$-periodic for $G$ this gives
\[ \Psi^q(x, a) = \Phi^{f \otimes g}(F^q(x), a), \]
and we end up with
\[ \Psi^q(x, a) = (\Phi^f(F^q(x)), a) \]
using once again (S) and (78).

Therefore, if we set $m = 1$, $F = \Phi^{\frac{1}{2} I_1^2}$ and $f = q^{-1} U$ in the coupling lemma, the $q$-th iterate $\Psi^q$ will leave the submanifold $\mathbb{A} \times \{a\}$ invariant, and its restriction to this annulus will coincide with our "unstable map" $\psi_q$. Hence, after $q^2$ iterations of $\Psi$, the $I_1$-component of the point $((0, 0), a) \in \mathbb{A}^2$ will move from 0 to 1.
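The drift (76) of the unstable map $\psi_q$, which the coupling lemma embeds into $\Psi^q$, is easy to observe numerically. The sketch below is an illustration only: the function names are hypothetical, the potential is the example $U(\theta) = -(2\pi)^{-1} \sin 2\pi\theta$ given in the text, and the angle is reduced modulo 1 in floating point.

```python
import math

def psi(q, theta, action):
    """One step of psi_q(theta, I) = (theta + q I, I - U'(theta + q I) / q),
    with U'(theta) = -cos(2 pi theta)."""
    theta_new = (theta + q * action) % 1.0
    action_new = action + math.cos(2.0 * math.pi * theta_new) / q
    return theta_new, action_new

def drift(q, steps):
    """Iterate psi_q `steps` times starting from the point (0, 0)."""
    theta, action = 0.0, 0.0
    for _ in range(steps):
        theta, action = psi(q, theta, action)
    return theta, action
```

For example, `drift(5, 5)` follows the point $(0, 0)$ through the circles $I = k/5$ up to $I = 1$, in agreement with (76) up to rounding errors.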

8.3.1.2. The difficult part is then to find what kind of dynamics we can put on the<br />

second factor to apply this coupling lemma. In order to have a continuous system with<br />

n degrees of freedom at the end, we may already choose m ′ = n − 2 so the coupling<br />

lemma will give us a discrete system with n − 1 degrees of freedom.<br />

First, a natural attempt would be to try
\[ G = G_n = \Phi^{\frac{1}{2} I_2^2 + \cdots + \frac{1}{2} I_{n-1}^2}. \]
Indeed, in this case
\[ F \times G_n = \Phi^{\frac{1}{2} I_1^2 + \frac{1}{2} I_2^2 + \cdots + \frac{1}{2} I_{n-1}^2} = \Phi^h \]
and the unstable map $\Psi$ given by the coupling lemma appears as a perturbation of the form $\Psi = \Phi^u \circ \Phi^h$, with $u = f \otimes g$. However, this cannot work. Indeed, for $j \in \{2, \ldots, n - 1\}$, one can choose a $p_j$-periodic point $a^{(j)} \in \mathbb{A}$ for the map $\Phi^{\frac{1}{2} I_j^2}$, and then setting
\[ a_n = (a^{(2)}, \ldots, a^{(n-1)}) \in \mathbb{A}^{n-2}, \qquad q_n = p_2 \cdots p_{n-1}, \]

the point an is qn-periodic for Gn provided that the numbers pj are mutually prime.<br />

One can easily see that the latter condition will force the product qn to converge to<br />

infinity when n goes to infinity. So necessarily the point an gets arbitrarily close to its<br />

first iterate Gn(an) when n (and therefore qn) is large: this is because qn-periodic points<br />

for Gn are equi-distributed on qn-periodic tori. As a consequence, a function gn with<br />

the property<br />

gn(an) = 1, gn(Gn(an)) = 0,<br />

will necessarily have very large derivatives at an if qn is large. Then as the size of the<br />

perturbation is essentially given by<br />

\[ |f \otimes g_n| = |q_n^{-1} U \otimes g_n| = |q_n^{-1}| |g_n|, \]

one can check that it is impossible to make this quantity converge to zero when the<br />

number of degrees of freedom n goes to infinity.<br />

8.3.1.3. As in [MS02], the idea to overcome this problem is the following one. We<br />

introduce a new sequence of "large" parameters Nn ∈ N ∗ and in the second factor we<br />

consider a family of suitably rescaled pendulums on $\mathbb{A}$ given by
\[ P_n(\theta_2, I_2) = \tfrac{1}{2} I_2^2 + N_n^{-2} V(\theta_2), \]
where $V(\theta) = -\cos 2\pi\theta$. The other factors remain unchanged, so
\[ G_n = \Phi^{\frac{1}{2}(I_2^2 + I_3^2 + \cdots + I_{n-1}^2) + N_n^{-2} V(\theta_2)}. \]
In this case, the map $\Psi$ given by the coupling lemma is also a perturbation of $\Phi^h$ but of the form $\Psi = \Phi^u \circ \Phi^{h+v}$, with $v = N_n^{-2} V$. But for the map $G_n$, due to the presence of the pendulum factor, it is now possible to find a periodic orbit with an irregular distribution: more precisely, a $q_n$-periodic point $a_n$ such that its distance to the rest of its orbit is of order $N_n^{-1}$, no matter how large $q_n$ is.

8.3.1.4. Let us denote by $(p_j)_{j \ge 0}$ the ordered sequence of prime numbers and let us choose $N_n$ as the product of the $n - 2$ prime numbers $\{p_{n+3}, \ldots, p_{2n}\}$, that is
\[ N_n = p_{n+3} p_{n+4} \cdots p_{2n} \in \mathbb{N}^*. \]
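The size of the parameter $N_n$ can be made concrete with a few lines of code. This is a small sketch with hypothetical function names; it assumes that the indexing $(p_j)_{j \ge 0}$ starts at $p_0 = 2$, which is an interpretation of the notation and not something stated explicitly.

```python
def first_primes(count):
    """Return the first `count` primes by trial division (p_0 = 2 is assumed)."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes

def big_parameter(n):
    """N_n = p_{n+3} * p_{n+4} * ... * p_{2n}, a product of n - 2 primes."""
    primes = first_primes(2 * n + 1)  # p_0, ..., p_{2n}
    value = 1
    for p in primes[n + 3 : 2 * n + 1]:
        value *= p
    return value
```

Under this indexing convention, `big_parameter(3)` is the single prime $p_6$ and `big_parameter(4)` is the product $p_7 p_8$; the values grow very quickly with $n$.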

Our goal is to prove the following proposition.<br />

Proposition 8.1. Let $n \ge 3$, $\alpha > 1$ and $L_1 > 0$. Then there exist a function $g_n \in G^{\alpha, L_1}(\mathbb{A}^{n-2})$, a point $a_n \in \mathbb{A}^{n-2}$ and positive constants $c_1$ and $c_2$ depending only on $\alpha$ and $L_1$ such that if
\[ M_n = 2 \left[ c_1 N_n e^{c_2 (n-2) p_{2n}^{1/\alpha}} \right], \qquad q_n = N_n M_n, \]
then $a_n$ is $q_n$-periodic for $G_n$ and $(g_n, G_n, a_n, q_n)$ satisfy the synchronization conditions (S):
\[ g_n(a_n) = 1, \quad dg_n(a_n) = 0, \quad g_n(G_n^k(a_n)) = 0, \quad dg_n(G_n^k(a_n)) = 0, \]
for $1 \le k \le q_n - 1$. Moreover, the estimate
\[ q_n^{-1} |g_n|_{\alpha, L_1} \le N_n^{-2}, \tag{79} \]
holds true.

[Figure 10: The point $b^M$ and its iterates. The horizontal axis runs from $-\frac{1}{2}$ to $\frac{1}{2}$, with the interval $[-\sigma, \sigma]$ marked.]

The rest of this section is devoted to the proof of the above proposition. Note that<br />

together with the coupling lemma (Lemma 8.3), this proposition easily gives a result<br />

of instability (analogous to Proposition 2.1 in [MS02]) for a discrete system which is a<br />

perturbation of the map Φ h .<br />

8.3.1.5. We first consider the simple pendulum<br />

$$P(\theta, I) = \frac{1}{2} I^2 + V(\theta), \qquad (\theta, I) \in \mathbb{A}.$$<br />

With our convention, the stable equilibrium is at (0, 0) and the unstable one is at<br />

(1/2, 0). Given any $M \in \mathbb{N}^*$, there is a unique point $b^M = (0, I_M)$ which is M-periodic<br />

for $\Phi^P$ (this is just the intersection between the vertical line {0} × R and the closed<br />

orbit for the pendulum of period M). One can check that $I_M \in \,]2, 3[$ and as M goes to<br />

infinity, $(0, I_M)$ tends to the point (0, 2) which belongs to the upper separatrix. Since<br />

$P_n(\theta, I) = \frac{1}{2} I^2 + N_n^{-2} V(\theta)$, one can see that<br />

$$\Phi^{P_n} = (S_n)^{-1} \circ \Phi^{N_n^{-1} P} \circ S_n,$$<br />

where $S_n(\theta, I) = (\theta, N_n I)$ is the rescaling by $N_n$ in the action components. Therefore<br />

the point $b_n^M = (0, N_n^{-1} I_M)$ is $q_n$-periodic for $\Phi^{P_n}$, for $q_n = N_n M$. Let $(\Phi_t^P)_{t \in \mathbb{R}}$ be the<br />

flow of the pendulum, and<br />

$$\Phi_t^P(0, I_M) = (\theta_M(t), I_M(t)).$$<br />

The function θM(t) is analytic. The crucial observation is the following simple property<br />

of the pendulum (see Lemma 2.2 in [MS02] for a proof).<br />

Lemma 8.4. Let $\sigma = -\frac{1}{2} + \frac{2}{\pi} \arctan e^{\pi} < \frac{1}{2}$. For any $M \in \mathbb{N}^*$,<br />

$$\theta_M(t) \notin [-\sigma, \sigma],$$<br />

for $t \in [1/2, M - 1/2]$.<br />



Hence no matter how large M is, most of the points of the orbit of b M ∈ A will be<br />

outside the set {−σ ≤ θ ≤ σ} × R (see figure 10). The construction of a function that<br />

vanishes, as well as its first derivative, at these points, will be easily arranged by means<br />

of a function, depending only on the angle variables, with support in {−σ ≤ θ ≤ σ}.<br />
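The value of σ in Lemma 8.4 can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the travel-time integral obtained from energy conservation, $\int_0^\theta d\varphi / \sqrt{I^2 - 4\sin^2 \pi\varphi}$, for the orbit through $(0, I)$, and the closed form of that integral on the separatrix ($I = 2$). It suggests that σ is exactly the angle reached at time 1/2 along the separatrix, and that any faster orbit ($I > 2$) reaches σ earlier. The function names are hypothetical.

```python
import math

# sigma as defined in Lemma 8.4
SIGMA = -0.5 + (2.0 / math.pi) * math.atan(math.exp(math.pi))


def travel_time(theta, action, steps=4000):
    """Time for the pendulum orbit through (0, action) to reach the angle theta.

    Composite Simpson rule for the integral of 1 / sqrt(action^2 - 4 sin^2(pi phi))
    over [0, theta]. Assumes action >= 2, 0 <= theta < 1/2 and an even `steps`.
    """
    def integrand(phi):
        return 1.0 / math.sqrt(action ** 2 - 4.0 * math.sin(math.pi * phi) ** 2)

    h = theta / steps
    total = integrand(0.0) + integrand(theta)
    for j in range(1, steps):
        total += (4.0 if j % 2 else 2.0) * integrand(j * h)
    return total * h / 3.0


def separatrix_time(theta):
    """Closed form of travel_time(theta, 2.0): ln(tan(pi/4 + pi*theta/2)) / (2*pi)."""
    return math.log(math.tan(math.pi / 4 + math.pi * theta / 2)) / (2 * math.pi)


if __name__ == "__main__":
    print("sigma            =", SIGMA)
    print("separatrix time  =", separatrix_time(SIGMA))
    print("time for I = 2.5 =", travel_time(SIGMA, 2.5))
```

Since every periodic point $b^M$ has $I_M > 2$, this is consistent with the statement that the orbit has already left the strip $\{-\sigma \leq \theta \leq \sigma\}$ by time 1/2.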

As for the other points, it is convenient to introduce the function<br />

τM : [−σ, σ] −→ ] − 1/2, 1/2[<br />

which is the analytic inverse of θM. One can give an explicit formula for this map:<br />

$$\tau_M(\theta) = \int_0^{\theta} \frac{d\varphi}{\sqrt{I_M^2 - 4 \sin^2 \pi\varphi}}.$$<br />

In particular, it is analytic and therefore it belongs to G α,L1 ([−σ, σ]) for α ≥ 1 and<br />

L1 > 0, and one can obtain the following estimate (see Lemma 2.3 in [MS02] for a<br />

proof).<br />

Lemma 8.5. For α > 1 and L1 > 0,<br />

$$\Lambda = \sup_{M \in \mathbb{N}^*} |\tau_M|_{\alpha,L_1} < +\infty.$$<br />

Note that Λ depends only on α and L1. Under the action of τM, the points of the<br />

orbit of b M whose projection onto T belongs to {−σ ≤ θ ≤ σ} get equi-distributed, and<br />

we can use the following elementary lemma.<br />

Lemma 8.6. For $p \in \mathbb{N}^*$, the analytic function $\eta_p : \mathbb{T} \to \mathbb{R}$ defined by<br />

$$\eta_p(\theta) = \left( \frac{1}{p} \sum_{l=0}^{p-1} \cos 2\pi l \theta \right)^2$$<br />

satisfies<br />

$$\eta_p(0) = 1, \quad \eta_p'(0) = 0, \quad \eta_p(k/p) = \eta_p'(k/p) = 0,$$<br />

for 1 ≤ k ≤ p − 1, and<br />

$$|\eta_p|_{\alpha,L_1} \leq e^{2\alpha L_1 (2\pi p)^{1/\alpha}}.$$<br />

The proof is trivial (see [MS02], Lemma 2.4).<br />
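The vanishing properties in Lemma 8.6 are easy to confirm numerically. The following Python sketch (an illustration only, with hypothetical function names) evaluates the squared mean of cosines and its derivative, obtained by the chain rule, at the points k/p.

```python
import math


def eta(p, theta):
    """Squared mean of cos(2*pi*l*theta) for l = 0, ..., p-1."""
    mean = sum(math.cos(2 * math.pi * l * theta) for l in range(p)) / p
    return mean * mean


def eta_prime(p, theta):
    """Derivative of eta(p, .) at theta, computed via the chain rule."""
    mean = sum(math.cos(2 * math.pi * l * theta) for l in range(p)) / p
    d_mean = sum(-2 * math.pi * l * math.sin(2 * math.pi * l * theta) for l in range(p)) / p
    return 2 * mean * d_mean


if __name__ == "__main__":
    p = 7
    print("eta(0), eta'(0):", eta(p, 0.0), eta_prime(p, 0.0))
    for k in range(1, p):
        print(k, eta(p, k / p), eta_prime(p, k / p))
```

The values at k/p are zero up to floating-point rounding: the inner sum is the real part of a sum of p-th roots of unity, and squaring it makes the derivative vanish as well.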

8.3.1.6. We can now pass to the proof of Proposition 8.1.<br />

Proof of Proposition 8.1. For α > 1 and L1 > 0, consider the bump function ϕα,L1 ∈<br />

G α,L1 (T) given by Lemma 8.8 (see Appendix 8.A).<br />

We choose our function $g_n \in G^{\alpha,L_1}(\mathbb{A}^{n-2})$, depending only on the angle variables, of<br />

the form<br />

$$g_n = g_n^{(2)} \otimes \cdots \otimes g_n^{(n-1)},$$<br />

where<br />

$$g_n^{(2)}(\theta_2) = \eta_{p_{n+3}}(\tau_{M_n}(\theta_2)) \, \varphi_{\alpha,(4\sigma)^{-1/\alpha} L_1}((4\sigma)^{-1} \theta_2),$$<br />

and<br />

$$g_n^{(i)}(\theta_i) = \eta_{p_{n+1+i}}(\theta_i), \qquad 3 \leq i \leq n-1.$$<br />

Let us write<br />

$$c_1 = \left| \varphi_{\alpha,(4\sigma)^{-1/\alpha} L_1} \right|_{\alpha,(4\sigma)^{-1/\alpha} L_1}.$$<br />

Now we choose our point $a_n = (a_n^{(2)}, \dots, a_n^{(n-1)}) \in \mathbb{A}^{n-2}$. We set<br />

$$a_n^{(2)} = b_n^{M_n} = (0, N_n^{-1} I_{M_n}),$$<br />

and<br />

$$a_n^{(i)} = (0, p_{n+1+i}^{-1}), \qquad 3 \leq i \leq n-1.$$<br />

Let us prove that an is qn-periodic for Gn. We can write<br />

$$G_n = \Phi^{\frac{1}{2} I_2^2 + N_n^{-2} V(\theta_2)} \times \Phi^{\frac{1}{2}(I_3^2 + I_4^2 + \cdots + I_{n-1}^2)} = \Phi^{P_n} \times G.$$<br />

Since $p_{n+4}, \dots, p_{2n}$ are mutually prime, the point $(a_n^{(3)}, \dots, a_n^{(n-1)}) \in \mathbb{A}^{n-3}$ is periodic for<br />

G, with period<br />

$$N'_n = p_{n+4} \cdots p_{2n}.$$<br />

By construction, the point $a_n^{(2)} = b_n^{M_n} \in \mathbb{A}$ is periodic for $\Phi^{P_n}$, with period $q_n = N_n M_n$,<br />

where<br />

$$N_n = p_{n+3} p_{n+4} \cdots p_{2n}.$$<br />

This means that an is periodic for the product map Gn, and the exact period is given<br />

by the least common multiple of qn and N ′ n . Since N ′ n divides qn, the period of an is qn.<br />
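The period computation can be illustrated with exact rational arithmetic. The sketch below assumes the standard fact that the time-one map of $\frac{1}{2} I^2$ is the twist $(\theta, I) \mapsto (\theta + I \bmod 1, I)$, so a point $(0, 1/p)$ has period p. It uses the primes 29 and 31 (the case n = 5 under the convention $p_0 = 2$) and an arbitrary illustrative value 6 in place of $M_n$; the helper names are hypothetical.

```python
from fractions import Fraction
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def orbit_period(actions):
    """Period of the point with all angles 0 under the product of twist maps
    (theta, I) -> (theta + I mod 1, I), one factor per entry of `actions`."""
    angles = [Fraction(0)] * len(actions)
    steps = 0
    while True:
        steps += 1
        angles = [(t + a) % 1 for t, a in zip(angles, actions)]
        if all(t == 0 for t in angles):
            return steps


if __name__ == "__main__":
    n_prime = orbit_period([Fraction(1, 29), Fraction(1, 31)])
    print("period of the free factors:", n_prime)      # 29 * 31
    q = 23 * 29 * 31 * 6                                # N_5 times an illustrative M
    print("lcm(q, N') == q:", lcm(q, n_prime) == q)
```

Because $N'_n$ divides $N_n$, and hence $q_n$, the least common multiple collapses to $q_n$, which is the point of the argument above.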

Now let us show that the synchronization conditions (S) hold true, that is<br />

$$g_n(a_n) = 1, \quad dg_n(a_n) = 0, \quad g_n(G_n^k(a_n)) = 0, \quad dg_n(G_n^k(a_n)) = 0,$$<br />

for 1 ≤ k ≤ qn − 1. Since $\varphi_{\alpha,L_1}(0) = 1$, then<br />

$$g_n(a_n) = g_n^{(2)}(0) \cdots g_n^{(n-1)}(0) = 1$$<br />

and as $\varphi'_{\alpha,L_1}(0) = 0$, then<br />

$$dg_n(a_n) = 0.$$<br />

To prove the other conditions, let us write $G_n^k(a_n) = (\theta_k, I_k) \in \mathbb{A}^{n-2}$, for 1 ≤ k ≤ qn − 1.<br />

If $\theta_k^{(2)}$ does not belong to $]-\sigma, \sigma[$, then $g_n^{(2)}$ and its first derivative vanish at $\theta_k^{(2)}$<br />

because it is the case for $\varphi_{\alpha,(4\sigma)^{-1/\alpha} L_1}$, so<br />

$$g_n(\theta_k) = dg_n(\theta_k) = 0.$$<br />

Otherwise, if $-\sigma < \theta_k^{(2)} < \sigma$, one can easily check that (k being understood modulo $q_n$)<br />

$$-\frac{N_n - 1}{2} \leq k \leq \frac{N_n - 1}{2}$$<br />

and therefore<br />

$$\tau_{M_n}(\theta_k^{(2)}) = \frac{k}{N_n},$$<br />

while<br />

$$\theta_k^{(i)} = \frac{k}{p_{n+i+1}}, \qquad 3 \leq i \leq n-1.$$<br />

If $N'_n = p_{n+4} \cdots p_{2n}$ divides k, that is $k = k' N'_n$ for some $k' \in \mathbb{Z}$, then<br />

$$\tau_{M_n}(\theta_k^{(2)}) = \frac{k}{N_n} = \frac{k'}{p_{n+3}}$$<br />

and therefore, by Lemma 8.6, $\eta_{p_{n+3}} \circ \tau_{M_n}$ vanishes with its differential at $\theta_k^{(2)}$, and so does<br />

$g_n^{(2)}$. Otherwise, $N'_n$ does not divide k and then, for 3 ≤ i ≤ n − 1, at least one of the<br />

functions $\eta_{p_{n+1+i}}$ vanishes with its differential at $\theta_k^{(i)}$, and so does $g_n^{(i)}$. Hence in any case<br />

$$g_n(\theta_k) = dg_n(\theta_k) = 0, \qquad 1 \leq k \leq q_n - 1,$$<br />

and the synchronization conditions (S) are satisfied.<br />
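The arithmetic behind the synchronization conditions can be tested on a small case. The sketch below is a simplified check under stated assumptions: it ignores the bump function and the map $\tau_{M_n}$, and only evaluates the product of the functions η at the values $k/N_n$ and $k/p$ that appear in the proof, for $0 < k \leq (N_n - 1)/2$. It uses the primes 23, 29, 31 (the case n = 5 with $p_0 = 2$); the function names are hypothetical.

```python
import math


def eta(p, theta):
    """Squared mean of cos(2*pi*l*theta) for l = 0, ..., p-1 (as in Lemma 8.6)."""
    mean = sum(math.cos(2 * math.pi * l * theta) for l in range(p)) / p
    return mean * mean


def angle_product(k, first_prime, other_primes):
    """eta_{first_prime}(k / N) times the product of eta_p(k / p) over other_primes,
    where N is the product of all the primes involved."""
    n_big = first_prime * math.prod(other_primes)
    value = eta(first_prime, k / n_big)
    for p in other_primes:
        value *= eta(p, k / p)
    return value


if __name__ == "__main__":
    first, others = 23, [29, 31]
    half = (first * math.prod(others) - 1) // 2
    worst = max(angle_product(k, first, others) for k in range(1, half + 1))
    print("value at k = 0:", angle_product(0, first, others))
    print("largest value for 0 < k <= (N-1)/2:", worst)
```

Either one of the primes 29, 31 fails to divide k, and the corresponding factor vanishes, or k is a multiple of 29·31 that is too small to be a multiple of 23, and the first factor vanishes.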

Now it remains to estimate the norm of the function $g_n$. First, using Lemma 8.9,<br />

one finds<br />

$$|g_n|_{\alpha,L_1} \leq \left| \varphi_{\alpha,(4\sigma)^{-1/\alpha} L_1} \right|_{\alpha,(4\sigma)^{-1/\alpha} L_1} |\eta_{p_{n+3}} \circ \tau_{M_n}|_{\alpha,L_1} |\eta_{p_{n+4}}|_{\alpha,L_1} \cdots |\eta_{p_{2n}}|_{\alpha,L_1},$$<br />

which by definition of $c_1$ gives<br />

$$|g_n|_{\alpha,L_1} \leq c_1 |\eta_{p_{n+3}} \circ \tau_{M_n}|_{\alpha,L_1} |\eta_{p_{n+4}}|_{\alpha,L_1} \cdots |\eta_{p_{2n}}|_{\alpha,L_1}.$$<br />

Then, by definition of Λ (Lemma 8.5) and using Lemma 8.10 (with $\Lambda_1 = \Lambda^{1/\alpha}$),<br />

$$|\eta_{p_{n+3}} \circ \tau_{M_n}|_{\alpha,L_1} \leq |\eta_{p_{n+3}}|_{\alpha,\Lambda^{1/\alpha}},$$<br />

so using Lemma 8.6 and setting $c_2 = 2\alpha \sup\{\Lambda^{1/\alpha}, L_1\} (2\pi)^{1/\alpha}$, this gives<br />

$$|g_n|_{\alpha,L_1} \leq c_1 e^{2\alpha (\Lambda^{1/\alpha} + (n-3) L_1)(2\pi p_{2n})^{1/\alpha}} \leq c_1 e^{2\alpha \sup\{\Lambda^{1/\alpha}, L_1\} (n-2)(2\pi p_{2n})^{1/\alpha}} \leq c_1 e^{c_2 (n-2) p_{2n}^{1/\alpha}}.$$<br />

Finally, by definition of $M_n$ we obtain<br />

$$|g_n|_{\alpha,L_1} \leq M_n N_n^{-1},$$<br />

and as $q_n = N_n M_n$, we end up with<br />

$$q_n^{-1} |g_n|_{\alpha,L_1} \leq N_n^{-2}.$$<br />

This concludes the proof.<br />



8.3.2 Proof of Theorem 8.1<br />

8.3.2.1. In the previous section, we were concerned with a perturbation of the integrable<br />

diffeomorphism Φ h , which can be written as Φ u ◦ Φ h+v . So now we will briefly<br />

describe a suspension argument to go from this discrete case to a continuous case (we<br />

refer once again to [MS02] for the details).<br />

Here we will make use of bump functions; however, the process is still valid, though<br />

more difficult, in the analytic category (see for example [Dou88] or [KP94]). The basic<br />

idea is to find a time-dependent Hamiltonian function on A n such that the time-one<br />

map of its isotopy is Φ u ◦ Φ h+v , or, equivalently, an autonomous Hamiltonian function<br />

on A n+1 such that its first return map to some 2n-dimensional Poincaré section coincides<br />

with our map Φ u ◦ Φ h+v .<br />

Given α > 1 and L > 1, let us define the function<br />

$$\phi_{\alpha,L} = \left( \int_{\mathbb{T}} \varphi_{\alpha,L} \right)^{-1} \varphi_{\alpha,L},$$<br />

where $\varphi_{\alpha,L}$ is the bump function given by Lemma 8.8. If $\phi_0(t) = \phi_{\alpha,L}\left(t - \frac{1}{4}\right)$ and<br />

$\phi_1(t) = \phi_{\alpha,L}\left(t - \frac{3}{4}\right)$, the time-dependent Hamiltonian<br />

$$H^*(\theta, I, t) = (h(I) + v(\theta)) \otimes \phi_0(t) + u(\theta) \otimes \phi_1(t)$$<br />

clearly satisfies<br />

$$\Phi^{H^*} = \Phi^u \circ \Phi^{h+v}.$$<br />

But as u and v go to zero, H ∗ converges to h⊗φ0 rather than h. However, using classical<br />

generating functions, it is not difficult to modify the Hamiltonian in order to prove the<br />

following lemma (see Lemma 2.5 in [MS02]).<br />

Lemma 8.7. Let n ≥ 1, R > 1, α > 1, L1 > 0 and L > 0 satisfying<br />

$$L_1^{\alpha} = L^{\alpha} \left(1 + (L^{\alpha} + R + 1/2) |\phi_{\alpha,L}|_{\alpha,L}\right). \qquad (80)$$<br />

If $u, v \in G^{\alpha,L_1}(\mathbb{T}^{n-1})$, there exists $f \in G^{\alpha,L}(\mathbb{T}^n \times B)$, independent of the variable $I_n$,<br />

such that if<br />

$$H(\theta, I) = \frac{1}{2}(I_1^2 + \cdots + I_{n-1}^2) + I_n + f(\theta, I), \qquad (\theta, I) \in \mathbb{A}^n,$$<br />

then for any energy $e \in \mathbb{R}$, the Poincaré map induced by the Hamiltonian flow of H on the<br />

section $\{\theta_n = 0\} \cap H^{-1}(e)$ coincides with the diffeomorphism<br />

$$\Phi^u \circ \Phi^{h+v}.$$<br />

Moreover, one has<br />

$$\sup\{|u|_{C^0}, |v|_{C^0}\} \leq |f|_{\alpha,L} \leq c_3 \sup\{|u|_{\alpha,L_1}, |v|_{\alpha,L_1}\}, \qquad (81)$$<br />

where $c_3 = 2|\phi_{\alpha,L}|_{\alpha,L}$ depends only on α and L.<br />

8.3.2.2. Now we can finally prove our theorem.



Proof of Theorem 8.1. Let R > 1, α > 1 and L > 0, and choose L1 satisfying the<br />

relation (80). The constants c1 and c2 of Proposition 8.1 depend only on α and L1,<br />

hence they depend only on R, α and L.<br />

We can define $u_n, v_n \in G^{\alpha,L_1}(\mathbb{T}^{n-1})$ by<br />

$$u_n = q_n^{-1} U \otimes g_n, \qquad v_n = N_n^{-2} V,$$<br />

where $U(\theta_1) = -(2\pi)^{-1} \sin 2\pi\theta_1$, $V(\theta_2) = -\cos 2\pi\theta_2$ (so $v_n$ is formally defined on $\mathbb{T}$ but<br />

we identify it with a function on $\mathbb{T}^{n-1}$) and $g_n$ is the function given by Proposition 8.1.<br />

Let us apply Lemma 8.7: there exists $f_n \in G^{\alpha,L}(\mathbb{T}^n \times B)$, independent of the variable<br />

$I_n$, such that if<br />

$$H_n(\theta, I) = \frac{1}{2}(I_1^2 + \cdots + I_{n-1}^2) + I_n + f_n(\theta, I), \qquad (\theta, I) \in \mathbb{A}^n,$$<br />

then for any energy $e \in \mathbb{R}$, the Poincaré map induced by the Hamiltonian flow of $H_n$ on the<br />

section $\{\theta_n = 0\} \cap H_n^{-1}(e)$ coincides with the diffeomorphism<br />

$$\Phi^{u_n} \circ \Phi^{\frac{1}{2}(I_1^2 + \cdots + I_{n-1}^2) + v_n} = \Phi^{u_n} \circ \Phi^{h + v_n}.$$<br />

Let us show that our system $H_n$ has a drifting orbit. First consider its Poincaré map,<br />

defined by<br />

$$\Psi_n = \Phi^{u_n} \circ \Phi^{\frac{1}{2}(I_1^2 + \cdots + I_{n-1}^2) + v_n} = \Phi^{f_n \otimes g_n} \circ (F \times G_n),$$<br />

with<br />

$$f_n = q_n^{-1} U, \qquad F = \Phi^{\frac{1}{2} I_1^2}, \qquad G_n = \Phi^{\frac{1}{2}(I_2^2 + I_3^2 + \cdots + I_{n-1}^2) + N_n^{-2} V(\theta_2)}.$$<br />

By Proposition 8.1, we can apply the coupling lemma (Lemma 8.3), so<br />

$$\Psi_n^{q_n}((0, 0), a_n) = (\Phi^{f_n} \circ F^{q_n}(0, 0), a_n).$$<br />

Then, using (77), observe that<br />

$$\Phi^{f_n} \circ F^{q_n} = \Phi^{q_n^{-1} U} \circ \left( \Phi^{\frac{1}{2} I_1^2} \right)^{q_n} = \psi_{q_n},$$<br />

so<br />

$$\Psi_n^{q_n^2}((0, 0), a_n) = ((\Phi^{f_n} \circ F^{q_n})^{q_n}(0, 0), a_n) = (\psi_{q_n}^{q_n}(0, 0), a_n) = ((0, 1), a_n),$$<br />

where the last equality follows from (76). Hence, after $q_n^2$ iterations, the $I_1$-component<br />

of the point $x_n = ((0, 0), a_n) \in \mathbb{A}^{n-1}$ drifts from 0 to 1. Then, for the continuous system,<br />

the initial condition $(x_n, t = 0, I_n = 0)$ in $\mathbb{A}^n$ gives rise to a solution $(x(t), t, I_n(t)) =$<br />

$(x(t), \theta_n(t), I_n(t))$ of the Hamiltonian vector field generated by $H_n$ such that<br />

$$x(k) = \Psi_n^k(x_n), \qquad k \in \mathbb{Z}.$$<br />

So after a time $\tau_n = q_n^2$, the point $(x_n, (0, 0))$ drifts from 0 to 1 in the $I_1$-direction, and<br />

this gives our drifting orbit.<br />
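The drift of the first factor can be reproduced numerically. The sketch below iterates the map $\psi_q = \Phi^{q^{-1} U} \circ (\Phi^{\frac{1}{2} I^2})^q$ from the point (0, 0). It assumes the usual conventions, which are not spelled out in this section: the twist $(\theta, I) \mapsto (\theta + I, I)$ for the time-one map of $\frac{1}{2} I^2$, and the kick $(\theta, I) \mapsto (\theta, I - q^{-1} U'(\theta))$ for the time-one map of $q^{-1} U$. The function names are hypothetical.

```python
import math


def psi(q, theta, action):
    """One application of psi_q, with U(theta) = -sin(2*pi*theta) / (2*pi)."""
    # q iterates of the twist map (theta, I) -> (theta + I, I)
    theta = theta + q * action
    # kick by the time-one map of U/q: I -> I - U'(theta)/q = I + cos(2*pi*theta)/q
    action = action + math.cos(2 * math.pi * theta) / q
    return theta % 1.0, action


def drift(q):
    """Apply psi_q exactly q times to the point (0, 0) and return the final point."""
    theta, action = 0.0, 0.0
    for _ in range(q):
        theta, action = psi(q, theta, action)
    return theta, action


if __name__ == "__main__":
    for q in (5, 50, 500):
        print(q, drift(q)[1])
```

Each application of $\psi_q$ returns the angle to 0 (modulo 1) and raises the action by 1/q, so q applications move the action from 0 to 1 up to rounding. Since one application of $\psi_q$ already stands for q iterates of the underlying map, this matches the $q_n^2$ iterations counted above.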

Now let $\varepsilon_n = |f_n|_{\alpha,L}$ be the size of our perturbation. Using the estimates (79)<br />

and (81) one finds<br />

$$N_n^{-2} \leq \varepsilon_n \leq c_3 N_n^{-2}. \qquad (82)$$<br />



By the prime number theorem, $p_n$ is equivalent to $n \ln n$, so there exists $n_0 \in \mathbb{N}^*$ such<br />

that for $n \geq n_0$, one can ensure that<br />

$$p_{2n}/4 \leq p_{n+i} \leq p_{2n}, \qquad 3 \leq i \leq n,$$<br />

which gives<br />

$$(p_{2n}/4)^{n-2} \leq N_n \leq p_{2n}^{n-2}, \qquad N_n^{\frac{1}{n-2}} \leq p_{2n} \leq 4 N_n^{\frac{1}{n-2}}. \qquad (83)$$<br />

We can also assume by the prime number theorem that for $n \geq n_0$, one has<br />

$$2n \ln 2n \leq p_{2n} \leq 2(2n \ln 2n) = 4n \ln 2n. \qquad (84)$$<br />

From the above estimates (83) and (84) one easily obtains<br />

$$e^{(n-2) \ln(2^{-1} n \ln 2n)} \leq N_n \leq e^{(n-2) \ln(4n \ln 2n)}, \qquad (85)$$<br />

and, together with (82), one finds<br />

$$e^{-2(n-2) \ln(4n \ln 2n)} \leq \varepsilon_n \leq c_3 e^{-2(n-2) \ln(2^{-1} n \ln 2n)}. \qquad (86)$$<br />
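The prime number bounds used above can be spot-checked on a finite range. The following Python sketch is only an illustration: it assumes the indexing $p_0 = 2$, it tests the first inequality of (83) and the inequality (84) for 3 ≤ n ≤ 200, and it says nothing about the threshold $n_0$ of the proof, which is left unspecified in the text. The helper names are hypothetical.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # clear all multiples of i starting at i*i
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


P = primes_up_to(3000)  # P[j] plays the role of p_j, with p_0 = 2


def check_bounds(n):
    """True if p_{2n}/4 <= p_{n+i} <= p_{2n} for 3 <= i <= n
    and 2n ln 2n <= p_{2n} <= 4n ln 2n."""
    p2n = P[2 * n]
    x = 2 * n * math.log(2 * n)
    ok_84 = x <= p2n <= 2 * x
    ok_83 = all(p2n / 4 <= P[n + i] <= p2n for i in range(3, n + 1))
    return ok_84 and ok_83


if __name__ == "__main__":
    print(all(check_bounds(n) for n in range(3, 201)))
```

On this range both families of inequalities hold, which is consistent with the asymptotic statement but is of course not a proof of it.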

Finally it remains to estimate the time $\tau_n$. First recall that<br />

$$M_n = 2\left[c_1 N_n e^{c_2 (n-2) p_{2n}^{1/\alpha}}\right],$$<br />

and with (83)<br />

$$q_n = N_n M_n \leq 3 c_1 N_n^2 e^{4 c_2 (n-2) N_n^{\frac{1}{\alpha(n-2)}}}.$$<br />

Hence<br />

$$q_n^2 \leq 9 c_1^2 N_n^4 e^{8 c_2 (n-2) N_n^{\frac{1}{\alpha(n-2)}}}.$$<br />

Then using (85) we have<br />

$$N_n^{\frac{1}{\alpha(n-2)}} \leq (4n \ln 2n)^{\frac{1}{\alpha}},$$<br />

and from (82) we know that<br />

$$N_n^4 \leq \left( \frac{c_3}{\varepsilon_n} \right)^2,$$<br />

so we obtain<br />

$$q_n^2 \leq 9 c_1^2 \left( \frac{c_3}{\varepsilon_n} \right)^2 e^{8 c_2 (n-2)(4n \ln 2n)^{1/\alpha}}.$$<br />

Now taking $n_0$ larger if necessary, as α > 1, one can ensure that for $n \geq n_0$,<br />

$$(4n)^{\frac{1}{\alpha}} \leq n, \qquad (\ln 2n)^{\frac{1}{\alpha}} \leq \ln(2^{-1} n \ln 2n),$$<br />

so<br />

$$8 c_2 (n-2)(4n \ln 2n)^{\frac{1}{\alpha}} \leq 8 c_2 (n-2) n \ln(2^{-1} n \ln 2n).$$<br />

Therefore<br />

$$q_n^2 \leq 9 c_1^2 \left( \frac{c_3}{\varepsilon_n} \right)^2 e^{8 c_2 n (n-2) \ln(2^{-1} n \ln 2n)} \leq 9 c_1^2 \left( \frac{c_3}{\varepsilon_n} \right)^2 \left( e^{2(n-2) \ln(2^{-1} n \ln 2n)} \right)^{4 c_2 n}.$$<br />

Finally by (86) we obtain<br />

$$q_n^2 \leq 9 c_1^2 \left( \frac{c_3}{\varepsilon_n} \right)^2 \left( \frac{c_3}{\varepsilon_n} \right)^{4 c_2 n} \leq C \left( \frac{c}{\varepsilon_n} \right)^{n\gamma}$$<br />

with $C = 9 c_1^2$, $c = c_3$ and $\gamma = 2 + 4 c_2$. This ends the proof.<br />

8.A Gevrey functions<br />

In this very short appendix, we recall some facts about Gevrey functions that we used<br />

in the text. We refer to [MS02], Appendix A, for more details.<br />

The most important property of α-Gevrey functions is the existence, for α > 1, of<br />

bump functions.<br />

Lemma 8.8. Let α > 1 and L > 0. There exists a non-negative 1-periodic function<br />

$\varphi_{\alpha,L} \in G^{\alpha,L}([-\frac{1}{2}, \frac{1}{2}])$ whose support is included in $[-\frac{1}{4}, \frac{1}{4}]$ and such that $\varphi_{\alpha,L}(0) = 1$ and<br />

$\varphi'_{\alpha,L}(0) = 0$.<br />

The following estimate on the product of Gevrey functions follows easily from the<br />

Leibniz formula.<br />

Lemma 8.9. Let L > 0, and f, g ∈ G α,L (T n × B). Then<br />

|fg|α,L ≤ |f|α,L|g|α,L.<br />

Finally, estimates on the composition of Gevrey functions are much more difficult<br />

(see Proposition A.1 in [MS02]), but here we shall only need the following statement.<br />

Lemma 8.10. Let α ≥ 1, Λ1 > 0, L1 > 0, and I, J be compact intervals of R. Let<br />

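$f \in G^{\alpha,\Lambda_1}(I)$, $g \in G^{\alpha,L_1}(J)$ and assume $g(J) \subseteq I$. If<br />

$$|g|_{\alpha,L_1} \leq \Lambda_1^{\alpha},$$<br />

then $f \circ g \in G^{\alpha,L_1}(J)$ and<br />

$$|f \circ g|_{\alpha,L_1} \leq |f|_{\alpha,\Lambda_1}.$$<br />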




References<br />

[AA89] V.I. Arnold and A. Avez, Ergodic problems of classical mechanics. Transl.<br />

from the French. Reprint, Advanced Book Classics. Redwood City, CA etc.:<br />

Addison-Wesley Publishing Company, Inc. xvii, 286 pp., 1989.<br />

[AKN06] V.I. Arnold, V.V. Kozlov, and A.I. Neishtadt, Mathematical aspects of classical<br />

and celestial mechanics, [Dynamical Systems III], Transl.<br />

from the Russian original by E. Khukhro, Third edition, Encyclopedia of<br />

Mathematical Sciences 3, Springer-Verlag, Berlin, 2006.<br />

[Alb07] J. Albrecht, On the existence of invariant tori in nearly-integrable Hamiltonian<br />

systems with finitely differentiable perturbations, Regul. Chaotic Dyn.<br />

12 (2007), no. 3, 281–320.<br />

[AR67] R. Abraham and J. Robbin, Transversal mappings and flows, Benjamin,<br />

New-York, 1967.<br />

[Arn61] V.I. Arnol’d, The stability of the equilibrium position of a Hamiltonian system<br />

of ordinary differential equations in the general elliptic case, Sov. Math.,<br />

Dokl. 2 (1961), 247–249.<br />

[Arn63a] V.I. Arnol’d, Proof of a theorem of A.N. Kolmogorov on the invariance of quasiperiodic<br />

motions under small perturbations, Russ. Math. Surv. 18 (1963),<br />

no. 5, 9–36.<br />

[Arn63b] V.I. Arnol’d, Small denominators and problems of stability of motion in classical<br />

and celestial mechanics, Russ. Math. Surv. 18 (1963), no. 6, 85–191.<br />

[Arn64] V.I. Arnold, Instability of dynamical systems with several degrees of freedom,<br />

Sov. Math. Doklady 5 (1964), 581–585.<br />

[Bam99] D. Bambusi, Nekhoroshev theorem for small amplitude solutions in nonlinear<br />

Schrödinger equations, Math. Z. 230 (1999), no. 2, 345–387.<br />

[BBB03] M. Berti, L. Biasco, and P. Bolle, Drift in phase space: a new variational<br />

mechanism with optimal diffusion time, J. Math. Pures Appl. (9) 82 (2003),<br />

no. 6, 613–664.<br />

[Ben05] G. Benettin, The elements of hamiltonian perturbation theory., Adv. Astron.<br />

Astrophys. (2005), 1–98.<br />

[Ber96] P. Bernard, Perturbation of a partially hyperbolic hamiltonian system (Perturbation<br />

d’un hamiltonien partiellement hyperbolique), C. R. Math. Acad.<br />

Sci. Paris 323 (1996), no. 2, 189–194.<br />

[Ber06] P. Bernard, Hamiltonian systems: Stability and Instability Theory, Encyclopedia<br />

of Mathematical Physics, editors: J.-P. Francoise, G.L. Naber and Tsou<br />

S.T. Oxford: Elsevier, volume 2, page 261, 2006.<br />

[Ber08] P. Bernard, The dynamics of pseudographs in convex Hamiltonian systems,<br />

Journal of the American Math. Society 21 (2008), no. 3, 615–669.<br />

[Ber09] P. Bernard, Large normally hyperbolic cylinders in a priori stable Hamiltonian<br />

systems, Preprint (2009).<br />

[Bes96] U. Bessi, An approach to Arnold’s diffusion through the calculus of variations,<br />

Nonlinear Anal., Theory Methods Appl. 26 (1996), no. 6, 1115–1135.<br />

[Bes97a] U. Bessi, Arnold’s diffusion with two resonances, J. Differ. Equations 137<br />

(1997), no. 2, 211–239.<br />

[Bes97b] U. Bessi, Arnold’s example with three rotators, Nonlinearity 10 (1997), no. 3,<br />

763–781.<br />

[BG93] D. Bambusi and A. Giorgilli, Exponential stability of states close to resonance<br />

in infinite-dimensional hamiltonian systems, J. Statist. Phys. 71<br />

(1993), no. 3-4, 569–606.<br />

[BGG85] G. Benettin, L. Galgani, and A. Giorgilli, A proof of Nekhoroshev’s theorem<br />

for the stability times in nearly integrable Hamiltonian systems, Celestial<br />

Mech. 37 (1985), 1–25.<br />

[BGGS84] G. Benettin, L. Galgani, A. Giorgilli, and J.-M. Strelcyn, A proof of Kolmogorov’s<br />

theorem on invariant tori using canonical transformations defined<br />

by the Lie method, Il Nuovo Cimento B 79 (1984), no. 2, 201–223.<br />

[Bir66] G.D. Birkhoff, Dynamical systems, American Mathematical Society, Providence,<br />

R.I., 1966.<br />

[BK87] D. Bernstein and A. Katok, Birkhoff periodic orbits for small perturbations of<br />

completely integrable Hamiltonian systems with convex Hamiltonian, Invent.<br />

Math. 88 (1987), 225–241.<br />

[BK05] J. Bourgain and V. Kaloshin, On diffusion in high-dimensional Hamiltonian<br />

systems, J. Funct. Anal. 229 (2005), no. 1, 1–61.<br />

[BM10] A. Bounemoura and J.-P. Marco, Improved exponential stability for quasiconvex<br />

Hamiltonians, submitted (2010), http://arxiv.org/abs/1004.1462.<br />

[BN02] D. Bambusi and N.N. Nekhoroshev, Long time stability in perturbations of<br />

completely resonant PDE’s, Acta Appl. Math. 70 (2002), no. 3, 1–22.<br />

[BN09] A. Bounemoura and L. Niederman, Generic Nekhoroshev theory without<br />

small divisors, submitted (2009), http://arxiv.org/abs/0912.3725.<br />

[Bos86] J.-B. Bost, Tores invariants des systèmes dynamiques Hamiltoniens, Séminaire<br />

Bourbaki 133-134 (1986), 113–157.<br />

[Bou04] J. Bourgain, Remarks on stability and diffusion in high-dimensional Hamiltonian<br />

systems and partial differential equations, Erg. Th. Dyn. Sys. 24<br />

(2004), no. 5, 1331–1357.<br />

[Bou08] A. Bounemoura, The simplicity of surface transformation groups (Simplicité<br />

des groupes de transformations de surfaces), Ensaios Matemáticos 14. Rio<br />

de Janeiro: Sociedade Brasileira de Matemática. 147 pp., 2008.<br />

[Bou09a] A. Bounemoura, Arnold diffusion along two resonances, In preparation (2009).<br />

[Bou09b] A. Bounemoura, Generic super-exponential stability of invariant tori,<br />

accepted to Ergodic Theory and Dynamical Systems (2009),<br />

http://arxiv.org/abs/0912.3600.<br />

[Bou10a] A. Bounemoura, An example of instability in high-dimensional Hamiltonian systems,<br />

submitted (2010), http://arxiv.org/abs/1006.2296.<br />

[Bou10b] A. Bounemoura, Nekhoroshev theory for finitely differentiable quasi-convex Hamiltonians,<br />

Journal of Differential Equations 249 (2010), no. 11, 2905–2920.<br />

[Cas57] J.W.S. Cassels, An introduction to Diophantine approximations, Cambridge<br />

Tracts in Mathematics and Mathematical Physics, Cambridge University<br />

Press, 1957.<br />

[CC07] A. Celletti and L. Chierchia, KAM stability and celestial mechanics, Mem.<br />

Am. Math. Soc. 878 (2007), 134 pp.<br />

[CG82] L. Chierchia and G. Gallavotti, Smooth prime integrals for quasi-integrable<br />

Hamiltonian systems, Il Nuovo Cimento B, 67 (1982), no. 2, 277–295.<br />

[CG94] L. Chierchia and G. Gallavotti, Drift and diffusion in phase space, Ann. Inst. Henri Poincaré, Phys.<br />

Théor. 60 (1994), no. 1, 1–144.<br />

[CG03] J. Cresson and C. Guillet, Periodic orbits and Arnold diffusion, Discrete<br />

Contin. Dyn. Syst. 9 (2003), no. 2, 451–470 (English).<br />

[Cha04] M. Chaperon, Stable manifolds and the Perron-Irwin method, Erg. Th. Dyn.<br />

Sys. 24 (2004), no. 5, 1359–1394.<br />

[Cha08] M. Chaperon, The lipschitzian core of some invariant manifold theorems, Erg. Th.<br />

Dyn. Sys. 28 (2008), no. 5, 1419–1441.<br />

[Che89] A. Chenciner, Intégration du problème de Kepler par la méthode de Hamilton-Jacobi<br />

: coordonnées action-angle de Delaunay, Notes scientifiques et techniques<br />

du Bureau des Longitudes S026 (1989), Paris.<br />

[Che99] C.-Q. Cheng, Lower dimensional invariant tori in the regions of instability<br />

for nearly integrable Hamiltonian systems, Commun. Math. Phys. 203<br />

(1999), no. 2, 385–419.<br />

[Chi79] B.V. Chirikov, A universal instability of many-dimensional oscillator systems,<br />

Phys. Reports 52 (1979), 263–379.<br />

[Chr73] J.P.R. Christensen, On sets of Haar measure zero in abelian Polish groups,<br />

Isr. J. Math. 13 (1973), 255–260.<br />

[Cre97] J. Cresson, A λ-lemma for partially hyperbolic tori and the obstruction property.,<br />

Lett. Math. Phys. 42 (1997), no. 4, 363–377.<br />

[Cre01] J. Cresson, Time of instability for initially hyperbolic Hamiltonian systems<br />

(temps d’instabilité des systèmes hamiltoniens initialement hyperboliques),<br />

C. R. Math. Acad. Sci. Paris 332 (2001), no. 9, 831–834.



[CW99] C.-Q. Cheng and S. Wang, The surviving of lower dimensional tori from<br />

a resonant torus of Hamiltonian systems, J. Differ. Equations 155 (1999),<br />

no. 2, 311–326.<br />

[CY04] C.-Q. Cheng and J. Yan, Existence of diffusion orbits in a priori unstable<br />

Hamiltonian systems, J. Differ. Geom. 67 (2004), no. 3, 457–517.<br />

[CY09] C.-Q. Cheng and J. Yan, Arnold diffusion in Hamiltonian systems: a priori unstable case, J.<br />

Differ. Geom. 82 (2009), no. 2, 229–277.<br />

[DdlLS06] A. Delshams, R. de la Llave, and T.M. Seara, A geometric mechanism for diffusion<br />

in Hamiltonian systems overcoming the large gap problem: Heuristics<br />

and rigorous verification on a model, Mem. Am. Math. Soc. 844 (2006),<br />

141 pp. (English).<br />

[DG96a] A. Delshams and P. Gutiérrez, Effective stability and KAM theory, J. Differ.<br />

Equations 128 (1996), no. 2, 415–490.<br />

[DG96b] A. Delshams and P. Gutiérrez, Estimates on invariant tori near an elliptic equilibrium point of a<br />

Hamiltonian system, J. Differ. Equations 131 (1996), 277–303.<br />

[DH09] A. Delshams and G. Huguet, Geography of resonances and Arnold diffusion<br />

in a priori unstable Hamiltonian systems, Nonlinearity 22 (2009), no. 8,<br />

1997–2077.<br />

[DLC83] R. Douady and P. Le Calvez, Exemple de point fixe elliptique non<br />

topologiquement stable en dimension 4, C. R. Acad. Sci. Paris 296 (1983),<br />

895–898.<br />

[dlL01] R. de la Llave, A tutorial on KAM theory, Katok, Anatole (ed.) et al.,<br />

Smooth ergodic theory and its applications (Seattle, WA, 1999). Providence,<br />

RI: Amer. Math. Soc. (AMS). Proc. Symp. Pure Math. 69, 175-292, 2001.<br />

[Dou88] R. Douady, Stabilité ou instabilité des points fixes elliptiques, Ann. Sci. Ec.<br />

Norm. Sup 21 (1988), no. 1, 1–46.<br />

[Dui80] J.J. Duistermaat, On global action-angle coordinates, Comm. Pure Appl.<br />

Math. 33 (1980), no. 6, 687–706.<br />

[Eas78] R. W. Easton, Homoclinic phenomena in Hamiltonian systems with several<br />

degrees of freedom, J. Differ. Equations 29 (1978), 241–252.<br />

[Eas81] R. W. Easton, Orbit structure near trajectories biasymptotic to invariant tori,<br />

Classical mechanics and dynamical systems, NSF-CBMS reg. Conf., Tufts<br />

Univ. 1979, Lect. Notes Pure Appl. Math. 70, 55-67, 1981.<br />

[Eli88] L.H. Eliasson, Perturbations of stable invariant tori for Hamiltonian systems,<br />

Ann. Scuola Norm. Sup. Pisa 15 (1988), no. 1, 115–147.<br />

[Eli96] L.H. Eliasson, Absolutely convergent series expansions for quasi periodic motions,<br />

Math. Phys. Electronic. J. (1996).<br />

[ESM10] L. El Sabbagh and J.P. Marco, A λ-lemma for normally hyperbolic invariant<br />

manifolds and some applications, In preparation (2010).



[Fas90] F. Fassò, Lie series method for vector fields and Hamiltonian perturbation<br />

theory, Z. Angew. Math. Phys. 41 (1990), no. 6, 843–864 (English).<br />

[FGB98] F. Fassò, M. Guzzo, and G. Benettin, Nekhoroshev-stability of elliptic equilibria<br />

of Hamiltonian systems, Comm. Math. Phys. 197 (1998), no. 2, 347–360.<br />

[Féj04] J. Féjoz, Démonstration du théorème d’Arnold sur la stabilité du système<br />

planétaire (d’après Herman), Erg. Th. Dyn. Sys. 24 (2004), 1521–1582.<br />

[Féj10] J. Féjoz, A simple proof of the invariant tori theorem, preprint (2010).<br />

[FM00] E. Fontich and P. Martín, Differentiable invariant manifolds for partially<br />

hyperbolic tori and a lambda lemma, Nonlinearity 13 (2000), no. 5, 1561–<br />

1593.<br />

[FM01] E. Fontich and P. Martín, Arnold diffusion in perturbations of analytic integrable Hamiltonian<br />

systems, Discrete Contin. Dyn. Syst. 7 (2001), no. 1, 61–84.<br />

[GDF + 89] A. Giorgilli, A. Delshams, E. Fontich, L. Galgani, and C. Simó, Effective<br />

stability for a Hamiltonian system near an elliptic equilibrium point, with<br />

an application to the restricted three body problem, J. Differ. Equations 77<br />

(1989), 167–198.<br />

[GFB98] M. Guzzo, F. Fassò, and G. Benettin, On the stability of elliptic equilibria,<br />

Math. Phys. Electron. J. 4 (1998), 16 pp., paper 1.<br />

[GG85] A. Giorgilli and L. Galgani, Rigorous estimates for the series expansions of<br />

Hamiltonian perturbation theory, Cel. Mech. 37 (1985), 95–112.<br />

[GR07] M. Gidea and C. Robinson, Shadowing orbits for transition chains of invariant<br />

tori alternating with Birkhoff zones of instability, Nonlinearity 20<br />

(2007), no. 5, 1115–1143.<br />

[GR09] M. Gidea and C. Robinson, Obstruction argument for transition chains of tori interspersed with<br />

gaps, Discrete Contin. Dyn. Syst., Ser. S 2 (2009), no. 2, 393–416.<br />

[Gra74] S. M. Graff, On the conservation of hyperbolic invariant tori for Hamiltonian<br />

systems, J. Differ. Equations 15 (1974), 1–69.<br />

[Hal95] G. Haller, Diffusion at intersecting resonances in Hamiltonian systems,<br />

Phys. Lett., A 200 (1995), no. 1, 34–42.<br />

[Hal97] G. Haller, Universal homoclinic bifurcations and chaos near double resonances,<br />

J. Stat. Phys. 86 (1997), no. 5-6, 1011–1051.<br />

[Her86] M.-R. Herman, Sur les courbes invariantes par les difféomorphismes de<br />

l’anneau, Vol. 2. (French) With a correction to: On the curves invariant<br />

under diffeomorphisms of the annulus, Vol. 1 (French), Astérisque No. 144,<br />

248 pp., 1986.<br />

[Her98] M. Herman, Some open problems in dynamical systems, Doc. Math., J.<br />

DMV, Extra Vol. ICM Berlin 1998, vol. II, 1998, pp. 797–808.



[HPS77] M.W. Hirsch, C.C. Pugh, and M. Shub, Invariant Manifolds, Springer,<br />

Berlin, 1977.<br />

[HSY92] B.R. Hunt, T. Sauer, and J.A. Yorke, Prevalence: a translation-invariant<br />

“almost every” on infinite-dimensional spaces, Bull. of the Amer. Math. Soc.<br />

27 (1992), 217–238.<br />

[HZ94] H. Hofer and E. Zehnder, Symplectic invariants and Hamiltonian dynamics,<br />

Birkhäuser Advanced Texts Verlag, Basel, 1994.<br />

[Ily86] I.S. Ilyashenko, A steepness test for analytic functions, Russian Math. Surveys<br />

41 (1986), 229–230.<br />

[JV97] À. Jorba and J. Villanueva, On the normal behaviour of partially elliptic<br />

lower dimensional tori of Hamiltonian systems, Nonlinearity 10 (1997), 783–<br />

822.<br />

[KH08] V. Kaloshin and B. Hunt, Prevalence, to appear in Handbook of Dynamical<br />

Systems, Volume 3, edited by Henk Broer, Boris Hasselblatt and Floris<br />

Takens, 2008.<br />

[KL08a] V. Kaloshin and M. Levi, An example of Arnold diffusion for near-integrable<br />

Hamiltonians, Bull. Amer. Math. Soc. (N.S.) 45 (2008), no. 3, 409–427.<br />

[KL08b] V. Kaloshin and M. Levi, Geometry of Arnold diffusion, SIAM Rev. 50 (2008), no. 4, 702–720.<br />

[KLDM06] K. Khanin, J. Lopes Dias, and J. Marklof, Renormalization of multidimensional<br />

Hamiltonian flows, Nonlinearity 19 (2006), no. 12, 2727–2753.<br />

[KLDM07] K. Khanin, J. Lopes Dias, and J. Marklof, Multidimensional continued fractions, dynamical renormalization<br />

and KAM theory, Comm. Math. Phys. 270 (2007), no. 1, 197–231.<br />

[KLS10] V. Kaloshin, M. Levi, and M. Saprykina, An example of nearly integrable<br />

Hamiltonian system with a trajectory dense in a set of maximal Hausdorff<br />

dimension, Preprint (2010).<br />

[KMV04] V. Kaloshin, J. N. Mather, and E. Valdinoci, Instability of resonant totally<br />

elliptic points of symplectic maps in dimension 4, Loday-Richaud, Michèle<br />

(ed.), Analyse complexe, systèmes dynamiques, sommabilité des séries divergentes<br />

et théories galoisiennes. II. Volume en l’honneur de Jean-Pierre<br />

Ramis. Paris: Société Mathématique de France. Astérisque 297, 79-116 ,<br />

2004.<br />

[Kol54] A.N. Kolmogorov, On the preservation of conditionally periodic motions for<br />

a small change in Hamilton’s function, Dokl. Akad. Nauk. SSSR 98 (1954),<br />

527–530.<br />

[Koz96] V.V. Kozlov, Symmetries, Topology, and Resonance in Hamiltonian Mechanics,<br />

Springer-Verlag, Berlin, Heidelberg, 1996.<br />

[KP94] S. Kuksin and J. Pöschel, On the inclusion of analytic symplectic maps<br />

in analytic Hamiltonian flows and its applications, Seminar on dynamical<br />

systems (1994), 96–116, Birkhäuser, Basel.



[KT09] K. Khanin and A. Teplinsky, Herman’s theory revisited, Invent. Math. 178<br />

(2009), no. 2, 333–344.<br />

[KZZ09] V. Kaloshin, K. Zhang, and Y. Zheng, Almost dense orbit on energy surface,<br />

Proceedings of the XVI-th ICMP, Prague, 314-322, 2009.<br />

[Lan02] S. Lang, Algebra, Graduate Texts in Mathematics 211, Springer-Verlag,<br />

New York, 2002.<br />

[LM88] P. Lochak and C. Meunier, Multiphase averaging for classical systems. With<br />

applications to adiabatic theorems. Transl. from the French by H. S. Dumas,<br />

Applied Mathematical Sciences, 72, New York etc, Springer-Verlag. xi, 360<br />

pp., 1988.<br />

[LM01] M. Levi and J. Moser, A Lagrangian proof of the invariant curve theorem for<br />

twist mappings, Katok, Anatole (ed.) et al., Smooth ergodic theory and its<br />

applications (Seattle, WA, 1999). Providence, RI: Amer. Math. Soc. (AMS).<br />

Proc. Symp. Pure Math. 69, 733–746, 2001.<br />

[LM05] P. Lochak and J.-P. Marco, Diffusion times and stability exponents for nearly<br />

integrable analytic systems, Central European Journal of Mathematics 3<br />

(2005), no. 3, 342–397.<br />

[LMS03] P. Lochak, J.-P. Marco, and D. Sauzin, On the splitting of invariant manifolds<br />

in multidimensional near-integrable Hamiltonian systems, Mem. Am.<br />

Math. Soc. 775 (2003), 145 pp.<br />

[LN92] P. Lochak and A.I. Neishtadt, Estimates of stability time for nearly integrable<br />

systems with a quasiconvex Hamiltonian, Chaos 2 (1992), no. 4, 495–499.<br />

[LNN94] P. Lochak, A.I. Neishtadt, and L. Niederman, Stability of nearly integrable<br />

convex Hamiltonian systems over exponentially long times, Kuksin, S. (ed.)<br />

et al., Seminar on dynamical systems. Basel: Birkhäuser. Prog. Nonlinear<br />

Differ. Equ. Appl. 12, 15–34, 1994.<br />

[Loc92] P. Lochak, Canonical perturbation theory via simultaneous approximation,<br />

Russ. Math. Surv. 47 (1992), no. 6, 57–133.<br />

[Loc93] , Hamiltonian perturbation theory: periodic orbits, resonances and<br />

intermittency, Nonlinearity 6 (1993), no. 6, 885–904.<br />

[Loc95] P. Lochak, Stability of Hamiltonian systems over exponentially long<br />

times: The near-linear case, Dumas, H. S. (ed.) et al., Hamiltonian dynamical<br />

systems: history, theory, and applications. Proceedings of the international<br />

conference held at the University of Cincinnati, OH (USA), March<br />

1992. New York, NY: Springer-Verlag. IMA Vol. Math. Appl. 63, 221–229,<br />

1995.<br />

[Loc99] , Arnold diffusion; a compendium of remarks and questions, Simó,<br />

Carles (ed.), Hamiltonian systems with three or more degrees of freedom.<br />

Proceedings of the NATO Advanced Study Institute, 1995. Dordrecht:<br />

Kluwer Academic Publishers, 1999.



[Mar96] J.-P. Marco, Transition le long de chaînes de tores invariants pour les systèmes<br />

Hamiltoniens analytiques, Ann. Inst. Henri Poincaré Phys. Théor. 64 (1996), no. 2, 205–252.<br />

[Mar05] , Uniform lower bounds of the splitting for analytic symplectic systems,<br />

preprint (2005).<br />

[Mar08] , Models for skew-products and polysystems, C. R. Math. Acad. Sci.<br />

Paris (2008), no. 3-4, 203–208.<br />

[Mat91] J. N. Mather, Action minimizing invariant measures for positive definite<br />

Lagrangian systems, Math. Z. 207 (1991), no. 2, 169–207.<br />

[Mat93] , Variational construction of connecting orbits, Ann. Inst. Fourier 43<br />

(1993), no. 5, 1349–1386.<br />

[Mat04] J. N. Mather, Arnold diffusion I: Announcement of results, J. of Math. Sciences<br />

124 (2004), no. 5, 5275–5289.<br />

[Mel65] V.K. Melnikov, On some cases of conservation of conditionally periodic motions<br />

under a small change of the Hamilton function, Soviet Math. Doklady<br />

6 (1965), 1592–1596.<br />

[MF78] A.S. Mishchenko and A.T. Fomenko, Generalized Liouville method of integration<br />

of Hamiltonian systems, Funct. Anal. Appl. 12 (1978), 113–121.<br />

[MG95] A. Morbidelli and A. Giorgilli, Superexponential stability of KAM tori, J.<br />

Stat. Phys. 78 (1995), 1607–1617.<br />

[MG96] A. Morbidelli and M. Guzzo, The Nekhoroshev theorem and the asteroid belt<br />

dynamical system, Celestial Mech. Dynam. Astronom. 65 (1996), no. 1-2,<br />

107–136.<br />

[Moe02] R. Moeckel, Generic drift on Cantor sets of annuli, Chenciner, Alain (ed.)<br />

et al., Celestial mechanics. Dedicated to Donald Saari for his 60th birthday.<br />

Proceedings of an international conference, Northwestern Univ., Evanston,<br />

IL, USA, December 15–19, 1999. Providence, RI: AMS, American Mathematical<br />

Society. Contemp. Math. 292, 163–171, 2002.<br />

[Mos62] J. Moser, On Invariant curves of Area-Preserving Mappings of an Annulus,<br />

Nachr. Akad. Wiss. Göttingen II (1962), 1–20.<br />

[Mos67] , Convergent series expansions for quasi-periodic motions, Math.<br />

Ann. 169 (1967), 136–176.<br />

[Mos73] , Stable and random motions in dynamical systems. With special emphasis<br />

on celestial mechanics. Hermann Weyl Lectures. The Institute for<br />

Advanced Study, Annals of Mathematics Studies. No.77. Princeton, N. J.:<br />

Princeton University Press and University of Tokyo Press. VIII, 199 pp.,<br />

1973.<br />

[MS02] J.-P. Marco and D. Sauzin, Stability and instability for Gevrey quasi-convex<br />

near-integrable Hamiltonian systems, Publ. Math. Inst. Hautes Études Sci.<br />

96 (2002), 199–275.



[MS04] , Wandering domains and random walks in Gevrey near-integrable<br />

systems, Erg. Th. Dyn. Sys. 24 (2004), no. 5, 1619–1666.<br />

[Nei84] A.I. Neishtadt, The separation of motions in systems with rapidly rotating<br />

phase, J. Appl. Math. Mech. 48 (1984), no. 2, 133–139.<br />

[Nek72] N.N. Nekhoroshev, Action-angle variables and their generalizations, Trans.<br />

Mosc. Math. Soc. 26 (1972), 180–198.<br />

[Nek77] , An exponential estimate of the time of stability of nearly integrable<br />

Hamiltonian systems, Russian Math. Surveys 32 (1977), no. 6, 1–65.<br />

[Nek79] , An exponential estimate of the time of stability of nearly integrable<br />

Hamiltonian systems II, Trudy Sem. Petrovs 5 (1979), 5–50.<br />

[Nie96] L. Niederman, Stability over exponentially long times in the planetary problem,<br />

Nonlinearity 9 (1996), no. 6, 1703–1751.<br />

[Nie98] , Nonlinear stability around an elliptic equilibrium point in a Hamiltonian<br />

system, Nonlinearity 11 (1998), no. 6, 1465–1479.<br />

[Nie04] , Exponential stability for small perturbations of steep integrable<br />

Hamiltonian systems, Erg. Th. Dyn. Sys. 24 (2004), no. 2, 593–608.<br />

[Nie06] , Hamiltonian stability and subanalytic geometry, Ann. Inst. Fourier<br />

56 (2006), no. 3, 795–813.<br />

[Nie07a] , Generic exponential stability of quadratic integrable Hamiltonian<br />

systems, Unpublished (2007).<br />

[Nie07b] , Prevalence of exponential stability among nearly integrable Hamiltonian<br />

systems, Erg. Th. Dyn. Sys. 27 (2007), no. 3, 905–928.<br />

[Nie09] , Nekhoroshev theory, Springer Encyclopedia of Complexity and Systems<br />

Science, 2009.<br />

[OY05] W. Ott and J.A. Yorke, Prevalence, Bull. Amer. Math. Soc. (N.S.) 42 (2005),<br />

no. 3, 263–290.<br />

[PM03] R. Pérez-Marco, Convergence or generic divergence of the Birkhoff normal<br />

form, The Annals of Mathematics 157 (2003), no. 2, 557–574.<br />

[Pop00] G. Popov, Invariant tori, effective stability, and quasimodes with exponentially<br />

small error terms. I. Birkhoff normal forms, Ann. Henri Poincaré 1<br />

(2000), no. 2, 223–248.<br />

[Pop04] , KAM theorem for Gevrey Hamiltonians, Erg. Th. Dyn. Sys. 24<br />

(2004), no. 5, 1753–1786.<br />

[Pös82] J. Pöschel, Integrability of Hamiltonian systems on Cantor sets, Comm. Pure<br />

Appl. Math. 35 (1982), no. 5, 653–696.<br />

[Pös89] , On elliptic lower dimensional tori in Hamiltonian systems, Math.<br />

Z. 202 (1989), no. 4, 559–608.



[Pös93] , Nekhoroshev estimates for quasi-convex Hamiltonian systems,<br />

Math. Z. 213 (1993), 187–216.<br />

[Pös99a] , On Nekhoroshev estimates for a nonlinear Schrödinger equation and<br />

a theorem by Bambusi, Nonlinearity 12 (1999), no. 6, 1587–1600.<br />

[Pös99b] , On Nekhoroshev’s estimate at an elliptic equilibrium, Internat.<br />

Math. Res. Notices 4 (1999), 203–215.<br />

[Pös01] , A lecture on the classical KAM theory, Katok, Anatole (ed.) et al.,<br />

Smooth ergodic theory and its applications (Seattle, WA, 1999). Providence,<br />

RI: Amer. Math. Soc. (AMS). Proc. Symp. Pure Math. 69, 707–732, 2001.<br />

[Pös09] , KAM à la R, Preprint (2009).<br />

[PT97] A.V. Pronin and D.V. Treschev, On the inclusion of analytic maps into<br />

analytic flows, Regular and Chaotic Dynamics 2 (1997), no. 2, 14–24.<br />

[PT07] G.N. Piftankin and D.V. Treschev, Separatrix maps in Hamiltonian systems,<br />

Russ. Math. Surv. 62 (2007), no. 2, 219–322.<br />

[PW94] A.D. Perry and S. Wiggins, KAM tori are very sticky: rigorous lower bounds<br />

on the time to move away from an invariant Lagrangian torus with linear<br />

flow, Phys. D 71 (1994), 102–121.<br />

[RS96] J.-P. Ramis and R. Schäfke, Gevrey separation of fast and slow variables,<br />

Nonlinearity 9 (1996), no. 2, 353–384.<br />

[Rüs01] H. Rüssmann, Invariant tori in non-degenerate nearly integrable Hamiltonian<br />

systems, Regul. Chaotic Dyn. 6 (2001), no. 2, 119–204.<br />

[Rüs09] , KAM-iteration with nearly infinitely small steps in dynamical systems<br />

of polynomial character, Preprint (2009).<br />

[Sal04] D.A. Salamon, The Kolmogorov-Arnold-Moser theorem, Mathematical<br />

Physics Electronic Journal 10 (2004), 1–37.<br />

[Sau95] D. Sauzin, Résurgence paramétrique et exponentielle petitesse de l’écart des<br />

séparatrices du pendule rapidement forcé, Ann. Inst. Fourier 45 (1995), no. 2,<br />

453–511.<br />

[Sev06] Mikhail B. Sevryuk, Partial preservation of frequencies in KAM theory, Nonlinearity<br />

19 (2006), no. 5, 1099–1140.<br />

[SM71] C.L. Siegel and J.K. Moser, Lectures on Celestial Mechanics, Springer,<br />

Berlin, 1971.<br />

[Sor09] A. Sorrentino, On the integrability of Tonelli Hamiltonians, Preprint (2009).<br />

[SZ89] D.A. Salamon and E. Zehnder, KAM theory in configuration space, Comment.<br />

Math. Helv. 64 (1989), no. 1, 84–132.<br />

[Tre91] D.V. Treschev, The mechanism of destruction of resonance tori of Hamiltonian<br />

systems, Math. USSR Sb. 68 (1991), no. 1, 181–203.



[Tre04] , Evolution of slow variables in a priori unstable Hamiltonian systems,<br />

Nonlinearity 17 (2004), no. 5, 1803–1841.<br />

[YC04] Y. Yomdin and G. Comte, Tame geometry with application in smooth analysis,<br />

Lecture Notes in Mathematics, Springer-Verlag, Berlin, 2004.<br />

[Yoc95] J.-C. Yoccoz, Introduction to hyperbolic dynamics, Branner, Bodil (ed.) et<br />

al., Real and complex dynamical systems. Proceedings of the NATO Advanced<br />

Study Institute held in Hillerød, Denmark, June 20-July 2, 1993.<br />

Dordrecht: Kluwer Academic Publishers. NATO ASI Ser., Ser. C, Math.<br />

Phys. Sci. 464, 265–291, 1995.<br />

[Yom83] Y. Yomdin, The geometry of critical and near-critical values of differentiable<br />

mappings, Math. Ann. 264 (1983), 495–515.<br />

[Zeh75] E. Zehnder, Generalized implicit function theorems with applications to some<br />

small divisor problems. I, Comm. Pure Appl. Math. 28 (1975), 91–140.<br />

[Zeh76] , Generalized implicit function theorems with applications to some<br />

small divisor problems. II, Comm. Pure Appl. Math. 29 (1976), 49–111.<br />

[Zha09] K. Zhang, Speed of Arnold diffusion for analytic Hamiltonian systems,<br />

Preprint (2009).
