Nonlinear mixed-effects models - Isped

Nonlinear mixed-effects models.

Algorithms and Applications

Marc LAVIELLE

Université René Descartes and Université Paris-Sud

http://www.math.u-psud.fr/~lavielle

(in collaboration with the members of the MONOLIX group)

Seminar "Statistique et Santé Publique", Bordeaux, 8 November 2005


A pharmacokinetics example : theophylline 

12 patients:

[Figure: individual theophylline concentration curves for the 12 patients; horizontal axis from 0 to 25, vertical axis from 0 to 12.]

We would like to build a model that describes each individual curve with the same parametric model, each curve being parameterized by its own individual parameters.


One compartment model

(oral administration, first-order absorption and elimination)

D --[absorption, rate $k_a$]--> DRUG AMOUNT $Q(t)$ --[elimination, rate $k_e$]-->

$Q_a(t)$: amount at absorption site.

$$\frac{dQ}{dt}(t) = k_a Q_a(t) - k_e Q(t)$$

$$\frac{dQ_a}{dt}(t) = -k_a Q_a(t)$$

$$C(t) = \frac{Q(t)}{V} = \frac{D\, k_a k_e}{Cl\,(k_a - k_e)} \left(e^{-k_e t} - e^{-k_a t}\right)$$

$Cl$, $k_a$ and $k_e$ are individual physiological parameters.
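As a minimal sketch, the closed-form concentration of the one compartment model can be checked against a direct numerical integration of the two differential equations. The parameter values are hypothetical, the function names are illustrative, and the relation $V = Cl/k_e$ is the one implied by the closed form.

```python
import math

def concentration(t, dose, cl, ka, ke):
    """Closed-form C(t) for first-order absorption and elimination."""
    return dose * ka * ke / (cl * (ka - ke)) * (math.exp(-ke * t) - math.exp(-ka * t))

def concentration_euler(t_end, dose, cl, ka, ke, n_steps=100_000):
    """Integrate dQa/dt = -ka*Qa and dQ/dt = ka*Qa - ke*Q with explicit Euler; return Q/V."""
    v = cl / ke            # volume of distribution implied by Cl = ke * V
    h = t_end / n_steps
    qa, q = dose, 0.0      # the whole dose starts at the absorption site
    for _ in range(n_steps):
        # Both updates use the values from the previous step.
        qa, q = qa - h * ka * qa, q + h * (ka * qa - ke * q)
    return q / v

# Hypothetical parameter values, chosen only for illustration.
c_exact = concentration(5.0, dose=320.0, cl=2.8, ka=1.5, ke=0.08)
c_num = concentration_euler(5.0, dose=320.0, cl=2.8, ka=1.5, ke=0.08)
```

The two values agree up to the discretization error of the Euler scheme, which shrinks as `n_steps` grows.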


Viral load decrease during anti-HIV treatment 

15 patients: 

[Figure: individual viral load profiles for the 15 patients, followed by the pooled data; horizontal axis: time (weeks), vertical axis: log(CV) (cp/ml).]

$$L(t) = A_1 e^{-\lambda_1 t} + A_2 e^{-\lambda_2 t}, \qquad \lambda_1 > \lambda_2$$

Initial rate constant of viral load decrease: $\lambda_1$

Terminal rate constant of viral load decrease: $\lambda_2$

$A_1$, $A_2$, $\lambda_1$ and $\lambda_2$ are individual physiological parameters.
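The roles of the two rate constants can be illustrated with a short sketch: on the log10 scale, the bi-exponential curve decreases steeply at first and then flattens to a slope close to $-\lambda_2/\ln 10$. The parameter values below are hypothetical.

```python
import math

def viral_load(t, a1, a2, lam1, lam2):
    """Bi-exponential decay L(t) = A1*exp(-lam1*t) + A2*exp(-lam2*t)."""
    return a1 * math.exp(-lam1 * t) + a2 * math.exp(-lam2 * t)

def log10_slope(t, dt, *params):
    """Finite-difference slope of log10 L between t and t + dt."""
    return (math.log10(viral_load(t + dt, *params))
            - math.log10(viral_load(t, *params))) / dt

# Hypothetical values: A1, A2, lambda1, lambda2 with lambda1 > lambda2.
params = (1e5, 1e3, 0.5, 0.02)
early = log10_slope(0.0, 0.1, *params)    # driven mostly by lambda1
late = log10_slope(80.0, 0.1, *params)    # driven by lambda2 only
```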


Crop yield responses to applied fertilizer on 37 site-years

[Figure: crop yield response curves; horizontal axis from 0 to 350, vertical axis from 3 to 12.]


Evolution of the weight of 560 cows 

[Figure: individual weight curves of the cows; horizontal axis from 0 to 2000, vertical axis from 0 to 1200.]


The mixed model

We would like to describe each individual curve with the same parametric model,

considering that each curve is parameterized by its own individual parameters,

considering that these individual parameters fluctuate around a mean population parameter.


Mixed-effects models

The (nonlinear) mixed effects model

$$y_{ij} = a(t_{ij}, \phi_i) + b(t_{ij}, \phi_i)\,\varepsilon_{ij}, \qquad 1 \le i \le n,\quad 1 \le j \le n_i$$

$(y_{ij})$ are the observations.

$a$ and $b$ are known regression functions.

$(t_{ij})$ are the measurement times.

$(\varepsilon_{ij})$ are intra-individual fluctuations.

$(\phi_i)$ are the individual (random) parameters.

Model without covariate:

$$\phi_i \sim \text{i.i.d. } \pi, \qquad \text{for instance } \phi_i \sim \text{i.i.d. } N(\mu, \Gamma)$$

Fixed effects: $\mu$

Random effects: $(\phi_i - \mu)$

Model with covariate:

$$\phi_i = A_i \mu + B_i \eta_i$$

$A_i$, $B_i$: design matrices formed with covariates

$$\eta_i \sim \text{i.i.d. } N(0, \Gamma)$$

Random effects: $(\eta_i)$

Residual errors:

$$\varepsilon_{ij} \sim \text{i.i.d. } N(0, \sigma^2)$$

(Hyper)parameters of the model: $\theta = (\mu, \Gamma, \sigma^2)$.
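To make the hierarchy concrete, here is a small simulation sketch of the model without covariate. It assumes a diagonal $\Gamma$, a constant $b \equiv 1$, the one compartment function for $a$ with $\phi_i = (\log k_e, \log k_a, \log Cl)$, and hypothetical numerical values throughout.

```python
import math
import random

def a(t, phi, dose=320.0):
    """Regression function: one compartment model with log-parameters phi."""
    ke, ka, cl = (math.exp(p) for p in phi)
    return dose * ka * ke / (cl * (ka - ke)) * (math.exp(-ke * t) - math.exp(-ka * t))

def simulate(n, times, mu, omega, sigma, seed=0):
    """Draw phi_i ~ N(mu, diag(omega^2)), then y_ij = a(t_j, phi_i) + sigma * eps_ij."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        phi = [m + w * rng.gauss(0.0, 1.0) for m, w in zip(mu, omega)]
        y = [a(t, phi) + sigma * rng.gauss(0.0, 1.0) for t in times]
        data.append((phi, y))
    return data

# Hypothetical population values (fixed effects mu, standard deviations omega).
mu = [math.log(0.08), math.log(1.5), math.log(2.8)]
data = simulate(n=12, times=[0.25, 0.5, 1, 2, 4, 8, 12, 24],
                mu=mu, omega=[0.1, 0.3, 0.2], sigma=0.5)
```

Each element of `data` holds one subject's individual parameters and its noisy observations, which is the structure the estimation methods have to untangle from the observations alone.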


Objectives

Estimation

Estimate the full set of model parameters $\theta$

Compute a confidence interval

Model selection

Select the covariate model

Select the covariance model of the random effects

Hypothesis tests

Compare two treatments

Test whether an effect is fixed or random

Optimization of experimental designs

Determine the design that allows the best estimation of the model


Mixed models are booming ...

(Nonlinear) mixed-effects models are now used in a wide range of application fields:

pharmacology, oncology, neuroscience, agronomy, animal genetics, climatology, ecology ...

Number of articles indexed in Medline with the keyword "mixed effects model":

40 between 1980 and 1989

303 between 1990 and 1999

523 since 2000

Recent developments in statistics:

Incomplete data models

Model selection

Optimization of protocols


The MONOLIX working group

Multidisciplinary working group led by France Mentré (Inserm and Université Paris 7) and Marc Lavielle since October 2003, bringing together about twenty participants (Inserm, INRA, INA-PG, CEMAGREF, Ecole Vétérinaire, Universities P5, P11, P13, Lyon 1)

collaborations in statistics:

Models with ODE, SDE

Models with missing or censored data

REML estimation

PX-SAEM algorithm

Validity tests

applications in pharmacology, oncology, agronomy, animal genetics, ...

co-supervised theses:

Cristian Meza: M. Lavielle (P11) and J.L. Foulley (INRA)

Adeline Samson: M. Lavielle (P11) and F. Mentré (INSERM)

joint publications:

submitted or to appear in: Biostatistics, Statistics in Medicine, Genetics, Journal of Agricultural Biological and Environmental Statistics, Journal of Statistical Planning and Inference, Statistics and Computing, Computational Statistics and Data Analysis, Journal of PK/PD...

conferences:

SMAI 2005 (organization of a mini-symposium),

SFdS 2005 (organization of a special session),

PAGE (Population Approach Group in Europe) 2004 and 2005,

the Monolix software

an ANR project (non-thematic program, CSD 5)


Existing estimation methods

Some existing methods

1. Methods based on individual estimates

i) Estimate the individual parameters $(\phi_i)$,

ii) Estimate $\theta$ using $(\hat\phi_i)$.

$\Longrightarrow$ Requires a large number of observations per subject.

2. Methods based on approximations of the likelihood

First order methods (FO, Beal and Sheiner, 1982)

$$\phi_i = \mu + \eta_i$$

$$y_{ij} \approx a(t_{ij}, \mu) + \frac{\partial a}{\partial \phi}(t_{ij}, \mu)\,\eta_i + b(t_{ij}, \mu)\,\varepsilon_{ij}$$

i) NONMEM package (very popular in pharmacokinetics)

ii) SAS proc NLMIXED (using the method=firo option)

First order conditional methods (FOCE, Lindstrom and Bates, 1990)

$$y_{ij} \approx a(t_{ij}, \hat\phi_i) + \frac{\partial a}{\partial \phi}(t_{ij}, \hat\phi_i)\,(\phi_i - \hat\phi_i) + b(t_{ij}, \hat\phi_i)\,\varepsilon_{ij}$$

$\hat\phi_i$ maximizes the conditional distribution $p(\phi_i | y_i; \theta)$

i) NONMEM package (FOCE option)

ii) SAS proc NLMIXED (using the method=eblup option)

iii) Splus/R function NLME

3. Methods based on the exact likelihood

EM algorithm (Dempster et al., 1977)

MCEM algorithm (Walker, 1996)

SPML algorithm (Concordet and Nunez, 2002)

SAEM algorithm (Delyon, Lavielle and Moulines, 1999; Kuhn and Lavielle, 2003)

MC-PEM algorithm (Bauer and Guzy, 2004)


The incomplete data model

The complete likelihood $f$ of $(y, \phi)$ belongs to a parametric family $\{f(y, \phi; \theta),\ \theta \in \Theta\}$.

Objectives

Compute the value $\theta_{ML}$ that maximises the likelihood $g(y; \theta)$ of the observed data

Estimate the likelihood of the observations $g(y; \theta_{ML})$.

Estimate the Fisher information matrix $-\partial^2_\theta \log g(y; \theta_{ML})$.


The EM algorithm (Expectation-Maximization)

(Dempster, Laird and Rubin, 1977)

Complete-data model: $(y, \phi) \sim f(\cdot, \cdot\,; \theta)$; only $y$ is observed.

Iteration $k$ of the algorithm:

step E: evaluate the quantity

$$Q_k(\theta) = E[\log f(y, \phi; \theta) \,|\, y; \theta_{k-1}]$$

step M: update the estimation of $\theta$:

$$\theta_k = \mathrm{Argmax}_\theta\, Q_k(\theta)$$
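The two steps can be sketched on a deliberately simple linear Gaussian model where both are explicit. This toy model is an assumption made for illustration and is not one of the nonlinear models of the talk: $y_{ij} = \phi_i + \varepsilon_{ij}$ with $\phi_i \sim N(\mu, \omega^2)$, $\varepsilon_{ij} \sim N(0, \sigma^2)$, $\sigma^2$ known and $\theta = (\mu, \omega^2)$.

```python
import math
import random

def simulate(n, m, mu, omega2, sigma2, seed=1):
    """Draw y_ij = phi_i + eps_ij for n subjects with m observations each."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        phi = rng.gauss(mu, math.sqrt(omega2))
        data.append([phi + rng.gauss(0.0, math.sqrt(sigma2)) for _ in range(m)])
    return data

def em_step(y, mu, omega2, sigma2):
    """One EM iteration: conditional moments of phi_i (E), then closed-form update (M)."""
    means, variances = [], []
    for yi in y:
        v = 1.0 / (1.0 / omega2 + len(yi) / sigma2)          # Var(phi_i | y_i)
        means.append(v * (mu / omega2 + sum(yi) / sigma2))   # E(phi_i | y_i)
        variances.append(v)
    new_mu = sum(means) / len(y)
    new_omega2 = sum((m - new_mu) ** 2 + v for m, v in zip(means, variances)) / len(y)
    return new_mu, new_omega2

def log_lik(y, mu, omega2, sigma2):
    """Observed log-likelihood log g(y; theta), explicit for this Gaussian model."""
    total = 0.0
    for yi in y:
        n = len(yi)
        ybar = sum(yi) / n
        ssw = sum((v - ybar) ** 2 for v in yi)
        s = sigma2 + n * omega2
        total += -0.5 * (n * math.log(2 * math.pi) + (n - 1) * math.log(sigma2)
                         + math.log(s) + ssw / sigma2 + n * (ybar - mu) ** 2 / s)
    return total

y = simulate(n=50, m=5, mu=2.0, omega2=1.0, sigma2=1.0)
mu_k, omega2_k = 0.0, 5.0            # deliberately poor initial guess
history = []
for _ in range(30):
    history.append(log_lik(y, mu_k, omega2_k, 1.0))
    mu_k, omega2_k = em_step(y, mu_k, omega2_k, 1.0)
```

The list `history` records the observed log-likelihood along the iterations, which EM never decreases.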


Convergence of EM

Dempster et al. (1977), Wu (1983)

Convergence of $(\theta_k)$ to a stationary point $\hat\theta_g$ of the observed likelihood is ensured under some regularity conditions.

Some practical drawbacks of EM:

Nature of the limiting point.

Convergence depends on the initial guess.

Slow convergence of EM.

Evaluation of $Q_k(\theta) = E[\log f(y, \phi; \theta) \,|\, y; \theta_{k-1}]$ during step E.

$\Longrightarrow$ use a simulated sequence $\phi$


A Stochastic Approximation version of EM

The SAEM algorithm (Stochastic Approximation of EM)

Delyon, Lavielle and Moulines (1999)

Iteration $k$ of the algorithm:

Simulation: draw the non observed data $\phi_k$ with the conditional distribution $p_{\Phi|Y}(\cdot\,|y; \theta_{k-1})$.

Stochastic approximation:

$$Q_k(\theta) = Q_{k-1}(\theta) + \gamma_k \left[\log f(y, \phi_k; \theta) - Q_{k-1}(\theta)\right]$$

where $(\gamma_k)$ is a decreasing sequence such that $\sum \gamma_k = +\infty$ and $\sum \gamma_k^2 < +\infty$.

Maximization: $\theta_k = \mathrm{Argmax}_\theta\, Q_k(\theta)$
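A minimal SAEM sketch, under the assumption of a toy linear Gaussian model in which the conditional distribution can be simulated exactly: $y_{ij} = \phi_i + \varepsilon_{ij}$, $\phi_i \sim N(\mu, \omega^2)$, $\varepsilon_{ij} \sim N(0, \sigma^2)$ with $\sigma^2$ known. For this exponential model, updating $Q_k$ amounts to updating the sufficient statistics $\sum_i \phi_i$ and $\sum_i \phi_i^2$. The step-size schedule ($\gamma_k = 1$ during a burn-in phase, then $1/(k - \text{burn-in})$) is one possible choice, not a prescription.

```python
import math
import random

def simulate(n, m, mu, omega2, sigma2, seed=1):
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        phi = rng.gauss(mu, math.sqrt(omega2))
        data.append([phi + rng.gauss(0.0, math.sqrt(sigma2)) for _ in range(m)])
    return data

def saem(y, sigma2, n_iter=300, burn_in=100, seed=0):
    rng = random.Random(seed)
    n = len(y)
    mu, omega2 = 0.0, 5.0        # initial guess theta_0
    s1 = s2 = 0.0                # approximations of sum(phi_i) and sum(phi_i^2)
    for k in range(1, n_iter + 1):
        gamma = 1.0 if k <= burn_in else 1.0 / (k - burn_in)
        # Simulation: draw phi_i from p(phi_i | y_i; theta_{k-1}), Gaussian here.
        draw1 = draw2 = 0.0
        for yi in y:
            v = 1.0 / (1.0 / omega2 + len(yi) / sigma2)
            mean = v * (mu / omega2 + sum(yi) / sigma2)
            phi = rng.gauss(mean, math.sqrt(v))
            draw1 += phi
            draw2 += phi * phi
        # Stochastic approximation of the sufficient statistics.
        s1 += gamma * (draw1 - s1)
        s2 += gamma * (draw2 - s2)
        # Maximization: closed form for this Gaussian model.
        mu = s1 / n
        omega2 = s2 / n - mu * mu
    return mu, omega2

m, sigma2 = 5, 1.0
y = simulate(n=50, m=m, mu=2.0, omega2=1.0, sigma2=sigma2)
mu_hat, omega2_hat = saem(y, sigma2)

# Closed-form maximum likelihood estimate, available for this balanced toy design.
ybar = [sum(yi) / len(yi) for yi in y]
mle_mu = sum(ybar) / len(ybar)
mle_omega2 = sum((b - mle_mu) ** 2 for b in ybar) / len(ybar) - sigma2 / m
```

In this toy setting the SAEM output can be compared directly with the closed-form estimate; in the nonlinear models of interest no such closed form exists.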



Convergence of SAEM

Delyon, Lavielle and Moulines (1999)

For exponential models, the sequence $(\theta_k)$ converges almost surely toward a stationary point $\hat\theta_g$ of the observed likelihood $g$ under very general conditions.

Some weak additional hypotheses ensure the convergence to a (local) maximum of the likelihood.

Practical drawbacks:

Convergence depends on the initial guess.

$\Longrightarrow$ use a simulated annealing version of SAEM

Exact simulation of $\phi$ with the conditional distribution is not always possible.

$\Longrightarrow$ use an MCMC method.


A Simulated Annealing version of SAEM

Conditional distribution of $\phi$:

$$p_{\Phi|Y}(\phi\,|y; \theta) = C(y; \theta)\, e^{-U(\phi, y; \theta)}$$

Temperature parameter $T$:

$$p^{(T)}_{\Phi|Y}(\phi\,|y; \theta) = C_T(y; \theta)\, e^{-U(\phi, y; \theta)/T}$$

Choose a decreasing temperature sequence $(T_k)$ that converges to 1.

Then, at iteration $k$ of SAEM,

Simulation: draw the non observed data $\phi_k$ with the conditional distribution $p^{(T_k)}_{\Phi|Y}(\cdot\,|y; \theta_{k-1})$

Stochastic approximation

Maximization


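Coupling SAEM and MCMC

(Kuhn and Lavielle)

Let $\Pi_\theta$ be the transition probability of an ergodic Markov Chain with limiting distribution $p_{\Phi|Y}(\cdot\,|y; \theta)$.

Iteration $k$ of the algorithm:

step S: draw $\phi_k$ according to the transition probability $\Pi_{\theta_{k-1}}(\phi_{k-1}, \cdot)$.

Stochastic approximation:

$$Q_k(\theta) = Q_{k-1}(\theta) + \gamma_k \left[\log f(y, \phi_k; \theta) - Q_{k-1}(\theta)\right]$$

Maximization: $\theta_k = \mathrm{Argmax}_\theta\, Q_k(\theta)$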
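One possible choice for the transition probability $\Pi_\theta$ is a random-walk Metropolis kernel. The sketch below is an illustration under assumptions: a single subject, a hypothetical nonlinear model $y_j = e^{-\phi t_j} + \varepsilon_j$ with $\phi \sim N(\mu, \omega^2)$, and hypothetical numerical values. Within SAEM, step S would apply this kernel once (or a few times) starting from $\phi_{k-1}$.

```python
import math
import random

def log_target(phi, y, times, mu, omega2, sigma2):
    """log p(phi | y; theta) up to an additive constant."""
    log_prior = -0.5 * (phi - mu) ** 2 / omega2
    log_lik = -0.5 * sum((yj - math.exp(-phi * tj)) ** 2
                         for yj, tj in zip(y, times)) / sigma2
    return log_prior + log_lik

def mh_kernel(phi, y, times, theta, rng, n_steps=1, step=0.1):
    """Apply n_steps of a random-walk Metropolis kernel started at phi."""
    mu, omega2, sigma2 = theta
    current = log_target(phi, y, times, mu, omega2, sigma2)
    for _ in range(n_steps):
        proposal = phi + step * rng.gauss(0.0, 1.0)
        candidate = log_target(proposal, y, times, mu, omega2, sigma2)
        # Accept with probability min(1, target(proposal) / target(phi)).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            phi, current = proposal, candidate
    return phi

rng = random.Random(0)
times = [0.5, 1.0, 2.0, 4.0]
y = [math.exp(-0.5 * t) + 0.05 * rng.gauss(0.0, 1.0) for t in times]  # true phi = 0.5
theta = (0.5, 0.04, 0.0025)          # hypothetical (mu, omega^2, sigma^2)

phi, chain = 0.0, []                  # start away from the bulk of the target
for _ in range(3000):
    phi = mh_kernel(phi, y, times, theta, rng)
    chain.append(phi)
posterior_mean = sum(chain[500:]) / len(chain[500:])
```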



Convergence of the algorithm

C1 The chain $(\phi_k)_{k \ge 0}$ takes its values in a compact subset $E$ of $\mathbb{R}^l$.

C2 For any compact subset $V$ of $\Theta$, there exists a real constant $L$ such that for any $(\theta, \theta')$ in $V^2$

$$\sup_{(x,y) \in E^2} |\Pi_\theta(x, y) - \Pi_{\theta'}(x, y)| \le L\,|\theta - \theta'|.$$

C3 The transition probability $\Pi_\theta$ generates a uniformly ergodic chain whose invariant probability is $p(\cdot\,|y; \theta)$: there exist $K_\theta \in \mathbb{R}^+$ and $\rho_\theta \in\, ]0, 1[$ such that

$$\forall \phi \in E,\ \forall k \in \mathbb{N}, \quad \|\Pi^k_\theta(\phi, \cdot) - p(\cdot\,|y; \theta)\|_{TV} \le K_\theta\, \rho_\theta^k,$$

with $K = \sup_\theta K_\theta < +\infty$ and $\rho = \sup_\theta \rho_\theta < 1$.

Theorem

- Assume that the regularity conditions required for the convergence of EM are satisfied

- Assume that assumptions C1-C3 hold

- Assume that for any $\theta \in \Theta$, the sequence $(Q_k(\theta))_{k \ge 0}$ takes its values in a compact subset of S.

Then, w.p. 1, $\lim_{k \to +\infty} d(\theta_k, \mathcal{L}) = 0$ where $d(x, A)$ denotes the distance of $x$ to the closed subset $A$ and $\mathcal{L} = \{\theta \in \Theta,\ \partial_\theta g(y; \theta) = 0\}$ is the set of stationary points of $g$.

(Some weak additional hypotheses ensure the convergence to a (local) maximum of the likelihood)


Estimation of the Fisher Information matrix

An estimate of the asymptotic covariance matrix of $\hat\theta_g$ is the inverse of the observed Fisher Information matrix:

$$-\partial^2_\theta \log g(y; \hat\theta_g)$$

Louis's missing information principle (1982)

$$\partial^2_\theta \log g(y; \theta) = E_{y;\theta}\left[\partial^2_\theta \log f(y, \phi; \theta)\right] + \mathrm{Cov}_{y;\theta}\left[\partial_\theta \log f(y, \phi; \theta)\right]$$

where

$$\mathrm{Cov}_{y;\theta}\left[\partial_\theta \log f(y, \phi; \theta)\right] = E_{y;\theta}\left[\left(\partial_\theta \log f(y, \phi; \theta)\right)\left(\partial_\theta \log f(y, \phi; \theta)\right)'\right] - E_{y;\theta}\left[\partial_\theta \log f(y, \phi; \theta)\right] E_{y;\theta}\left[\partial_\theta \log f(y, \phi; \theta)\right]'$$

and

$$\partial_\theta \log g(y; \theta) = E_{y;\theta}\left[\partial_\theta \log f(y, \phi; \theta)\right]$$

Stochastic approximation:

$$\Delta_k = \Delta_{k-1} + \gamma_k \left[\partial_\theta \log f(y, \phi_k; \theta_k) - \Delta_{k-1}\right]$$

$$D_k = D_{k-1} + \gamma_k \left[\partial^2_\theta \log f(y, \phi_k; \theta_k) - D_{k-1}\right]$$

$$G_k = G_{k-1} + \gamma_k \left[\partial_\theta \log f(y, \phi_k; \theta_k)\, \partial_\theta \log f(y, \phi_k; \theta_k)^t - G_{k-1}\right]$$

$$H_k = D_k + G_k - \Delta_k \Delta_k^t$$

Under some regularity conditions, the sequence $(-H_k)$ converges almost surely to the observed Fisher Information matrix $-\partial^2_\theta \log g(y; \hat\theta_g)$.
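Louis's identity can be checked numerically on a toy case where everything is explicit. The sketch assumes one observation per subject, $y_i = \phi_i + \varepsilon_i$ with $\phi_i \sim N(\mu, \omega^2)$ and $\varepsilon_i \sim N(0, \sigma^2)$, a single parameter $\theta = \mu$ held fixed, and $\gamma_k = 1/k$ (plain averaging). With these assumptions $H = D + G - \Delta^2$ approximates $\partial^2_\mu \log g(y; \mu) = -n/(\omega^2 + \sigma^2)$, so $-H$ approximates the observed information.

```python
import math
import random

def louis_hessian(y, mu, omega2, sigma2, n_draws=20000, seed=0):
    """Stochastic approximation of d2/dmu2 log g(y; mu) via Louis's formula."""
    rng = random.Random(seed)
    v = omega2 * sigma2 / (omega2 + sigma2)          # Var(phi_i | y_i)
    cond_means = [(sigma2 * mu + omega2 * yi) / (omega2 + sigma2) for yi in y]
    delta = d = g = 0.0
    for k in range(1, n_draws + 1):
        gamma = 1.0 / k
        phi = [rng.gauss(m, math.sqrt(v)) for m in cond_means]
        score = sum(p - mu for p in phi) / omega2     # d/dmu log f(y, phi; mu)
        hess = -len(y) / omega2                       # d2/dmu2 log f(y, phi; mu)
        delta += gamma * (score - delta)
        d += gamma * (hess - d)
        g += gamma * (score * score - g)
    return d + g - delta * delta

# Hypothetical data and parameter values.
y = [0.3, 1.2, -0.4, 2.1, 0.8, 1.5, -1.0, 0.2, 0.9, 1.1]
h = louis_hessian(y, mu=0.5, omega2=1.0, sigma2=1.0)
exact = -len(y) / (1.0 + 1.0)     # closed-form second derivative of log g
```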


Estimation of the likelihood

Importance sampling:

$$g(y) = \int f(y, \phi)\, d\phi = \int h(y|\phi)\, \pi(\phi)\, d\phi = \int \left(h(y|\phi)\, \frac{\pi(\phi)}{\tilde\pi(\phi)}\right) \tilde\pi(\phi)\, d\phi$$

1) Draw $\phi^{(1)}, \phi^{(2)}, \ldots, \phi^{(N)}$ with the distribution $\tilde\pi$,

2) Let

$$\hat g_N(y) = \frac{1}{N} \sum_{j=1}^{N} h(y|\phi^{(j)})\, \frac{\pi(\phi^{(j)})}{\tilde\pi(\phi^{(j)})}$$

$$E\, \hat g_N(y) = g(y)$$

$$\mathrm{Var}\, \hat g_N(y) = O(1/N)$$

$$\mathrm{Var}\, \hat g_N(y) = 0 \iff \tilde\pi(\phi) = p(\phi|y)$$

Then,

1) Estimate the conditional mean and the conditional variance of $\phi$ (using MCMC)

2) Use for $\tilde\pi$ a Gaussian distribution (or a mixture Gaussian/Cauchy) with these parameters
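The zero-variance property can be seen on a toy example where $g(y)$ is known. The sketch assumes a scalar model $\phi \sim N(\mu, \omega^2)$, $y|\phi \sim N(\phi, \sigma^2)$, so that $g(y)$ is the $N(\mu, \omega^2 + \sigma^2)$ density and $p(\phi|y)$ is Gaussian with explicit mean and variance. All numerical values are hypothetical.

```python
import math
import random

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def is_likelihood(y, mu, omega2, sigma2, prop_mean, prop_var, n_draws, seed=0):
    """Importance sampling estimate of g(y) with a Gaussian proposal."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        phi = rng.gauss(prop_mean, math.sqrt(prop_var))
        weight = (normal_pdf(y, phi, sigma2) * normal_pdf(phi, mu, omega2)
                  / normal_pdf(phi, prop_mean, prop_var))
        total += weight
    return total / n_draws

y, mu, omega2, sigma2 = 1.0, 0.0, 1.0, 0.25
exact = normal_pdf(y, mu, omega2 + sigma2)

# Proposal equal to the prior pi: unbiased, with variance of order 1/N.
g_prior = is_likelihood(y, mu, omega2, sigma2, mu, omega2, n_draws=20000)

# Proposal equal to the conditional distribution p(phi | y): every weight equals g(y).
post_var = omega2 * sigma2 / (omega2 + sigma2)
post_mean = (sigma2 * mu + omega2 * y) / (omega2 + sigma2)
g_post = is_likelihood(y, mu, omega2, sigma2, post_mean, post_var, n_draws=10)
```

With the conditional distribution as proposal, ten draws already reproduce $g(y)$ up to rounding error, which is why a Gaussian proposal matched to the conditional mean and variance is a natural choice.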


The MONOLIX software

Twelve subjects were given oral doses of the anti-asthmatic drug theophylline, then serum concentrations (in mg/L) were measured at 11 time points over the next 25 hours.

Subject $i$ receives an initial dose $D_i$ at time 0 and serum concentrations $(y_{ij})$ are measured at times $(t_{ij})$. Serum concentration is modeled by a first-order one compartment model. Then,

$$y_{ij} = \frac{D_i\, ka_i\, ke_i}{Cl_i\,(ka_i - ke_i)} \left(e^{-ke_i t_{ij}} - e^{-ka_i t_{ij}}\right) + \varepsilon_{ij}$$

where

$ke_i$ is the elimination rate of subject $i$,

$ka_i$ is the absorption rate of subject $i$,

$Cl_i$ is the clearance of subject $i$.

For subject $i$,

the vector of regression (or design) variables is $x_{ij} = (D_i, t_{ij})$,

the vector of individual parameters is $\phi_i = (\log(ke_i), \log(ka_i), \log(Cl_i))$,

the only available covariate is the weight $w_i$.


Model defined by a system of ordinary differential equations

Model defined by a system of ODEs

(Sophie DONNET and Adeline SAMSON)

Example of a pharmacokinetic model (Michaelis-Menten model) with one compartment and first-order absorption:

$$\frac{dC}{dt}(t, \phi) = \frac{k_a \cdot Dose}{V}\, e^{-k_a t} - \frac{V_m \cdot C(t, \phi)}{k_m + C(t, \phi)}$$

$C(t, \phi)$ (g/L): concentration of the drug

$Dose$: known, fixed quantity

$\phi = (k_m, V_m, k_a, V)$

$k_m$ (g/L): elimination constant

$V_m$ (g/L/h): maximum elimination rate

$k_a$ (h$^{-1}$): absorption constant of the molecule

$V$ (L): total volume of distribution of the molecule in the body

ODE without analytic solution
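Since this ODE has no analytic solution, the regression function has to be evaluated numerically. A minimal sketch with a classical Runge-Kutta scheme follows; the dose value is hypothetical, the function names are illustrative, and the other parameter values are only example inputs. Setting $V_m = 0$ removes the elimination term and leaves a case with a known solution, $C(t) = \frac{Dose}{V}(1 - e^{-k_a t})$, which serves as a sanity check.

```python
import math

def rk4(f, t0, y0, t_end, n_steps):
    """Classical fourth-order Runge-Kutta; returns the states on the time grid."""
    h = (t_end - t0) / n_steps
    t, y, path = t0, y0, [y0]
    for _ in range(n_steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
        path.append(y)
    return path

def mm_rhs(dose, km, vm, ka, v):
    """Right-hand side of the Michaelis-Menten ODE with first-order absorption."""
    def rhs(t, c):
        return ka * dose / v * math.exp(-ka * t) - vm * c / (km + c)
    return rhs

dose = 10.0   # hypothetical dose
path = rk4(mm_rhs(dose, km=0.37, vm=0.082, ka=2.72, v=12.2), 0.0, 0.0, 12.0, 1200)
no_elim = rk4(mm_rhs(dose, km=0.37, vm=0.0, ka=2.72, v=12.2), 0.0, 0.0, 12.0, 1200)
no_elim_exact = dose / 12.2 * (1.0 - math.exp(-2.72 * 12.0))
```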


Mixed model with a differential equation

Nonlinear mixed-effects statistical model

$$y_{ij} = f(t_{ij}, \phi_i) + \varepsilon_{ij}, \qquad 1 \le j \le n_i,\quad 1 \le i \le N$$

$$\varepsilon_{ij} \sim \text{i.i.d. } N(0, \sigma^2)$$

$$\phi_i \sim \text{i.i.d. } \pi_\phi(\cdot\,; \beta)$$

Regression function $f$:

$$\frac{\partial f(t, \phi)}{\partial t} = F(f(t, \phi), t, \phi)$$

$$f(t_0, \phi) = f_0$$


Numerical scheme

An MCMC algorithm is used in all the estimation methods:

$\Rightarrow$ the quantity $f(\phi, t)$ has to be computed at each iteration of the MCMC algorithm

Introduction of a numerical method for solving the ODE

Runge-Kutta,

Local Linearization,

...

Estimation in an approximate model (a model where the regression function is approximated numerically)


Use of an approximate model

Model $M_h$

$$y_{ij} = f_h(t_{ij}, \phi_i) + \varepsilon_{ij}, \qquad 1 \le j \le n_i,\quad 1 \le i \le N$$

$$\varepsilon_{ij} \sim \text{i.i.d. } N(0, \sigma^2)$$

$$\phi_i \sim \text{i.i.d. } \pi_\phi(\cdot\,; \beta)$$

$f_h$ is the evaluation of the solution of the ODE by a numerical method of order $p$ and step size $h$:

$$\sup_t |f_h(t, \phi) - f(t, \phi)| \le C h^p$$

The parameters of model $M_h$ are estimated.

What control do we have over the estimation of the parameters of the original model?
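The meaning of the order $p$ in the bound $C h^p$ can be observed empirically. The sketch below uses a test equation chosen for illustration, $f' = -f$ with $f(0) = 1$, whose exact solution $e^{-t}$ is known: halving $h$ should divide the error by about $2$ for the Euler scheme ($p = 1$) and by about $16$ for the classical Runge-Kutta scheme ($p = 4$).

```python
import math

def euler_step(f, t, y, h):
    return y + h * f(t, y)

def rk4_step(f, t, y, h):
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def solve(step, f, t0, y0, t_end, h):
    """Apply a one-step method with step size h from t0 to t_end."""
    n = round((t_end - t0) / h)
    t, y = t0, y0
    for _ in range(n):
        y = step(f, t, y, h)
        t += h
    return y

def f(t, y):
    return -y

exact = math.exp(-1.0)
errors = {
    name: [abs(solve(step, f, 0.0, 1.0, 1.0, h) - exact) for h in (0.1, 0.05)]
    for name, step in (("euler", euler_step), ("rk4", rk4_step))
}
# Error ratio when h is halved: about 2**p for a method of order p.
ratios = {name: e[0] / e[1] for name, e in errors.items()}
```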


Some properties of the algorithms

Properties of the MCMC algorithm: under some assumptions,

the MCMC algorithm converges in model $M_h$,

there exists a constant $C$ such that for all $\theta \in \Theta$ and for small $h$,

$$D\left(p_{\phi|y}(\cdot\,; \theta),\ p_{h,\phi|y}(\cdot\,; \theta)\right) \le C h^p$$

Properties of the SAEM algorithm: under some assumptions,

the sequence $(\theta_k)$ of estimators produced by the SAEM algorithm on model $M_h$ converges almost surely to a point $\theta_\infty$, a (local) maximum of the likelihood function $g_h$

there exists a constant $C$ such that

$$\sup_{\theta \in \Theta} |p(y; \theta) - p_h(y; \theta)| \le C h^p$$


A local linearization scheme within an MCMC algorithm

Goal: reduce the computation time for the random walk kernel in the MCMC algorithm

Principle

Evaluation of $f(t, \phi)$ for $\phi$ in a neighborhood of $\phi_0$, with $f(t, \phi_0)$ already evaluated by LL (Local Linearization)

Taylor expansion of the ODE at $(t, \phi_0)$

No computation of matrix exponentials

about 7 times faster than LL

Convergence

Under some assumptions and for $\phi$ close to $\phi_0$, let $f_{h,\phi_0}$ denote the solution obtained by this procedure

There exist $C_1$ and $C_2$ independent of $\phi$ such that

$$\sup_{t \in [t_0, T]} |f(t, \phi) - f_{h,\phi_0}(t, \phi)| \le \max\left(C_1 h^2,\ C_2 \|\phi - \phi_0\|^2_{\mathbb{R}^k}\right)$$


Model defined by ordinary differential equations

Michaelis Menten PK model with one compartment with first order absorption:

$$\frac{dC}{dt}(t, \phi) = \frac{k_a \cdot Dose}{V}\, e^{-k_a t} - \frac{V_m \cdot C(t, \phi)}{k_m + C(t, \phi)}$$

Here, $\phi = (k_m, V_m, k_a, V)$

ODE without analytic solution


Example on simulated data

Pharmacokinetics example with $N = 20$ independent patients, $\pi_\phi(\cdot\,; \beta) = N(\mu, \Omega)$

[Figure: simulated individual concentration curves; horizontal axis: Time (hours), from 0 to 12; vertical axis: Concentration (g/L), from 0 to 1.8.]


Estimation results

Comparison of SAEM with NONMEM

NONMEM does not converge

SAEM estimates the parameters correctly

               k_m    V_m     k_a    V      var k_m   var V_m   var k_a   var V   σ²
initial point  0.50   0.100   5.00   5.0    0.400     0.400     0.400     0.400   0.1000
true value     0.37   0.082   2.72   12.2   0.040     0.040     0.040     0.040   0.0100
SAEM           0.47   0.088   2.58   12.3   0.043     0.038     0.039     0.043   0.0084
NONMEM         0.60   0.100   2.57   12.3   10^-8     0.062     0.068     0.036   0.0088


Model defined by a system of stochastic differential equations

Extension to stochastic differential equations

Model $M$

$$y_{ij} = Z(t_{ij}, \phi_i) + \varepsilon_{ij}, \qquad 1 \le i \le N,\quad 1 \le j \le n_i$$

$$\varepsilon_{ij} \sim \text{i.i.d. } N(0, \sigma^2)$$

$$\phi_i \sim \text{i.i.d. } \pi(\cdot\,; \beta)$$

The regression function $Z$ of the model is the solution of

$$dZ(t, \phi) = F(Z(t, \phi), t, \phi)\, dt + \gamma\, dW_t, \qquad t \in [t_0, T]$$

$$Z(t_0, \phi) = Z_0(\phi)$$

where $W_t$ is a Brownian motion

Complete data of the model: $(y, Z, \phi)$
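A path of such an SDE can be simulated with the Euler-Maruyama scheme, used here as a generic illustration rather than as the method of the talk. The drift chosen below is the one compartment drift $F(Z, t, \phi) = \frac{Dose \cdot k_a}{V} e^{-k_a t} - \frac{Cl}{V} Z$, and all numerical values, including $\gamma$, are hypothetical. With $\gamma = 0$ the scheme reduces to the explicit Euler scheme for the ODE, which gives a simple check against the closed-form solution.

```python
import math
import random

def euler_maruyama(drift, z0, t0, t_end, n_steps, gamma, rng):
    """Simulate dZ = drift(t, Z) dt + gamma dW on a regular grid."""
    h = (t_end - t0) / n_steps
    t, z, path = t0, z0, [z0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(h))     # Brownian increment over one step
        z += drift(t, z) * h + gamma * dw
        t += h
        path.append(z)
    return path

dose, ka, cl, v = 320.0, 1.5, 2.8, 35.0       # hypothetical values

def drift(t, z):
    return dose * ka / v * math.exp(-ka * t) - cl / v * z

ode_path = euler_maruyama(drift, 0.0, 0.0, 24.0, 24000, 0.0, random.Random(0))
sde_path = euler_maruyama(drift, 0.0, 0.0, 24.0, 24000, 0.3, random.Random(0))

# Closed-form solution of the ODE at t = 24, with ke = Cl / V.
ke = cl / v
closed_form = dose * ka / (v * (ka - ke)) * (math.exp(-ke * 24.0) - math.exp(-ka * 24.0))
```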


Hastings-Metropolis algorithm

H-M algorithm, at iteration $l$

1. Simulation of $\phi^{l}$ under $p(\cdot\,|Z^{(l-1)}, y)$

2. Simulation of $Z^{l}$ under $p(\cdot\,|\phi^{l}, y)$

$p(Z|\phi)$ has to be computed in each acceptance probability

Linear SDE: the distribution of the path is known

Nonlinear SDE: estimation by Monte Carlo or Importance Sampling (Pedersen 1995)

$\Rightarrow$ Brownian bridge ??


Application to the theophylline data

Model defined by an ODE:

$$dZ(t, \phi) = \left(\frac{Dose \cdot k_a}{V}\, e^{-k_a t} - \frac{Cl}{V}\, Z(t, \phi)\right) dt$$

$$y_{ij} = Z(t_{ij}, \phi_i) + \varepsilon_{ij} = \frac{Dose_i\, ka_i\, ke_i}{Cl_i\,(ka_i - ke_i)} \left(e^{-ke_i t_{ij}} - e^{-ka_i t_{ij}}\right) + \varepsilon_{ij}$$

Model defined by an SDE:

$$dZ(t, \phi) = \left(\frac{Dose \cdot k_a}{V}\, e^{-k_a t} - \frac{Cl}{V}\, Z(t, \phi)\right) dt + \gamma\, dW_t$$




[Figures: individual plots for the 12 theophylline subjects (panels 1 to 12), followed by further individual panels; horizontal axis: Time (hr), vertical axis: Concentration (mg/L).]






Conclusion

(Nonlinear) mixed-effects models are growing rapidly and are very useful in many application fields.

New statistical and algorithmic tools make it possible to tackle and handle these models.

MCMC methods are very efficient algorithmic tools, not only in a Bayesian framework but more generally in statistical inference.

The possibility of considering models defined by ODEs and SDEs is very encouraging.


References

MONOLIX, the Matlab software and the User's guide, http://www.math.u-psud.fr/~lavielle/monolix/logiciels.html

Kuhn E., Lavielle M., "Maximum likelihood estimation in nonlinear mixed effects models", Computational Statistics and Data Analysis (to appear), 2005.

Kuhn E., Lavielle M., "Coupling a stochastic approximation version of EM with a MCMC procedure", ESAIM P&S, vol. 8, pp 115–131, 2004.

Delyon B., Lavielle M., Moulines E., "Convergence of a stochastic approximation version of the EM algorithm", The Annals of Stat., vol. 27, no. 1, pp 94–128, 1999.

Pinheiro J. C., Bates D. M., "Mixed-Effects Models in S and S-PLUS", Springer, 2000.

Davidian M., "Nonlinear Models for Repeated Measurement Data", Chapman and Hall, 1995.
