Download the full text - ISPED-Enseignement à distance

Estimation of linear mixed models with a mixture of distribution for the random effects

2.2. Likelihood

Following previous works [3,4], we define $w_{ig}$ as the unobserved variable indicating whether subject $i$ belongs to component $g$, with $P(w_{ig} = 1) = \pi_g$. The density for the vector $y_i$ can then be written as:

$$f_i(y_i) = \sum_{g=1}^{G} \pi_g \, f(y_i \mid w_{ig} = 1) \qquad (6)$$

Given $w_{ig}$, $y_i$ follows a linear mixed model, and the density $f(y_i \mid w_{ig} = 1)$, denoted $\phi_{ig}$, is the multivariate Gaussian density with mean $E_{ig}$ and covariance matrix $V_i$ given by:

$$E_{ig} = E(Y_i \mid w_{ig} = 1) = X_{1i}\beta + X_{2i}\delta_g + Z_i\mu_g \quad \text{and} \quad V_i = \mathrm{var}(Y_i \mid w_{ig} = 1) = Z_i D Z_i' + \sigma^2 I_{n_i} \qquad (7)$$

Let $\theta$ now be the vector of the $m$ parameters of the model. $\theta$ contains $\phi$, with $\phi' = (\beta', (\delta_g)'_{g=1,G}, (\mu_g)'_{g=1,G}, \mathrm{Vec}(D)', \sigma^2)$, and the vector of the first $G-1$ component probabilities $(\pi_g)_{g=1,G-1}$. Note that $\pi_G$ is entirely determined by them, as $\pi_G = 1 - \sum_{g=1}^{G-1} \pi_g$. $\mathrm{Vec}(D)$ represents the vector of the upper triangular elements of $D$. The estimates of $\theta$ are obtained as the vector $\hat{\theta}$ that maximizes the observed log-likelihood:

$$L(Y;\theta) = \sum_{i=1}^{N} \ln(f_i(y_i)) = \sum_{i=1}^{N} \ln\left(\sum_{g=1}^{G} \pi_g \, \phi_{ig}(y_i)\right)$$

$$= \sum_{i=1}^{N}\left[ -\frac{n_i}{2}\ln(2\pi) - \frac{1}{2}\ln|V_i| + \ln\left(\sum_{g=1}^{G} \pi_g \, e^{-\frac{1}{2}(Y_i - E_{ig})' V_i^{-1} (Y_i - E_{ig})}\right) \right] \qquad (8)$$

2.3. Estimation procedure

We propose to directly maximize the observed log-likelihood (8) using a modified Marquardt optimization algorithm [9], a Newton–Raphson-like algorithm [10]. The diagonal of the Hessian at iteration $k$, $H^{(k)}$, is inflated to obtain a positive definite matrix $H^{*(k)} = (H^{*(k)}_{ij})$, with $H^{*(k)}_{ii} = H^{(k)}_{ii} + \lambda\left[(1-\eta)\,|H^{(k)}_{ii}| + \eta\,\mathrm{tr}(H^{(k)})\right]$ and $H^{*(k)}_{ij} = H^{(k)}_{ij}$ if $i \neq j$. Initial values for $\lambda$ and $\eta$ are $\lambda = 0.01$ and $\eta = 0.01$; they are reduced when $H^*$ is positive definite and increased if not.
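As a concrete illustration, the observed log-likelihood (8) that the algorithm maximizes can be computed naively by summing the component densities of each subject, as in (6). This is only a sketch, not the authors' implementation: the function name `mixture_loglik` and the data layout are our assumptions, and the Gaussian densities are evaluated with SciPy rather than through the expanded form of (8).

```python
import numpy as np
from scipy.stats import multivariate_normal

def mixture_loglik(y, E, V, pi):
    """Observed log-likelihood (8) for a hypothetical dataset.

    y  : list of N response vectors y_i (length n_i each)
    E  : list of N lists, E[i][g] = E_ig (mean of y_i in component g)
    V  : list of N covariance matrices V_i
    pi : array of G component probabilities pi_g
    """
    L = 0.0
    for yi, Ei, Vi in zip(y, E, V):
        # f_i(y_i) = sum_g pi_g * phi_ig(y_i), eq. (6)
        dens = sum(p * multivariate_normal.pdf(yi, mean=Eg, cov=Vi)
                   for p, Eg in zip(pi, Ei))
        L += np.log(dens)
    return L
```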
The estimates $\theta^{(k)}$ are then updated to $\theta^{(k+1)}$ using the current modified Hessian $H^{*(k)}$ and the current gradient of the parameters $g(\theta^{(k)})$ according to the formula:

$$\theta^{(k+1)} = \theta^{(k)} - \alpha\, H^{*(k)-1}\, g(\theta^{(k)}) \qquad (9)$$

where, if necessary, $\alpha$, which equals 1 by default, is modified to ensure that the log-likelihood improves at each iteration.

To ensure that the covariance matrix $D$ is positive, we maximize the log-likelihood over the nonzero elements of $U$, the Cholesky factor of $D$ (i.e. $U'U = D$) [7]. Furthermore, to deal with the constraints on $\pi$ (4), we use the transformed parameters $(\tau_g)_{g=1,G-1}$ with:

$$\tau_g = \ln\left(\frac{\pi_g}{\pi_G}\right) \qquad (10)$$

Standard errors of the elements of $D$ and $(\pi_g)_{g=1,G-1}$ are computed by the $\Delta$-method [11], while standard errors of the other parameters are directly computed using the inverse of the observed Hessian matrix.

Convergence is reached when the three following criteria are satisfied: $\sum_{j=1}^{m}(\theta_j^{(k)} - \theta_j^{(k-1)})^2 \leq \epsilon_a$, $|L^{(k)} - L^{(k-1)}| \leq \epsilon_b$ and $g(\theta^{(k)})'\, H^{(k)-1}\, g(\theta^{(k)}) \leq \epsilon_d$. The default values are $\epsilon_a = 10^{-5}$, $\epsilon_b = 10^{-5}$ and $\epsilon_d = 10^{-8}$.

As the log-likelihood of a mixture model may have several maxima [8], we use a grid of initial values to find the global maximum. The multimodality of the log-likelihood in mixture models has often been discussed, and some authors have proposed different strategies to choose the set of initial values [12]; however, none of them seems optimal in general. In our experience, the results were mainly sensitive to the initial values of $(\pi_g)_{g=1,G-1}$ and $(\mu_g)_{g=1,G}$, and less sensitive to the other parameters ($\mathrm{Vec}(U)$, $\beta$ and $\sigma$), for which estimates from the homogeneous mixed model were good initial values.

A mixture model is estimated with a fixed number of components $G$; otherwise the number of parameters in the model is unknown.
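The transformation (10) and its inverse can be sketched as follows. This is a minimal illustration under our own naming conventions; the inverse map is simply a softmax with $\tau_G$ fixed at 0, so that the recovered probabilities are positive and sum to one.

```python
import numpy as np

def pi_to_tau(pi):
    """tau_g = ln(pi_g / pi_G) for g = 1..G-1, eq. (10)."""
    return np.log(pi[:-1] / pi[-1])

def tau_to_pi(tau):
    """Inverse map: softmax with the last (reference) component fixed at tau_G = 0."""
    e = np.exp(np.append(tau, 0.0))
    return e / e.sum()
```

Optimizing over the unconstrained $\tau_g$ rather than the $\pi_g$ themselves lets the Newton-type updates (9) move freely without leaving the probability simplex.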
To choose the right number of components, one has to estimate models with different values of $G$ and select the best model according to a test or a criterion. Some works favor a bootstrap approach to approximate the asymptotic distribution of the likelihood ratio test between models with different numbers of components [13], but this approach is computationally very heavy, in particular for mixture models with random effects. Criteria such as Akaike's Information Criterion (AIC) [14] or the Bayesian Information Criterion (BIC) [15] are therefore often preferred. We use these selection criteria to select the optimal number of components.
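Given the maximized log-likelihood and the number of free parameters of each fitted model, the criteria can be compared as follows. This is a sketch under our own conventions: whether $N$ counts subjects or observations in the BIC penalty is an assumption here, and `select_G` is a hypothetical helper, not part of the paper's program.

```python
import numpy as np

def aic(loglik, m):
    """AIC = -2 log L + 2 m, with m the number of free parameters."""
    return -2.0 * loglik + 2.0 * m

def bic(loglik, m, N):
    """BIC = -2 log L + m ln(N), with N taken here as the number of subjects."""
    return -2.0 * loglik + m * np.log(N)

def select_G(fits, N):
    """fits: dict mapping G -> (loglik, m). Return the G with the smallest BIC."""
    return min(fits, key=lambda G: bic(*fits[G], N))
```

Both criteria trade fit against complexity; BIC penalizes extra components more heavily as $N$ grows, so it tends to select more parsimonious mixtures than AIC.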
