Truncated Gauss-Newton Method with Trust-region:
an efficient solver for the non-linear tomographic inverse problem
Frédéric Delbos
Jean-Charles Gilbert
Delphine Sinoquet
Objective
Where is my model solution?
Develop a fast, accurate, robust and automatic solver for the tomographic inverse problem
Outline
Preconditioned Conjugate Gradient Revisited
Trust-Region method - an alternative to line search
Singular model analysis
Terminology
Line search ambiguity:
the optimization method
the Gauss-Newton step failure
Gauss-Newton: GN ; index k
Trust-Region: T-R
Line Search: LS
Preconditioned Conjugate Gradient: PCG ; index i
Optimization method - GN
Non-linear cost function $f$
Linearization of the traveltimes $T$ around $m_k$
Quadratic model $F_k(m)$ of $f(m)$
Optimization method - GN
Simplified expression of the quadratic model around $m_k$:
$F_k(m) = f(m_k) + g_k^T (m - m_k) + \tfrac{1}{2} (m - m_k)^T H_k (m - m_k)$
with $f(m_k)$ the value of $f$ at $m_k$, $g_k$ the gradient of $f$ at $m_k$,
and $H_k$ the approximated Hessian (second-order derivatives of the traveltimes are neglected)
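A minimal numerical sketch of this quadratic model, assuming a least-squares misfit $f(m) = \tfrac{1}{2}\|r(m)\|^2$ and hypothetical helpers `residual` and `jacobian` for the traveltime misfit and its Jacobian (regularization terms are omitted for brevity):

```python
import numpy as np

def gauss_newton_model(m_k, residual, jacobian):
    """Build the GN quadratic model F_k around m_k.

    residual(m) and jacobian(m) are placeholders for the traveltime misfit
    vector r(m) and its Jacobian J(m); here f(m) = 0.5 * ||r(m)||^2.
    """
    r = residual(m_k)        # misfit vector at m_k
    J = jacobian(m_k)        # sensitivity matrix at m_k
    f_k = 0.5 * (r @ r)      # value of f at m_k
    g_k = J.T @ r            # gradient of f at m_k
    H_k = J.T @ J            # approximated Hessian (2nd-order terms of T neglected)

    def F(m):
        dm = m - m_k
        return f_k + g_k @ dm + 0.5 * dm @ (H_k @ dm)

    return F, g_k, H_k
```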
Optimization method - GN
Preconditioning CG - Motivations
Solve the ill-conditioned linear problem: $H_k \, \delta m_k = -g_k$
Speed up the convergence of the CG algorithm
Preconditioning CG - Method
Transform the linear system into:
$P^{-1} H_k \, \delta m_k = -P^{-1} g_k$
with $P$ symmetric positive definite
and $P$ as close as possible to $H_k$
Preconditioning CG - Method
Find a matrix $Q_k$ such that $P_k = Q_k Q_k^T$
Minimize the quadratic function in the new variables:
$\tilde m = Q_k^T m$, $\tilde g_k = Q_k^{-1} g_k$, and $\tilde H_k = Q_k^{-1} H_k Q_k^{-T}$
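A compact sketch of the preconditioned CG iteration this transformation leads to; only applications of $P^{-1}$ (i.e. of the $Q_k$ factors) are needed, and `apply_Pinv` is a hypothetical callable standing in for that operation:

```python
import numpy as np

def pcg(H, g, apply_Pinv, tol=1e-6, max_iter=100):
    """Preconditioned CG for H dm = -g.

    apply_Pinv(v) returns P^{-1} v; with P = Q Q^T this is equivalent to
    plain CG on the transformed system Q^{-1} H Q^{-T} (Q^T dm) = -Q^{-1} g.
    """
    dm = np.zeros_like(g)
    r = -g - H @ dm           # residual of the linear system
    z = apply_Pinv(r)         # preconditioned residual
    p = z.copy()
    rz = r @ z
    r0 = np.linalg.norm(r)
    for i in range(max_iter):
        Hp = H @ p
        alpha = rz / (p @ Hp)
        dm += alpha * p
        r -= alpha * Hp
        if np.linalg.norm(r) <= tol * r0:   # relative residual criterion
            break
        z = apply_Pinv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return dm, i + 1
```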
Preconditioning CG - Choice of P
Hessian decomposition into velocity and interface blocks:
$H_{v_i, v_i}$ the block of $H$ corresponding to velocity $v_i$
$H_{z_i, z_i}$ the block of $H$ corresponding to interface $z_i$
Preconditioning CG - Choice of P
Hessian decomposition into a sum of block matrices: $H = D + L + L^T$
with $D$ the block-diagonal part and $L$ the strictly lower block-triangular part
Preconditioning CG - Choice of P
$D$ is symmetric positive definite
We use a Cholesky factorization of $D$: $D = L_D L_D^T$
The chosen preconditioners (L. Chauvier et al., 2000):
Jacobi: $P_k = D$
Symmetric Gauss-Seidel: $P_k = (D + L)\, D^{-1} (D + L)^T$
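A sketch of how the two preconditioner applications could be assembled from the block splitting, assuming a dense $H$ and a hypothetical `block_slices` description of the velocity/interface blocks; in practice the Cholesky factors of $D$ would be computed once and reused instead of the generic solves used here:

```python
import numpy as np

def block_preconditioners(H, block_slices):
    """Build apply-P^{-1} functions for the Jacobi and symmetric Gauss-Seidel
    preconditioners from the splitting H = D + L + L^T, where D collects the
    diagonal (velocity / interface) blocks and L the strictly lower block part.
    block_slices is a hypothetical list of index slices defining the blocks."""
    D = np.zeros_like(H)
    for s in block_slices:
        D[s, s] = H[s, s]              # diagonal blocks of H
    L = np.tril(H) - np.tril(D)        # strictly lower block part

    def jacobi_Pinv(r):                # P = D
        return np.linalg.solve(D, r)

    def sgs_Pinv(r):                   # P = (D + L) D^{-1} (D + L)^T
        y = np.linalg.solve(D + L, r)
        return np.linalg.solve((D + L).T, D @ y)

    return jacobi_Pinv, sgs_Pinv
```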
Preconditioning CG - example
Number of CG iterations over GN iterations
Preconditioning CG - example
Relative residuals over CG iterations
Preconditioning CG - example
Non-linear cost function over GN iterations
Preconditioning CG
Solves ill-conditioned problems more accurately
Speeds up the convergence of the CG algorithm
Outline
Preconditioned Conjugate Gradient Revisited
Preconditioners
CG termination criteria
Trust-Region method - an alternative to line search
Singular model analysis
CG termination criteria - Choice
Objective: minimize the number of CG iterates and obtain a sufficient reduction of the non-linear cost function
CG termination criteria:
Maximum number of CG iterations
Relative residual criterion
Final quadratic cost reduction criterion (QCR, experimental)
CG termination criteria - QCR
Stop the CG loop when the relative decrease of the quadratic cost becomes small:
$\eta_{k,i} = \dfrac{F_{k,i-1} - F_{k,i}}{F_{k,i}}$
where $F_{k,i}$ is the quadratic cost after $i$ CG iterations at GN iteration $k$
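A sketch of the combined termination logic (iteration budget, relative residual, QCR); the QCR expression used here follows the reconstruction above and should be read as an assumed form, and the thresholds are illustrative only:

```python
def cg_should_stop(i, res_norm, res0_norm, F_prev, F_curr,
                   max_iter=100, eps_res=1e-3, eps_qcr=1e-2):
    """Combined CG termination test: iteration budget, relative residual,
    and a quadratic-cost-reduction (QCR) test.  The QCR form used here,
    (F_prev - F_curr) / |F_curr| <= eps_qcr, is an assumed reading of the
    criterion, not the exact expression from the presentation."""
    if i >= max_iter:                              # maximum number of CG iterations
        return True
    if res_norm <= eps_res * res0_norm:            # relative residual criterion
        return True
    if F_prev - F_curr <= eps_qcr * abs(F_curr):   # assumed QCR test
        return True
    return False
```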
CG termination criteria - example
Number of PCG iterations for different thresholds ε of the relative residual criterion
CG termination criteria - example
Non-linear cost for different thresholds ε of the relative residual criterion
CG termination criteria - Choice
Objective: minimize the number of CG iterates and obtain a sufficient reduction of the non-linear cost function
CG termination criteria:
Maximum number of CG iterations
Relative residual criterion
Final quadratic cost reduction criterion (QCR, experimental)
Outline
Preconditioned Conjugate Gradient Revisited
Trust-Region method - an alternative to line search
Line search
Trust-region
Levenberg-Marquardt
Singular model analysis
LS method - Theory
Line-search subproblem to be solved at each GN iteration with a CG algorithm: minimize $F_k(m)$
Acceptance of the step: Armijo criterion
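A minimal backtracking sketch of the Armijo acceptance test along the GN direction, assuming numpy arrays and a generic cost callable `f`; the constants `c1` and `beta` are textbook defaults, not the presentation's settings:

```python
def armijo_backtracking(f, m_k, g_k, dm, c1=1e-4, beta=0.5, max_bt=20):
    """Backtracking line search with the Armijo sufficient-decrease test
    along the GN direction dm; f is the non-linear cost function."""
    f_k = f(m_k)
    slope = g_k @ dm                   # directional derivative (should be < 0)
    t = 1.0
    for _ in range(max_bt):
        if f(m_k + t * dm) <= f_k + c1 * t * slope:   # Armijo criterion
            return t
        t *= beta                      # shrink the step and retry
    return t
```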
LS method - movie
LS method - example
Non-linear cost function over GN iterations for a relative residual criterion at 1E-15
T-R method - Theory
Minimize $F_k(m)$ in a region around the current iterate: $\|m - m_k\| \le \Delta_k$
$\Delta_k$: trust-region radius
Advantage: control of the norm of the perturbation
Trust-region subproblem to be solved at each GN iteration
T-R method - Theory
Solve the trust-region subproblem with Steihaug's method:
* Solve the first-order optimality condition with a preconditioned conjugate gradient method ($H_k$ s.p.d.)
* Ensure that the computed step stays inside the trust-region radius at each CG iteration
** Steihaug's theorem: the P-norm of the perturbation increases with the CG iterates
** Global convergence
Generalized Cauchy point: at least as good as the steepest-descent step
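A sketch of Steihaug's truncated CG for the trust-region subproblem, written without preconditioning and with the Euclidean norm for brevity (the presentation monitors the step in the P-norm); it stops at the usual CG tolerance, on a negative-curvature direction, or when the step reaches the trust-region boundary:

```python
import numpy as np

def steihaug_cg(H, g, delta, tol=1e-6, max_iter=100):
    """Steihaug's truncated CG for  min_s  g.s + 0.5 s.H.s  s.t.  ||s|| <= delta.
    Unpreconditioned sketch (Euclidean norm)."""
    s = np.zeros_like(g)
    r = -g                              # residual of H s = -g
    p = r.copy()
    rr = r @ r
    for _ in range(max_iter):
        Hp = H @ p
        pHp = p @ Hp
        if pHp <= 0:                    # negative curvature: go to the boundary
            return s + _to_boundary(s, p, delta) * p
        alpha = rr / pHp
        if np.linalg.norm(s + alpha * p) >= delta:   # step leaves the trust region
            return s + _to_boundary(s, p, delta) * p
        s += alpha * p
        r -= alpha * Hp
        rr_new = r @ r
        if np.sqrt(rr_new) <= tol * np.linalg.norm(g):
            return s
        p = r + (rr_new / rr) * p
        rr = rr_new
    return s

def _to_boundary(s, p, delta):
    """Positive tau such that ||s + tau p|| = delta."""
    a, b, c = p @ p, 2 * (s @ p), s @ s - delta**2
    return (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)
```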
T-R method - Theory
Concordance ratio $\rho_k$ between the quadratic model and the non-linear cost function:
$\rho_k = \dfrac{f(m_k) - f(m_k + \delta m_k)}{F_k(m_k) - F_k(m_k + \delta m_k)}$
Numerator: actual reduction (non-linear cost)
Denominator: predicted reduction (quadratic cost)
Acceptance of the step according to the value of $\rho_k$:
If $\rho_k$ is negative or close to 0: rejected step, decrease $\Delta_k$ and restart the step
If $\rho_k$ is close to 1: accepted step, $m_{k+1} = m_k + \delta m_k$, and possibly increase $\Delta_k$
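A small sketch of the step acceptance and radius update driven by the concordance ratio; the thresholds and scaling factors are typical textbook values, not the ones used in the presentation:

```python
def trust_region_update(rho, delta, dm_norm,
                        eta_bad=0.1, eta_good=0.75, shrink=0.5, grow=2.0):
    """Accept/reject the step and update the radius from the concordance
    ratio rho (actual / predicted reduction)."""
    if rho < eta_bad:                  # poor agreement: reject and shrink
        return False, shrink * dm_norm
    if rho > eta_good:                 # good agreement: accept and enlarge
        return True, max(delta, grow * dm_norm)
    return True, delta                 # acceptable agreement: keep the radius
```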
T-R method - Movie
T-R method - example
Comparison T-R/LS of the non-linear cost function over GN iterations
Levenberg-Marquardt - Theory
The Levenberg-Marquardt step $\delta m^C$ is defined by the resolution of the following system:
$(H_k + \lambda^C I)\, \delta m^C = -g_k$
Equivalent to solving the trust-region minimization subproblem (Dennis & Schnabel, 1983):
$\min_{\|\delta m\| \le \Delta_k} F_k(m_k + \delta m)$
with $\delta m^C = -(H_k + \lambda^C I)^{-1} g_k$; $\lambda^C$ is interpreted as a Lagrange multiplier
T-R allows an elegant and efficient choice of $\Delta_k$
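The corresponding step computation is a single shifted linear solve; a minimal sketch, with `lam` standing for the multiplier $\lambda^C$:

```python
import numpy as np

def levenberg_marquardt_step(H, g, lam):
    """Levenberg-Marquardt step: solve (H + lam I) dm = -g.
    For lam > 0 this is the solution of the trust-region subproblem whose
    radius equals ||dm||; lam plays the role of the Lagrange multiplier."""
    n = H.shape[0]
    return np.linalg.solve(H + lam * np.eye(n), -g)
```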
Outline
Preconditioned Conjugate Gradient Revisited
Trust-Region method - an alternative to line search
Line search
Trust-region
Levenberg-Marquardt
Singular model analysis
Singular model analysis - discussion
$H_k$ may be only positive semi-definite (singular)
The reasons:
small regularization weights
ill-posed problem: lack of a priori information
The main consequences:
slow or even no convergence of the GN method
bad preconditioner: $P$ also positive semi-definite
explosion of the final PCG perturbation in L2 norm
bad direction of the PCG perturbation: the angle between $g_k$ and $\delta m$ approaches 90° ($\langle g_k, \delta m \rangle \approx 0$)
Singular model analysis - detection
Detection criteria of singular models:
L2 norm of the PCG perturbation: $\|\delta m_k\|_2$
PCG perturbation angle: $(g_k, \delta m_k)$
Preconditioned norm of the generalized Cauchy point: $\|P^{1/2} s^C\|_2$
with $s^C = -\alpha^C P^{-1} g$ and $\alpha^C = \dfrac{g^T P^{-1} g}{g^T P^{-1} H P^{-1} g}$
L-curve to find a good regularization weight (S. Gomez, 2001):
plot the parametrized curve $\big(\|T(m) - T^*\|^2,\; R(m)\big)$ as the regularization weight varies
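A sketch of the Cauchy-point detection quantity, following the reconstructed formulas above; `apply_Pinv` is a hypothetical callable applying $P^{-1}$:

```python
import numpy as np

def cauchy_point_norm(H, g, apply_Pinv):
    """Preconditioned norm of the generalized Cauchy point, one of the
    singular-model detection quantities:
        s_C = -alpha_C P^{-1} g,
        alpha_C = (g' P^{-1} g) / (g' P^{-1} H P^{-1} g),
        returned value = || P^{1/2} s_C ||_2 = alpha_C * sqrt(g' P^{-1} g).
    A nearly singular H makes the denominator small and this norm can blow up."""
    Pinv_g = apply_Pinv(g)
    num = g @ Pinv_g
    den = Pinv_g @ (H @ Pinv_g)
    alpha_c = num / den
    return alpha_c * np.sqrt(num)
```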
L-curve - example
Singular model analysis - remedy
Solve a new problem with more a priori information: constraints, stronger regularization, and/or a new parameterization
Find a low-frequency solution (largest eigenvalues) of the singular linear problem
Singular model analysis - low frequency
Choice of the Trust-Region method: adds a priori information through the constraint on the perturbation norm
Work on the PCG regularization properties: the best determined components are computed in the first CG iterates
Choosing weaker CG preconditioners (tune the CG convergence speed):
Modified Cholesky factorization: $\tilde P = P + E = \tilde L \tilde L^T$
Convex combination: $\tilde P = (1 - \omega)\, I + \omega\, P$ (sketch below)
No preconditioning: $P = I$
Choosing stronger CG stopping criteria (stop earlier in the CG loop):
Criterion on the final reduction of the quadratic cost (QCR)
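A sketch of the convex-combination preconditioner mentioned above, assuming a dense $P$; `omega = 1` recovers the original preconditioner and `omega = 0` removes preconditioning:

```python
import numpy as np

def convex_preconditioner(P, omega):
    """Weaker preconditioner P_omega = (1 - omega) I + omega P, omega in [0, 1].
    Returns a function applying P_omega^{-1} (dense sketch; in practice a
    factorization of P_omega would be computed once and reused)."""
    n = P.shape[0]
    P_omega = (1.0 - omega) * np.eye(n) + omega * P
    return lambda r: np.linalg.solve(P_omega, r)
```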
Singular model analysis - Weaker preconditioner
Non-linear cost function: comparison of inversions with the new convex preconditioner and without preconditioning
Singular model analysis - Weaker preconditioner
Non-linear cost function: comparison of inversions with the new convex preconditioner and without preconditioning
Singular model analysis - QCR
Number of CG iterations: comparison between the final quadratic cost reduction criterion and the relative residual criterion
Singular model analysis - QCR
Non-linear cost function: comparison between the final quadratic cost reduction criterion and the relative residual criterion
Conclusions
Trust-region method is:
More robust (handles singular models)
Automatic (choice of the trust-region parameters)
PCG method:
Relevant PCG termination criterion
More flexibility in the choice of the preconditioner
Detects and solves singular model problems
Perspectives
Finalization of the work on unconstrained problems for singular models
Optimization with linear constraints:
Augmented Lagrangian: integration of the new solver
Feasibility study of an interior-point method (comparison with the augmented Lagrangian)