Approximation of Hessian Matrix for Second-order SPSA Algorithm ...
6. "Basic" SPSA uses only objective function measurements to carry out the iteration process, acting as a stochastic analogue of the steepest descent method of deterministic optimization.
1.5 Application Areas
Over the past several years, non-linear models have been increasingly used for simulation, state estimation and control purposes. In particular, rapid progress in computational techniques and the success of non-linear model predictive control have been strong incentives for the development of such models as neural networks or first-principle models. Process modeling requires the estimation of several unknown parameters from noisy measurement data. A least-squares or maximum-likelihood cost function is usually minimized using a gradient-based optimization method [7]. Several techniques for computing the gradient of the cost function are available, including finite difference approximations and analytic differentiation. In these techniques, the computational expense required to estimate the current gradient direction is directly proportional to the number of unknown model parameters, which becomes an issue for models involving a large number of parameters. This is typically the case in neural network modeling, but can also occur in other circumstances, such as the estimation of parameters and initial conditions in first-principle models. Moreover, the derivation of sensitivity equations requires analytic manipulation of the model equations, which is time consuming and prone to error [7].
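To make the scaling concrete, the following is a minimal sketch (not from the dissertation; the quadratic loss is a hypothetical stand-in) of a one-at-a-time forward-difference gradient. Note that each gradient evaluation costs one loss measurement per parameter on top of the base measurement, so the expense grows linearly with the parameter dimension:

```python
import numpy as np

def fd_gradient(loss, theta, eps=1e-6):
    """Forward-difference gradient estimate.

    Requires len(theta) + 1 loss evaluations: the cost grows
    directly with the number of unknown parameters.
    """
    g = np.empty_like(theta)
    base = loss(theta)
    for i in range(len(theta)):
        t = theta.copy()
        t[i] += eps          # perturb one parameter at a time
        g[i] = (loss(t) - base) / eps
    return g

# Hypothetical quadratic loss, purely for illustration
loss = lambda th: float(np.sum(th ** 2))
theta = np.array([1.0, -2.0, 0.5])
print(fd_gradient(loss, theta))  # approximately 2 * theta
```

For a model with thousands of parameters, this per-parameter cost is exactly the bottleneck the paragraph above describes.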
In contrast to standard finite differences, which approximate the gradient by varying the parameters one at a time, the simultaneous perturbation approximation of the gradient proposed by Spall and Chin [12] makes use of a very efficient technique based on a simultaneous (random) perturbation of all the parameters; on each iteration, SPSA needs only a few loss measurements to estimate the gradient, regardless of the dimensionality of the problem (the number of parameters) [12]. Hence, one gradient evaluation requires only two evaluations of the cost function. This approach was first applied to gradient estimation in a first-order stochastic approximation algorithm, and more recently to Hessian estimation in an accelerated second-order SPSA algorithm. Owing to these features, the SPSA algorithm proposed in this dissertation will likewise be applied to non-linear systems regardless of the dimensionality of the problem.
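The two-measurement gradient estimate described above can be sketched as follows. This is a generic illustration of Spall's standard simultaneous perturbation form with ±1 Bernoulli perturbations, not the specific algorithm developed in this dissertation; the quadratic loss is again a hypothetical example:

```python
import numpy as np

def spsa_gradient(loss, theta, c, rng):
    """Simultaneous perturbation gradient estimate.

    Uses exactly two loss measurements per estimate, independent
    of the dimension of theta. Each component of delta is a
    +/-1 Bernoulli draw (Spall's standard choice).
    """
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    y_plus = loss(theta + c * delta)   # first measurement
    y_minus = loss(theta - c * delta)  # second measurement
    # Componentwise: (y+ - y-) / (2 * c * delta_i)
    return (y_plus - y_minus) / (2.0 * c * delta)

# Hypothetical quadratic loss for illustration; true gradient is 2 * theta
loss = lambda th: float(np.sum(th ** 2))
theta = np.array([1.0, -2.0, 0.5])
rng = np.random.default_rng(0)
g_hat = spsa_gradient(loss, theta, c=0.1, rng=rng)
```

A single estimate is noisy, but it is unbiased to first order, and averaging over iterations (as the stochastic approximation recursion effectively does) recovers the true gradient direction at a fixed cost of two measurements per iteration.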