

6. "Basic" SPSA uses only objective function measurements to carry out the iteration process, in a stochastic analogue of the steepest descent method of deterministic optimization; the recursion below makes this analogy explicit.
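To make the steepest-descent analogy concrete, the underlying stochastic approximation recursion (written in the standard notation of the SPSA literature, not taken verbatim from this dissertation) is

    \hat{\theta}_{k+1} = \hat{\theta}_k - a_k \, \hat{g}_k(\hat{\theta}_k),

where \hat{g}_k is a gradient estimate formed solely from noisy loss measurements and a_k is a decaying gain sequence; deterministic steepest descent is recovered when \hat{g}_k is replaced by the exact gradient.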

1.5 Applications Areas

Over the past several years, non-linear models have been increasingly used for simulation, state estimation and control purposes. In particular, the rapid progress in computational techniques and the success of non-linear model predictive control have been strong incentives for the development of such models as neural networks or first-principles models. Process modeling requires the estimation of several unknown parameters from noisy measurement data. A least-squares or maximum-likelihood cost function is usually minimized using a gradient-based optimization method [7]. Several techniques for computing the gradient of the cost function are available, including finite difference approximations and analytic differentiation. In these techniques, the computational expense required to estimate the current gradient direction is directly proportional to the number of unknown model parameters, which becomes an issue for models involving a large number of parameters; the sketch below illustrates this linear growth in cost. This is typically the case in neural network modeling, but it can also occur in other circumstances, such as the estimation of parameters and initial conditions in first-principles models. Moreover, the derivation of sensitivity equations requires analytic manipulation of the model equations, which is time consuming and error-prone [7].
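As a minimal illustration of this scaling (a sketch, not code from the dissertation), a central finite-difference gradient costs two evaluations of the cost function per parameter, i.e. 2p evaluations for p parameters; the function loss and the step size h are placeholders:

import numpy as np

def fd_gradient(loss, theta, h=1e-6):
    # Central finite differences: 2 * theta.size loss evaluations,
    # so the cost grows linearly with the number of parameters.
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = h
        grad[i] = (loss(theta + e) - loss(theta - e)) / (2.0 * h)
    return grad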

In contrast to standard finite differences, which approximate the gradient by varying the parameters one at a time, the simultaneous perturbation approximation of the gradient proposed by Spall and Chin [12] makes use of a very efficient technique based on a simultaneous (random) perturbation of all the parameters; at each iteration, SPSA needs only a few loss measurements to estimate the gradient, regardless of the dimensionality of the problem (the number of parameters) [12]. Hence, one gradient evaluation requires only two evaluations of the cost function, as the sketch below shows.
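A minimal sketch of this two-measurement estimate, assuming the common Bernoulli +/-1 (Rademacher) choice of perturbation distribution (the names and gains are illustrative, not taken from the dissertation):

def spsa_gradient(loss, theta, c, rng=None):
    # Simultaneous perturbation gradient estimate: exactly two loss
    # evaluations, regardless of the dimension of theta.
    rng = rng or np.random.default_rng()
    delta = rng.choice([-1.0, 1.0], size=theta.shape)  # perturb all parameters at once
    y_plus = loss(theta + c * delta)
    y_minus = loss(theta - c * delta)
    return (y_plus - y_minus) / (2.0 * c * delta)  # componentwise (y+ - y-) / (2 c delta_i)

Note that the i-th component is obtained by dividing the same two measurements by 2 c delta_i, which is why the cost does not grow with the number of parameters.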

This approach was first applied to gradient estimation in a first-order stochastic approximation algorithm, and more recently to Hessian estimation in an accelerated second-order SPSA algorithm. Exploiting these features, the SPSA algorithm proposed in this dissertation will therefore also be applied to non-linear systems regardless of the dimensionality of the problem.
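As a sketch of the Hessian estimation step (following the per-iteration estimate of Spall's adaptive second-order SPSA, with the across-iteration averaging and positive-definiteness corrections of the full algorithm omitted), a finite difference of two simultaneous perturbation gradient estimates is formed and symmetrized, for four loss evaluations in total; spsa_gradient is the sketch above:

def spsa_hessian(loss, theta, c, c_tilde, rng=None):
    # One-sample second-order SPSA Hessian estimate: a finite difference
    # of two SP gradient estimates along an independent perturbation.
    rng = rng or np.random.default_rng()
    delta_t = rng.choice([-1.0, 1.0], size=theta.shape)
    g_plus = spsa_gradient(loss, theta + c_tilde * delta_t, c, rng)
    g_minus = spsa_gradient(loss, theta - c_tilde * delta_t, c, rng)
    m = np.outer((g_plus - g_minus) / (2.0 * c_tilde), 1.0 / delta_t)
    return 0.5 * (m + m.T)  # symmetrize the raw estimate

In the full algorithm these per-iteration estimates are averaged over iterations and mapped to a positive-definite matrix before being used in a Newton-like update.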
