Approximation of Hessian Matrix for Second-order SPSA Algorithm ...
6. "Basic" SPSA uses only objective function measurements to carry out the iteration process, acting as a stochastic analogue of the steepest descent method of deterministic optimization.
1.5 Application Areas
Over the past several years, non-linear models have been increasingly used for simulation, state estimation and control purposes. In particular, rapid progress in computational techniques and the success of non-linear model predictive control have been strong incentives for the development of such models as neural networks or first-principle models. Process modeling requires the estimation of several unknown parameters from noisy measurement data. A least-squares or maximum-likelihood cost function is usually minimized using a gradient-based optimization method [7]. Several techniques for computing the gradient of the cost function are available, including finite difference approximations and analytic differentiation. In these techniques, the computational expense required to estimate the current gradient direction is directly proportional to the number of unknown model parameters, which becomes an issue for models involving a large number of parameters. This is typically the case in neural network modeling, but can also occur in other circumstances, such as the estimation of parameters and initial conditions in first-principle models. Moreover, the derivation of sensitivity equations requires analytic manipulation of the model equations, which is time consuming and prone to error [7].
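To make the scaling concrete, the following is a minimal sketch (not from the dissertation; the quadratic loss is a hypothetical stand-in) of a one-at-a-time forward-difference gradient. Note that each gradient evaluation costs one loss measurement per parameter on top of the base measurement, so the expense grows linearly with the parameter dimension:

```python
import numpy as np

def fd_gradient(loss, theta, eps=1e-6):
    """Forward-difference gradient estimate.

    Requires len(theta) + 1 loss evaluations: the cost grows
    directly with the number of unknown parameters.
    """
    g = np.empty_like(theta)
    base = loss(theta)
    for i in range(len(theta)):
        t = theta.copy()
        t[i] += eps          # perturb one parameter at a time
        g[i] = (loss(t) - base) / eps
    return g

# Hypothetical quadratic loss, purely for illustration
loss = lambda th: float(np.sum(th ** 2))
theta = np.array([1.0, -2.0, 0.5])
print(fd_gradient(loss, theta))  # approximately 2 * theta
```

For a model with thousands of parameters, this per-parameter cost is exactly the bottleneck the paragraph above describes.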
In contrast to standard finite differences, which approximate the gradient by varying the parameters one at a time, the simultaneous perturbation approximation of the gradient proposed by Spall and Chin [12] makes use of a very efficient technique based on a simultaneous (random) perturbation of all the parameters; on each iteration, SPSA needs only a few loss measurements to estimate the gradient, regardless of the dimensionality of the problem (the number of parameters) [12]. Hence, one gradient evaluation requires only two evaluations of the cost function. This approach was first applied to gradient estimation in a first-order stochastic approximation algorithm, and more recently to Hessian estimation in an accelerated second-order SPSA algorithm. Owing to these features, the SPSA algorithm proposed in this dissertation will likewise be applied to non-linear systems regardless of the dimensionality of the problem.
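The two-measurement gradient estimate described above can be sketched as follows. This is a generic illustration of Spall's standard simultaneous perturbation form with ±1 Bernoulli perturbations, not the specific algorithm developed in this dissertation; the quadratic loss is again a hypothetical example:

```python
import numpy as np

def spsa_gradient(loss, theta, c, rng):
    """Simultaneous perturbation gradient estimate.

    Uses exactly two loss measurements per estimate, independent
    of the dimension of theta. Each component of delta is a
    +/-1 Bernoulli draw (Spall's standard choice).
    """
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    y_plus = loss(theta + c * delta)   # first measurement
    y_minus = loss(theta - c * delta)  # second measurement
    # Componentwise: (y+ - y-) / (2 * c * delta_i)
    return (y_plus - y_minus) / (2.0 * c * delta)

# Hypothetical quadratic loss for illustration; true gradient is 2 * theta
loss = lambda th: float(np.sum(th ** 2))
theta = np.array([1.0, -2.0, 0.5])
rng = np.random.default_rng(0)
g_hat = spsa_gradient(loss, theta, c=0.1, rng=rng)
```

A single estimate is noisy, but it is unbiased to first order, and averaging over iterations (as the stochastic approximation recursion effectively does) recovers the true gradient direction at a fixed cost of two measurements per iteration.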