Approximation of Hessian Matrix for Second-order SPSA Algorithm
1.8 VERSIONS OF SPSA ALGORITHM
the Hessian matrix of L(θ̂_k). Hence, equation (1.10) is a stochastic analogue of the well-known Newton–Raphson algorithm of deterministic optimization. Since ĝ_k(θ̂_k) has a known form, the parallel recursions in equations (1.9) and (1.10) can be implemented once Ĝ_k is specified. The SP gradient approximation requires two measurements of L(·): y_k^(+) and y_k^(−). These represent measurements at design levels θ̂_k + c_k Δ_k and θ̂_k − c_k Δ_k respectively, where c_k is a positive scalar and Δ_k represents a user-generated random vector satisfying certain regularity conditions; e.g., Δ_k being a vector of independent Bernoulli ±1 random variables satisfies these conditions, but a vector of uniformly distributed random variables does not. The "SP" comes from the fact that all elements of θ̂_k are perturbed simultaneously in forming ĝ_k(θ̂_k), as opposed to the finite-difference form, where they are perturbed one at a time. To perform one iteration of (1.9) and (1.10), one additional measurement, say y_k^(0), is required; this measurement represents an observation of L(·) at the nominal design level θ̂_k.
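The SP gradient approximation described above can be sketched in Python. The function name and the use of NumPy are assumptions of this sketch, not part of the text; the perturbation vector is Bernoulli ±1, as the regularity conditions require, and all components of the gradient estimate share the same two loss measurements.

```python
import numpy as np

def spsa_gradient(loss, theta, c_k, rng):
    """SP gradient estimate g_hat_k(theta_hat_k) from two loss measurements.

    loss  -- callable returning a (possibly noisy) measurement y of L(.)
    theta -- current design level theta_hat_k (1-D array)
    c_k   -- positive perturbation scalar
    rng   -- numpy random Generator
    """
    p = theta.size
    # Bernoulli +/-1 perturbation vector: satisfies the regularity conditions
    # (a uniformly distributed vector would not).
    delta = rng.choice([-1.0, 1.0], size=p)
    y_plus = loss(theta + c_k * delta)    # y_k^(+)
    y_minus = loss(theta - c_k * delta)   # y_k^(-)
    # All p elements of theta are perturbed simultaneously, so the same
    # two measurements feed every component of the estimate.
    return (y_plus - y_minus) / (2.0 * c_k * delta)
```

Note that the per-iteration cost is two loss measurements regardless of the dimension p, which is precisely what distinguishes SP from the finite-difference form.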
Main Advantages:
- 1st-SPSA identifies region(s) where the function value is low, which allows one to conjecture in which region(s) the global solution lies.
- 2nd-SPSA is based on a highly efficient approximation of the gradient using loss-function measurements. In particular, on each iteration SPSA needs only three loss measurements to estimate the gradient, regardless of the dimensionality of the problem. Moreover, 2nd-SPSA is grounded in a solid mathematical framework that permits assessing its stochastic properties even for optimization problems affected by noise or uncertainties. Owing to these striking advantages, 2nd-SPSA has recently been used as the optimization engine for adaptive control problems.
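The section references, but does not reproduce, the per-iteration Hessian approximation behind equation (1.10). As an illustration only, the following sketch follows the standard 2nd-order SPSA construction (due to Spall), in which a second, independent perturbation yields one-sided gradient estimates whose difference is symmetrized into a Hessian estimate; the function name, step sizes, and the secondary perturbation are all assumptions of this sketch.

```python
import numpy as np

def spsa_hessian_estimate(loss, theta, c_k, c_t, rng):
    """One per-iteration Hessian estimate in the spirit of 2nd-order SPSA.

    loss  -- callable returning a (possibly noisy) measurement y of L(.)
    theta -- current design level theta_hat_k (1-D array)
    c_k   -- primary perturbation scalar (as in the SP gradient)
    c_t   -- secondary perturbation scalar (an assumption of this sketch)
    """
    p = theta.size
    delta = rng.choice([-1.0, 1.0], size=p)    # primary perturbation
    delta_t = rng.choice([-1.0, 1.0], size=p)  # independent secondary perturbation
    y_plus = loss(theta + c_k * delta)         # y_k^(+)
    y_minus = loss(theta - c_k * delta)        # y_k^(-)
    # One-sided SP gradient estimates at the two perturbed design levels.
    g_plus = (loss(theta + c_k * delta + c_t * delta_t) - y_plus) / (c_t * delta_t)
    g_minus = (loss(theta - c_k * delta + c_t * delta_t) - y_minus) / (c_t * delta_t)
    dg = (g_plus - g_minus) / (2.0 * c_k)
    # Symmetrize: H_hat = 1/2 [ dg * delta^{-T} + (dg * delta^{-T})^T ].
    H = np.outer(dg, 1.0 / delta)
    return 0.5 * (H + H.T)
```

In the full algorithm these per-iteration estimates are averaged across iterations to form Ĝ_k, which is then regularized (e.g. shifted to be positive definite) before it is used in the Newton-like recursion of equation (1.10).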