Approximation of Hessian Matrix for Second-order SPSA Algorithm ...

2.2 THE SPSA ALGORITHM RECURSIONS

1st-SPSA [17]:

$$\hat{\theta}_{k+1} = \hat{\theta}_k - a_k \, \hat{g}_k(\hat{\theta}_k), \quad k = 0, 1, 2, \ldots \tag{2.1}$$

2nd-SPSA [18]:

$$\hat{\theta}_{k+1} = \hat{\theta}_k - a_k \, \bar{\bar{H}}_k^{-1} \hat{g}_k(\hat{\theta}_k), \quad \bar{\bar{H}}_k = f_k(\bar{H}_k) \tag{2.2a}$$

$$\bar{H}_k = \frac{k}{k+1} \bar{H}_{k-1} + \frac{1}{k+1} \hat{H}_k, \quad k = 0, 1, 2, \ldots \tag{2.2b}$$

where the $a_k$ in (2.1) and (2.2a) are scalar gain sequences that satisfy certain SA conditions [18], $\hat{g}_k$ is the SP estimate of the loss function gradient, which depends on the gain sequence $c_k$ (representing the difference interval of the perturbations), $\hat{H}_k$ is the SP estimate of the Hessian matrix, and $f_k$ maps the usually non-positive-definite $\bar{H}_k$ to a positive-definite $p \times p$ matrix. The two recursions are shown in Fig. 2.1. Let $\Delta_k$ be a user-generated mean-zero random vector of dimension $p$ whose components are independent random variables.

Fig. 2.1. The two recursions in the 2nd-SPSA algorithm (solid line: eq. 2.2a; dashed line: eq. 2.2b).

The i-th element of the SP estimate of the loss function gradient is given by [18]:

$$\left( \hat{g}_k(\hat{\theta}_k) \right)_i = (2 c_k \Delta_{ki})^{-1} \left[ y(\hat{\theta}_k + c_k \Delta_k) - y(\hat{\theta}_k - c_k \Delta_k) \right], \quad i = 1, 2, \ldots, p \tag{2.3}$$
where $\Delta_{ki}$ is the i-th component of the vector $\Delta_k$ and $y(\theta)$ is a measurement of the loss function:

$$y(\theta) = L(\theta) + (\text{noise}) \tag{2.4}$$

where $\theta$ is the parameter vector, whose true value is $\theta^*$. Note that the 2nd-SPSA form is a special case of the general adaptive SP method. The general method can also be used in root-finding problems, where $\bar{H}_k$ represents an estimate of the associated Jacobian matrix. The true Hessian matrix of the loss function, $H(\theta)$, has its ij-th element defined as $H_{ij} = \partial^2 L / \partial\theta_i \, \partial\theta_j$, and its value at the solution, $H(\theta^*)$, is denoted by $H^*$. Finally, the estimate of $H$ and its ij-th element are defined in Sec. 2.6 using the Fisher information matrix (FIM). The FIM is used here instead of the Hessian matrix so that this matrix can be estimated efficiently [22]. The FIM is obtained by Monte Carlo Newton-Raphson (MCNR) [23]. This Hessian matrix estimate is convenient in optimization applications and is a crucial requirement for the new mapping $f_k$ proposed in the following section.

2.3 Proposed Mapping

An important point in implementing 2nd-SPSA is the definition of the mapping $f_k$ from $\bar{H}_k$ to $\bar{\bar{H}}_k$, since the former is often non-positive definite in practice. There are no simple and universal conditions that guarantee a matrix to be positive definite. The existence of one or more minima of a loss function, based on the physical nature of the problem, implies that its Hessian should be positive definite there. The following approach eliminates the non-positive definiteness of $\bar{H}_k$; by using the Fisher information matrix, positive definiteness can be maintained in this matrix even when the computational complexity of the real application is very high. This approach is motivated by finite-sample concerns, as discussed below. First, we compute the eigenvalues of $\bar{H}_k$ and sort them into descending order:

$$\Lambda_k \equiv \mathrm{diag}\left[ \lambda_1, \lambda_2, \ldots, \lambda_{q-1}, \lambda_q, \lambda_{q+1}, \ldots, \lambda_p \right] \tag{2.5}$$
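The recursions (2.1), (2.2b), and (2.3), together with the eigenvalue ordering in (2.5), can be illustrated with a minimal Python sketch. Several choices here are assumptions rather than part of the text: the perturbation components are drawn as Bernoulli ±1 (the text only requires independent mean-zero components), the gain sequences use the forms a/(k+1+A)^alpha and c/(k+1)^gamma with illustrative constants, and the matrix is symmetrized before the eigendecomposition so that the eigenvalues are real. All function names are hypothetical. The Hessian estimate of Sec. 2.6 and the mapping f_k itself are not implemented; `average_hessian` takes the per-iteration estimate as an input.

```python
import numpy as np


def sp_gradient(y, theta, c_k, rng):
    """SP gradient estimate of eq. (2.3) from two loss measurements."""
    # Assumption: Bernoulli +/-1 components (independent, mean zero).
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    diff = y(theta + c_k * delta) - y(theta - c_k * delta)
    # Element i is diff / (2 * c_k * delta_i).
    return diff / (2.0 * c_k * delta)


def first_order_spsa(y, theta0, n_iter, rng,
                     a=0.1, c=0.1, A=10.0, alpha=0.602, gamma=0.101):
    """1st-SPSA recursion of eq. (2.1) with assumed decaying gains."""
    theta = np.asarray(theta0, dtype=float).copy()
    for k in range(n_iter):
        a_k = a / (k + 1 + A) ** alpha  # assumed form of the gain a_k
        c_k = c / (k + 1) ** gamma      # assumed form of the gain c_k
        theta = theta - a_k * sp_gradient(y, theta, c_k, rng)
    return theta


def average_hessian(H_bar_prev, H_hat, k):
    """Averaging recursion of eq. (2.2b) for the per-iteration estimates."""
    return (k / (k + 1.0)) * H_bar_prev + (1.0 / (k + 1.0)) * H_hat


def sorted_eigen(H_bar):
    """Eigenvalues in descending order, as in eq. (2.5), with eigenvectors."""
    # Assumption: symmetrize first so that the eigenvalues are real.
    H_sym = 0.5 * (H_bar + H_bar.T)
    lam, Q = np.linalg.eigh(H_sym)  # eigh returns ascending eigenvalues
    order = np.argsort(lam)[::-1]
    return lam[order], Q[:, order]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = np.array([1.0, 1.0])
    loss = lambda th: float(np.sum((th - target) ** 2))  # noise-free toy loss
    print(first_order_spsa(loss, np.zeros(2), 2000, rng))
    print(sorted_eigen(np.diag([1.0, -2.0, 3.0]))[0])
```

A negative entry in the sorted eigenvalue vector signals that the averaged matrix is not positive definite, which is the situation the mapping f_k is meant to handle.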
Page 3 and 4: Copyright 2009 by Jorge Ivan Medina