Optimisation and Search: Gradient Descent and the Nelder-Mead Simplex Algorithm

cs.tcd.ie


Why minimise a function numerically? [Figure: inputs a and b feed a function f(a, b) producing output y; the function itself is unknown.]


Why minimise a function numerically? Background: linear regression

Straight line: f(x) = α1 x + α2

Error between f(x_i) given by the model and y_i from the data:

    ε_i(α1, α2) = f(x_i) − y_i = α1 x_i + α2 − y_i

Task: find the parameters α1 and α2 that minimise the sum of squared errors!

    E(α1, α2) = Σ_{i=1}^{N} ε_i(α1, α2)² = Σ_{i=1}^{N} (α1 x_i + α2 − y_i)²
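The sum-of-squared-errors objective is straightforward to write in code. The following is a minimal sketch; the function name and sample data are illustrative, not from the slides:

```python
def sse(alpha1, alpha2, xs, ys):
    """Sum of squared errors E(alpha1, alpha2) for the line f(x) = alpha1*x + alpha2."""
    return sum((alpha1 * x + alpha2 - y) ** 2 for x, y in zip(xs, ys))

# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

print(sse(2.0, 1.0, xs, ys))  # exact fit: error is 0.0
print(sse(1.0, 0.0, xs, ys))  # worse fit: error is 30.0
```

For a straight line this minimisation has a closed-form answer, but the same objective function reappears unchanged in the numerical methods below.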


Why minimise a function numerically? Non-linear regression

• Linear regression:
  ▫ Fitting function is linear with respect to the parameters ⇒ can be solved analytically (see Wikipedia)
• Non-linear regression:
  ▫ Fitting function is non-linear with respect to the parameters (e.g. f(x, α1, α2) = sin(α1 x) + cos(α2 x)) ⇒ often no analytical solution ⇒ numerical optimisation or direct search
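The non-linear example can be coded exactly like the linear case; only the model changes, yet no closed-form minimiser exists, which motivates the methods that follow. A sketch, with made-up data-generating parameters for illustration:

```python
import math

def model(x, a1, a2):
    # Fitting function that is non-linear in the parameters a1 and a2.
    return math.sin(a1 * x) + math.cos(a2 * x)

def sse(a1, a2, xs, ys):
    # Same sum-of-squared-errors objective as before, but there is
    # no closed-form expression for the minimising a1 and a2.
    return sum((model(x, a1, a2) - y) ** 2 for x, y in zip(xs, ys))

# Hypothetical data generated with a1 = 1.5, a2 = 0.5 (illustration only).
xs = [0.1 * i for i in range(30)]
ys = [model(x, 1.5, 0.5) for x in xs]
```

Evaluating `sse` at the true parameters gives zero; any other guess gives a larger error, and the task of the optimisers below is to find that minimum numerically.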


Gradient Descent: Example [Figure: surface plot of the error E(α1, α2) over the (α1, α2) plane.]


Gradient Descent
1. Choose initial parameters α1 and α2
2. Calculate the gradient of E
3. Step in the direction of the negative gradient with a step size proportional to the magnitude of the gradient ⇒ you get new parameters α1 and α2
4. Check if the parameters have changed by more than a certain threshold
5. If yes, go to 2; else terminate
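The five steps above can be sketched for the linear-regression error E(α1, α2). The learning rate, tolerance, and starting point below are illustrative choices, not prescribed by the slides:

```python
def gradient_descent(xs, ys, a1=0.0, a2=0.0, rate=0.01, tol=1e-9, max_iter=100_000):
    """Minimise E(a1, a2) = sum((a1*x + a2 - y)^2) by stepping against the gradient."""
    for _ in range(max_iter):
        # Step 2: gradient of E with respect to a1 and a2.
        g1 = sum(2 * (a1 * x + a2 - y) * x for x, y in zip(xs, ys))
        g2 = sum(2 * (a1 * x + a2 - y) for x, y in zip(xs, ys))
        # Step 3: move against the gradient, step size proportional to its magnitude.
        new_a1, new_a2 = a1 - rate * g1, a2 - rate * g2
        # Steps 4-5: terminate once the parameters stop changing appreciably.
        if abs(new_a1 - a1) < tol and abs(new_a2 - a2) < tol:
            return new_a1, new_a2
        a1, a2 = new_a1, new_a2
    return a1, a2
```

On data drawn from y = 2x + 1 this recovers α1 ≈ 2 and α2 ≈ 1; note that too large a `rate` makes the iteration diverge, which is the usual practical caveat with plain gradient descent.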


Gradient Descent: Example [Animation: successive gradient-descent steps on the surface of E(α1, α2), converging towards the minimum.]


Nelder-Mead Simplex Algorithm (for functions of 2 variables)
1. Pick 3 parameter combinations ⇒ a triangle (the simplex)
2. Evaluate the function for those combinations; f_h, f_s, f_l: highest, second-highest and lowest point
3. Update the triangle using the best of the transformations in the figure
4. Check for the end condition
5. Go to 2 or terminate


Nelder-Mead Algorithm: Update Rules (f_r: value at the reflected point)
• f_l ≤ f_r < f_s: accept the reflected point
• f_r < f_l: try expansion
• f_s ≤ f_r < f_h: contract (outside)
• f_r ≥ f_h: contract (inside)
• no improvement from contraction: shrink the simplex towards the best point
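The update rules can be assembled into a minimal two-variable Nelder-Mead sketch. The transformation coefficients (reflection 1, expansion 2, contraction 0.5, shrink 0.5) are standard textbook choices, not stated on the slides, and the end condition used here is one common option:

```python
def nelder_mead_2d(f, simplex, tol=1e-10, max_iter=10_000):
    """Minimise f(a1, a2) using the Nelder-Mead update rules.

    `simplex` is a list of three (a1, a2) vertices (the triangle).
    """
    pts = [list(p) for p in simplex]
    for _ in range(max_iter):
        pts.sort(key=lambda p: f(*p))            # pts[0] lowest, pts[2] highest
        f_l, f_s, f_h = (f(*p) for p in pts)
        if abs(f_h - f_l) < tol:                 # end condition: triangle has collapsed
            break
        c = [(pts[0][i] + pts[1][i]) / 2 for i in range(2)]   # centroid of best two
        r = [c[i] + (c[i] - pts[2][i]) for i in range(2)]     # reflected point
        f_r = f(*r)
        if f_l <= f_r < f_s:                     # accept the reflected point
            pts[2] = r
        elif f_r < f_l:                          # try expansion
            e = [c[i] + 2 * (c[i] - pts[2][i]) for i in range(2)]
            pts[2] = e if f(*e) < f_r else r
        else:                                    # f_r >= f_s: contract
            if f_r < f_h:                        # outside contraction
                k = [c[i] + 0.5 * (c[i] - pts[2][i]) for i in range(2)]
            else:                                # inside contraction
                k = [c[i] - 0.5 * (c[i] - pts[2][i]) for i in range(2)]
            if f(*k) < min(f_r, f_h):
                pts[2] = k
            else:                                # no improvement: shrink towards the best point
                for j in (1, 2):
                    pts[j] = [(pts[j][i] + pts[0][i]) / 2 for i in range(2)]
    return tuple(min(pts, key=lambda p: f(*p)))
```

Unlike gradient descent, this uses only function evaluations, no derivatives, which is why it suits models where the gradient is awkward or unavailable.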


Nelder-Mead Algorithm: Example [Animation: the simplex (triangle) moving across the surface of E(α1, α2), reflecting, expanding and contracting towards the minimum.]
