Chapter 2 Introduction to Neural Networks
Assume that we have a sequence of input vectors and a corresponding sequence of target (desired) scalars

$(\bar{x}_1, t_1), (\bar{x}_2, t_2), \cdots, (\bar{x}_N, t_N)$

We wish to find the weights of a neuron with a non-linear function $f(\cdot)$ so that we can minimize the squared difference between the output $y_n$ and the target $t_n$, i.e.

$\min E_n = \min \frac{1}{2}(t_n - y_n)^2 = \min \frac{1}{2} e_n^2, \quad n = 1, 2, \cdots, N$
We will use the steepest descent approach

$\bar{w}^{(n+1)} = \bar{w}^{(n)} - \alpha \nabla_w E$

We need to find $\nabla_w E$!
$\nabla_w E_n = \nabla_w \underbrace{\frac{1}{2}\Big(\underbrace{t_n - \underbrace{f(\underbrace{\bar{w}^T \bar{x}_n}_{u})}_{f(u)}}_{h(f)}\Big)^2}_{g(h)}$
The chain rule gives $\nabla_w g(h(f(u))) = \nabla_h g \, \nabla_f h \, \nabla_u f \, \nabla_w u$, where

$\nabla_h g = h = t_n - f(u) = e_n$

$\nabla_f h = -1$

$\nabla_u f$ depends on the nonlinearity $f(\cdot)$ we choose

$\nabla_w u = \bar{x}_n$

$\Rightarrow \bar{w}^{(n+1)} = \bar{w}^{(n)} + \alpha e_n \nabla_u f \, \bar{x}_n$
since $f(\cdot)$ is a function $f : \mathbb{R} \to \mathbb{R}$, i.e. one-dimensional. We can write $\nabla_u f = \frac{df}{du}$.
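The chain-rule gradient above can be sanity-checked numerically against a finite-difference approximation. A minimal sketch, assuming $f = \tanh$ (so $\frac{df}{du} = 1 - \tanh^2(u)$) and arbitrary illustrative sample values:

```python
import numpy as np

# Assumed nonlinearity for illustration: f = tanh, df/du = 1 - tanh(u)^2.
f = np.tanh
df = lambda u: 1.0 - np.tanh(u) ** 2

def E(w, x, t):
    """Squared error E_n = 1/2 (t_n - f(w^T x_n))^2."""
    return 0.5 * (t - f(w @ x)) ** 2

rng = np.random.default_rng(0)
w = rng.standard_normal(3)   # weights
x = rng.standard_normal(3)   # input vector x_n
t = 0.7                      # target t_n (arbitrary)

# Analytic gradient from the chain rule: grad_w E_n = -e_n * (df/du) * x_n
u = w @ x
e = t - f(u)
grad_analytic = -e * df(u) * x

# Central finite differences in each coordinate of w
eps = 1e-6
grad_fd = np.array([
    (E(w + eps * np.eye(3)[i], x, t) - E(w - eps * np.eye(3)[i], x, t)) / (2 * eps)
    for i in range(3)
])

assert np.allclose(grad_analytic, grad_fd, atol=1e-8)
```

Note the minus sign: the gradient points uphill, which is why the update rule below adds $+\alpha e_n \frac{df}{du} \bar{x}_n$.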
$\Rightarrow$ The neuron learning rule for a general function $f(\cdot)$ is

$\bar{w}^{(n+1)} = \bar{w}^{(n)} + \alpha e_n \frac{df}{du} \bar{x}_n$

where $u = \bar{w}^T \bar{x}$ and $\alpha$ is the step size.
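The learning rule can be run directly as an online training loop. A minimal sketch, again assuming $f = \tanh$; the teacher weights, data size, and step size are illustrative assumptions, not values from the text:

```python
import numpy as np

# Learning rule: w <- w + alpha * e_n * (df/du) * x_n, with f = tanh assumed.
f = np.tanh
df = lambda u: 1.0 - np.tanh(u) ** 2

rng = np.random.default_rng(1)
w_true = np.array([0.5, -1.0, 0.25])   # hypothetical "teacher" weights
X = rng.standard_normal((200, 3))      # input vectors x_n
T = f(X @ w_true)                      # realizable targets t_n = f(w_true^T x_n)

w = np.zeros(3)
alpha = 0.1                            # step size
for epoch in range(50):
    for x, t in zip(X, T):
        u = w @ x                      # u = w^T x
        e = t - f(u)                   # error e_n = t_n - y_n
        w = w + alpha * e * df(u) * x  # steepest-descent update

print(np.round(w, 2))                  # should be close to w_true
```

Because the targets are generated by the same nonlinearity, the trained weights recover the teacher's weights; with noisy or non-realizable targets the rule converges to a local minimum of the squared error instead.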
OBS! If $f(u) = u$, that is, a linear function with slope 1 (so $\frac{df}{du} = 1$), the above algorithm becomes the LMS algorithm for a linear neuron.
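This LMS special case can be sketched the same way: with $f(u) = u$ the derivative term drops out and the update is simply $\bar{w}^{(n+1)} = \bar{w}^{(n)} + \alpha e_n \bar{x}_n$. The data and step size below are illustrative assumptions:

```python
import numpy as np

# LMS: f(u) = u, df/du = 1, so the update is w <- w + alpha * e_n * x_n.
rng = np.random.default_rng(2)
w_true = np.array([1.0, -2.0])         # hypothetical true linear weights
X = rng.standard_normal((500, 2))      # input vectors x_n
T = X @ w_true                         # noise-free linear targets t_n

w = np.zeros(2)
alpha = 0.05                           # step size
for x, t in zip(X, T):
    e = t - w @ x                      # e_n = t_n - y_n, with y_n = w^T x_n
    w = w + alpha * e * x              # LMS update

print(np.round(w, 3))                  # converges toward w_true
```

With noise-free targets the LMS iterates contract toward the true weights; with noisy targets they fluctuate around them, with a variance controlled by the step size $\alpha$.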