
Chapter 2 Introduction to Neural Networks


$$\bar W^{[3]} = \frac{\operatorname{rand}(p,\, k+1) - 0.5}{k+1}$$

where "rand" is uniformly distributed in $[0, 1]$.
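A minimal NumPy sketch of this initialization: the shift of rand to $[-0.5, 0.5]$ and the scaling by the fan-in plus bias follow the formula above; the layer sizes $n, m, k, p$ are illustrative assumptions, not values from the text.

```python
import numpy as np

def init_layer(rows, fan_in):
    # Uniform in [-0.5, 0.5], scaled by fan_in + 1 (inputs plus bias),
    # matching (rand(rows, fan_in + 1) - 0.5) / (fan_in + 1).
    return (np.random.rand(rows, fan_in + 1) - 0.5) / (fan_in + 1)

# Illustrative sizes (assumed): n inputs, m and k hidden units, p outputs.
n, m, k, p = 4, 8, 6, 2
W1 = init_layer(m, n)   # bar-W[1]: m x (n + 1)
W2 = init_layer(k, m)   # bar-W[2]: k x (m + 1)
W3 = init_layer(p, k)   # bar-W[3]: p x (k + 1)
```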

Step 2. Pick an input-target pair at random from the training set, say $(x_i, t_i)$. Calculate the output when $x_i$ is the input according to

$$y_i = H\!\left[\bar W^{[3]} \cdot G\!\left[\bar W^{[2]} \cdot F\!\left[\bar W^{[1]} \bar x_i\right]\right]\right]$$

where

$$F = \left[f_1(u_1^{[1]}),\, f_2(u_2^{[1]}),\, \dots,\, f_m(u_m^{[1]})\right]^T$$

$$G = \left[g_1(u_1^{[2]}),\, g_2(u_2^{[2]}),\, \dots,\, g_k(u_k^{[2]})\right]^T$$

$$H = \left[h_1(u_1^{[3]}),\, h_2(u_2^{[3]}),\, \dots,\, h_p(u_p^{[3]})\right]^T$$

All functions are chosen in advance.
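A minimal sketch of this forward pass, reusing the weights from the sketch above. Since the text only says the activation functions are chosen in advance, tanh is an assumed stand-in for $F$, $G$, and $H$; extend() builds the extended (bias-augmented) vectors $\bar x$, $\bar o^{[1]}$, $\bar o^{[2]}$.

```python
def extend(v):
    # Append the constant bias input 1 to form the extended vector.
    return np.concatenate([v, [1.0]])

f = g = h = np.tanh   # assumed choice; the text only fixes F, G, H in advance

def forward(x, W1, W2, W3):
    u1 = W1 @ extend(x)    # layer-1 net input u[1]
    o1 = f(u1)             # F applied elementwise
    u2 = W2 @ extend(o1)   # layer-2 net input u[2]
    o2 = g(u2)             # G
    u3 = W3 @ extend(o2)   # layer-3 net input u[3]
    y = h(u3)              # network output y_i = H[...]
    return y, (x, u1, o1, u2, o2, u3)
```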

Step 3. Find the weight corrections for each layer. First define the error vector $e_i = t_i - y_i$.

Define the local error vectors $\delta^{[s]}$, $s = 1, 2, 3$:

$$\delta^{[3]} = \operatorname{diag}(e)\, \frac{\partial H}{\partial u^{[3]}}$$

$$\triangle \bar W^{[3]} = \alpha\, \delta^{[3]} \left(\bar o^{[2]}\right)^T, \qquad \alpha \text{ is the step size.}$$

OBS! $\bar o^{[2]}$ is the extended vector!

$$\delta^{[2]} = \operatorname{diag}\!\left(\left(W^{[3]}\right)^T \delta^{[3]}\right) \frac{\partial G}{\partial u^{[2]}}$$

OBS! $W^{[3]}$ is without the biases!

$$\triangle \bar W^{[2]} = \alpha\, \delta^{[2]} \left(\bar o^{[1]}\right)^T$$

$$\delta^{[1]} = \operatorname{diag}\!\left(\left(W^{[2]}\right)^T \delta^{[2]}\right) \frac{\partial F}{\partial u^{[1]}}$$

$$\triangle \bar W^{[1]} = \alpha\, \delta^{[1]}\, \bar x^T$$
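Putting Step 3 into code, a minimal sketch of one stochastic update, assuming the tanh activations and the cached layer values from the forward-pass sketch above (so $\partial/\partial u$ of each activation is $1 - \tanh(u)^2$). Slicing off the last column with W[:, :-1] drops the bias weights, as the OBS! note requires.

```python
def backprop_update(t, y, cache, W1, W2, W3, alpha=0.1):
    x, u1, o1, u2, o2, u3 = cache
    e = t - y                                  # error vector e_i = t_i - y_i
    dtanh = lambda u: 1.0 - np.tanh(u) ** 2    # derivative of the assumed tanh

    d3 = e * dtanh(u3)                         # delta[3] = diag(e) dH/du[3]
    d2 = (W3[:, :-1].T @ d3) * dtanh(u2)       # delta[2], W[3] without biases
    d1 = (W2[:, :-1].T @ d2) * dtanh(u1)       # delta[1], W[2] without biases

    W3 += alpha * np.outer(d3, extend(o2))     # dW[3] = alpha d3 (bar-o[2])^T
    W2 += alpha * np.outer(d2, extend(o1))     # dW[2] = alpha d2 (bar-o[1])^T
    W1 += alpha * np.outer(d1, extend(x))      # dW[1] = alpha d1 bar-x^T
    return W1, W2, W3
```

One training iteration then reads: y, cache = forward(x_i, W1, W2, W3) followed by W1, W2, W3 = backprop_update(t_i, y, cache, W1, W2, W3).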
