10.07.2015 Views

Optimal Tracking Control for a Class of Nonlinear Discrete-Time ...

Optimal Tracking Control for a Class of Nonlinear Discrete-Time ...

Optimal Tracking Control for a Class of Nonlinear Discrete-Time ...

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

ZHANG et al.: OPTIMAL TRACKING CONTROL FOR A CLASS OF NONLINEAR DISCRETE-TIME SYSTEMS WITH TIME DELAYS BASED ON HDP 18591.511.51x 2η 2eta 20.50−0.5−1−1.5−1.5 −1 −0.5 0 0.5 1 1.5eta 1State trajectories0.50−0.5−1−1.50 10 20 30 40 50 60 70 80 90 100<strong>Time</strong> stepFig. 2.Hénon chaos orbits.Fig. 4. State variable trajectory x 2 and desired trajectory η 2 .State trajectories1.510.50−0.5−1−1.50 10 20 30 40 50 60 70 80 90 100<strong>Time</strong> stepx 1η 1Per<strong>for</strong>mance index function0.450.40.350.30.250.20.150.10.0500 5 10 15 20 25 30 35 40 45 50Iteration stepFig. 3. State variable trajectory x 1 and desired trajectory η 1 .Fig. 5.Convergence <strong>of</strong> per<strong>for</strong>mance index.A. Example 1Consider the following nonlinear time delay system whichis the example in [54] and [37] with modification:wherex(k + 1) = f (x(k), x(k − 1), x(k − 2))+ g(x(k), x(k − 1), x(k − 2))u(k)x(k) = ε 1 (k), −2 ≤ k ≤ 0 (76)f (x(k), x(k − 1), x(k − 2))[ 0.2x1 (k) exp (x=2 (k)) 2 ]x 2 (k − 2)0.3 (x 2 (k)) 2 x 1 (k − 1)and[ ]−0.2 0g(x(k), x(k − 1), x(k − 2)) =.0 −0.2The desired signal is well-known Hénon mapping as follows:[ ]η1 (k+1)η 2 (k+1)[1+b×η2 (k)−a×η1 2(k)η 1 (k)[ ]η1 (0)η 2 (0)=](77)where a = 1.4, b = 0.3,orbits are given as Fig. 2.Based on the implementation <strong>of</strong> proposed HDP algorithmin Section III-C, we first give the initial states as ε 1 (−2) =ε 1 (−1) = ε 1 (0) = [ 0.5 −0.5 ] T , and the initial control policyas β(k) = −2x(k). The implementation <strong>of</strong> the algorithmis at the time instant k = 3. The maximal iteration step= [ 0.1−0.1]. The chaotic signalError trajectories1.510.50−0.5−1−1.5−20 10 20 30 40 50 60 70 80 90 100<strong>Time</strong> stepFig. 6. <strong>Tracking</strong> error trajectories e 1 and e 2 .i max is 50. We choose three-layer BP neural networks asthe critic network and the action network with the structure2 − 8 − 1 and 6− 8 − 2, respectively. The iteration time<strong>of</strong> the weights updating <strong>for</strong> two neural networks is 100.The initial weights are chosen randomly from [−0.1, 0.1],and the learning rate is α a = α c = 0.05. We selectQ = R = I 2 .The state trajectories are given as Figs. 3 and 4. The solidlines in the two figures are the system states, and the dashedlines are the trajectories <strong>of</strong> the desired chaotic signal. Fromthe two figures, we can see that the state trajectories <strong>of</strong> (76)follow the chaotic signal. From Lemma 2, Theorems 1 and 3,the per<strong>for</strong>mance index function sequence is bounded ande 1e 2

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!