12.07.2015 Views

Ding Wang 2012 Neurocomputing

Ding Wang 2012 Neurocomputing

Ding Wang 2012 Neurocomputing

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

D. <strong>Wang</strong> et al. / <strong>Neurocomputing</strong> 78 (<strong>2012</strong>) 14–22 21The cost function1008060402000 1000 2000 3000 4000 5000Iteration stepsState and reference trajectories1.510.50−0.5r 1 x 1−10 50 100 150 200 250Time stepsState and reference trajectories10.50−0.50 50 100 150 200 250Time stepsr 2 x 28u p1 u p26The tracking control trajectories−2Fig. 7. Simulation results of Example 2.420−40 50 100 150 200 250Time stepsFigs. 3 and 4, where the corresponding reference trajectories arealso plotted to evaluate the tracking performance. The trackingcontrol curves and the tracking errors are shown in Figs. 5 and 6,respectively. Besides, we can derive that the tracking error becomese 19 ¼½0:2778 10 5 20:8793 10 5 Š T after 19 time steps. Thesesimulation results verify the excellent performance of the trackingcontroller developed by the iterative ADP algorithm.5.2. Example 2The second example is obtained from [32]. Consider thenonlinear discrete-time system described as (53) where"f ðx k Þ¼0:2x #1ke x2 2k0:3x 3 ,2k0:2 0gðx k Þ¼:0 0:2The desired trajectory is set to"r k ¼sinðkþ0:5pÞ#:0:5 cos kIn the implementation of the iterative HDP algorithm, theinitial weights and structures of three networks are set the sameas Example 1. Then, for the given initial state x 0 ¼½1:5 1Š T ,wetrain the model network for 10 000 steps using 1000 data samplesunder the learning rate a m ¼ 0:05. Besides, the critic network andthe action network are trained for 5000 iterations so that thegiven error bound e ¼ 10 6 is reached. The learning rate in thetraining process is also a c ¼ a a ¼ 0:05.The convergence process of the cost function of the iterativeHDP algorithm is shown in Fig. 7(a), for k¼0. Then, we apply thetracking control law to the system for 250 time steps and obtainthe state and reference trajectories shown in Fig. 7(b) and (c).Besides, the tracking control curves are given in Fig. 7(d). It isclear from the simulation results that the iterative HDP algorithmproposed in this paper is very effective in solving the finitehorizontracking control problems.6. ConclusionAn effective method is proposed in this paper to design thefinite-horizon near-optimal tracking controller for a class ofdiscrete-time nonlinear systems. The iterative ADP algorithm isintroduced to solve the cost function of the DTHJB equation withconvergence analysis, which obtains a finite-horizon near-optimaltracking controller that makes the cost function close to itsoptimal value within an e-error bound. Three NNs are used toapproximate the cost function, the control law, and the nonlinearsystem, respectively. The simulation examples confirmed thevalidity of the tracking control approach. The strategy presentedin this paper only be true of a class of affine nonlinear systemsand requires complete knowledge of the system dynamics.Though there are many practical systems for which the approachcan be applied, it is necessary to broaden its applicability for amore general class of nonlinear systems. Consequently, our futurework includes studying the optimal tracking control problems fornon-affine nonlinear systems and model-free systems.References[1] L. Cui, H. Zhang, B. Chen, Q. Zhang, Asymptotic tracking control scheme formechanical systems with external disturbances and friction, <strong>Neurocomputing</strong>73 (2010) 1293–1302.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!