
A State-Space Representation Model and Learning Algorithm for ...


Figure 3. Simulation of the system after learning the balance control policy with POD for different initial conditions. [Plot: "Simulation of the Controlled Pendulum with POD"; Phi [deg] versus time [sec], 0 to 50 sec.]

Figure 4. Simulation of the system after learning the balance control policy with POD for different initial conditions (zoom in). [Plot: "Simulation of the Controlled Pendulum with POD"; Phi [deg] versus time [sec], 0 to 0.8 sec.]

Figure 5. Number of failures until POD derives the balance control policy. [Plot: "Number of Failures of POD in deriving the balance control policy"; Failures versus Iterations (x 10^4).]

4. CASE STUDY TWO

4.1 Vehicle Cruise Control

The POD real-time learning method introduced in the previous sections is now applied to a vehicle cruise-control problem. Cruise control automatically regulates the vehicle's longitudinal velocity by suitably adjusting the gas pedal position. A vehicle cruise-control system is activated by the driver who desires to maintain a constant speed in long highway driving. The driver activates the cruise controller while driving at a particular speed, which is then recorded as the desired or set-point speed to be maintained by the controller. The main goal in designing a cruise control algorithm is to maintain vehicle speed smoothly but accurately, even under large variation of plant parameters (e.g., the vehicle's varying mass in terms of the number of passengers) and road grade. In the case of passenger cars, however, vehicle mass may change noticeably but is within a small range. Therefore, powertrain behavior might not vary significantly.

The objective of the POD learning cruise controller is to realize in real time the control policy (gas pedal position) that maintains the vehicle speed as set by the driver under a great range of different road grades.
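As a rough illustration of the regulation task described above (not of POD itself), the following minimal sketch simulates a point-mass vehicle on a constant road grade and holds the set-point speed through the pedal position. The fixed-gain PI law is only a stand-in for a control policy, and the function name simulate_cruise, the longitudinal force model, and every numerical value (peak traction force, drag and rolling coefficients, gains) are assumptions made for illustration; none of them comes from the paper or from enDYNA.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def simulate_cruise(set_speed, mass=1500.0, grade_deg=0.0,
                    kp=0.5, ki=0.1, dt=0.05, t_end=120.0):
    """Hold set_speed (m/s) on a constant grade and return the final speed.

    All parameter values are illustrative assumptions, not paper data.
    """
    f_max = 6000.0   # assumed traction force at full pedal, N
    c_drag = 0.4     # assumed lumped aerodynamic coefficient, kg/m
    c_roll = 0.012   # assumed rolling-resistance coefficient
    theta = math.radians(grade_deg)

    v = set_speed    # the set-point is the speed at the moment of activation
    integral = 0.0
    for _ in range(int(t_end / dt)):
        error = set_speed - v
        u = kp * error + ki * integral
        pedal = min(1.0, max(0.0, u))   # pedal position limited to [0, 1]
        if 0.0 <= u <= 1.0:             # anti-windup: integrate only when unsaturated
            integral += error * dt
        # Longitudinal force balance: traction minus drag, rolling and grade loads.
        resistance = c_drag * v * v + mass * G * (
            c_roll * math.cos(theta) + math.sin(theta))
        v += dt * (f_max * pedal - resistance) / mass
    return v


if __name__ == "__main__":
    # Same set-point, different passenger load and road grade.
    for mass, grade in [(1500.0, 0.0), (1900.0, 5.0)]:
        print(mass, grade, round(simulate_cruise(30.0, mass, grade), 3))
```

The sketch models no brake, so on a sufficiently steep downhill grade the pedal saturates at zero and the speed drifts above the set-point.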
Implementing learning vehicle cruise controllers has been addressed previously employing learning and active control approaches. Zhang et al. [36] implemented learning control based on pattern recognition to regulate in real time the parameters of a PID cruise controller. Shahdi et al. [37] proposed an active learning method to extract the driver's behavior and to derive control rules for a cruise control system. However, no attempt has been reported in implementing a learning automotive vehicle cruise controller utilizing the principle of reinforcement learning, i.e., enabling the controller to improve its performance over time by learning from its own failures through a reinforcement signal from the external environment, and thus attempting to improve future performance.

The software package enDYNA by TESIS [38], suitable for real-time simulation of internal combustion engines, is used to evaluate the performance of the POD learning cruise controller. The software simulates the longitudinal vehicle dynamics with a highly variable drive train including the modules of starter, brake, clutch, converter, and transmission. In the driving mode the engine is operated by means of the usual vehicle control elements just as a driver would do. In addition, a mechanical parking lock and the uphill grade can be set. The driver model is designed to operate the vehicle at given speed profiles (driving cycles). It actuates the starter, accelerator, clutch and brake pedals according to the profile specification, and also shifts gears.
In this example, an existing vehicle model is selected, representing a midsize passenger car carrying a 1.9L turbocharged diesel engine.

When activated, the learning cruise controller bypasses the driver model and takes over the vehicle's cruising. The Markov states are defined to be the pair of the transmission gear and the

Copyright © 2007 by ASME
