Richard S. Sutton - Webdocs Cs Ualberta - University of Alberta
Research
Pioneered and made repeated contributions to reinforcement learning, an approach to
artificial and natural intelligence that emphasizes learning and planning from sample
experience. Currently seeking to extend reinforcement learning ideas to an empirically
grounded approach to knowledge representation.
Most significant contributions:
• The theory of temporal-difference learning and the TD(λ) algorithm (1988)
• The standard textbook for reinforcement learning (with Barto, 1998)
• The actor-critic (policy gradient) class of algorithms (1984, 2000)
• The Dyna architecture integrating learning, planning and reacting (1990)
• Temporal-difference models of animal learning (with Barto, 1981, 1990)
• The “options” framework for temporal abstraction (with Precup, Singh, 1999)
• Predictive state representations (with Littman, Singh, 2002)
• Algorithms for online step-size adaptation (1981, 1992)
• Temporal-difference networks for grounded knowledge representation (2005, 2006)
• Gradient temporal-difference algorithms (with Maei, Szepesvari, 2008–)
Selected Grants
NSERC Collaborative Research and Development Grant, with Nortel Networks and Bell
Canada, “Learning and Prediction in High-dimensional Stochastic Domains,” September
2006 – August 2009, funded at CAD$186,523. One of five principal investigators.
iCORE Chair and Professorship Establishment Grant, “Reinforcement Learning and
Artificial Intelligence,” September 1, 2003 – August 31, 2008, funded at CAD$3,000,000.
Principal investigator. Renewed until August 2013 at an additional CAD$2,750,000.
Alberta Ingenuity Centre Grant, “Alberta Ingenuity Centre for Machine Learning,” April
2003 – March 2008, funded at CAD$9,887,600. One of eight principal investigators.
Renewed until March 2009 at CAD$2,000,000. Renewed again in 2009 for another five
years at CAD$10,000,000.
NSERC Discovery Grant, “Reinforcement Learning and Artificial Intelligence,” April 2004
– March 2009, funded at CAD$250,000. Principal investigator. Renewed in 2009 for a
second five years at an additional CAD$190,000.
Air Force Office of Scientific Research to the University of Massachusetts, “Stochastic
Scheduling and Planning Using Reinforcement Learning,” AFOSR Grant Number
F49620-96-1-0254, June 1, 1996 – May 31, 2000, funded at USD$446,570. Co-principal
investigator with A. Barto.