
Richard S. Sutton - Webdocs Cs Ualberta - University of Alberta

Research

Pioneered and made repeated contributions to reinforcement learning, an approach to artificial and natural intelligence that emphasizes learning and planning from sample experience. Currently seeking to extend reinforcement learning ideas to an empirically grounded approach to knowledge representation.

Most significant contributions:

• The theory of temporal-difference learning and the TD(λ) algorithm (1988)
• The standard textbook for reinforcement learning (with Barto, 1998)
• The actor-critic (policy gradient) class of algorithms (1984, 2000)
• The Dyna architecture integrating learning, planning and reacting (1990)
• Temporal-difference models of animal learning (with Barto, 1981, 1990)
• The “options” framework for temporal abstraction (with Precup, Singh, 1999)
• Predictive state representations (with Littman, Singh, 2002)
• Algorithms for online step-size adaptation (1981, 1992)
• Temporal-difference networks for grounded knowledge representation (2005, 2006)
• Gradient temporal-difference algorithms (with Maei, Szepesvari, 2008–)

Selected Grants

NSERC Collaborative Research and Development Grant, with Nortel Networks and Bell Canada, “Learning and Prediction in High-dimensional Stochastic Domains,” September 2006 – August 2009, funded at CAD$186,523. One of five principal investigators.

iCORE Chair and Professorship Establishment Grant, “Reinforcement Learning and Artificial Intelligence,” September 1, 2003 – August 31, 2008, funded at CAD$3,000,000. Principal investigator. Renewed until August 2013 at an additional CAD$2,750,000.

Alberta Ingenuity Centre Grant, “Alberta Ingenuity Centre for Machine Learning,” April 2003 – March 2008, funded at CAD$9,887,600. One of eight principal investigators. Renewed until March 2009 at CAD$2,000,000. Renewed again in 2009 for another five years at CAD$10,000,000.

NSERC Discovery Grant, “Reinforcement Learning and Artificial Intelligence,” April 2004 – March 2009, funded at CAD$250,000. Principal investigator. Renewed in 2009 for a second five years at an additional CAD$190,000.

Air Force Office of Scientific Research to the University of Massachusetts, “Stochastic Scheduling and Planning Using Reinforcement Learning,” AFOSR Grant Number F49620-96-1-0254, June 1, 1996 – May 31, 2000, funded at USD$446,570. Coprincipal investigator with A. Barto.
