Infinite-Horizon Average Reward Markov Decision Processes
Infinite-Horizon Average Reward Markov Decision Processes
Infinite-Horizon Average Reward Markov Decision Processes
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
OutlineThe average rewardClassification of MDPsOptimality equationsValue iteration in unichain modelsPolicy iteration in unichain modelsLinear Programming in unichain modelsDan Zhang, Spring 2012 <strong>Infinite</strong> <strong>Horizon</strong> <strong>Average</strong> <strong>Reward</strong> MDP 2