Slides in PDF - of Marcus Hutter
Marcus Hutter - 82 - Universal Induction & Intelligence
Universal Rational Agents: Summary
• Setup: Agents acting in general probabilistic environments with
reinforcement feedback.
• Assumptions: The true environment µ belongs to a known class of
environments M, but is otherwise unknown.
• Results: The Bayes-optimal policy $p^\xi$ based on the Bayes mixture
$\xi = \sum_{\nu \in M} w_\nu \nu$ is Pareto-optimal, and is self-optimizing if M admits
self-optimizing policies.
• Application: The class of ergodic MDPs admits self-optimizing
policies.
• New: Policy $p^\xi$ with unbounded effective horizon is the first purely
Bayesian self-optimizing consistent policy for ergodic MDPs.
• Learn: The combined conditions $\Gamma_k < \infty$ and $\gamma_{k+1}/\gamma_k \to 1$ allow a
consistent self-optimizing Bayes-optimal policy based on mixtures.
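
The Bayes mixture in the Results bullet can be illustrated with a minimal sketch. It covers only the mixture ξ and the posterior update of the weights w_ν for passive binary prediction. It does not cover the acting policy p^ξ or any reward feedback. The three-element class of Bernoulli environments, the uniform prior, and the fixed observation sequence are all hypothetical choices made for this example.

```python
from fractions import Fraction

def likelihood(p, x):
    """Probability that a Bernoulli environment with P(x=1)=p emits x."""
    return p if x == 1 else 1 - p

def mixture_prob(weights, envs, x):
    """xi(x) = sum over nu in M of w_nu * nu(x), for one binary observation."""
    return sum(w * likelihood(p, x) for w, p in zip(weights, envs))

def posterior(weights, envs, x):
    """Bayes update: w_nu <- w_nu * nu(x) / xi(x)."""
    xi = mixture_prob(weights, envs, x)
    return [w * likelihood(p, x) / xi for w, p in zip(weights, envs)]

# Hypothetical class M: three Bernoulli environments, each given by P(x=1).
envs = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
prior = [Fraction(1, 3)] * 3          # uniform prior weights w_nu

# Fixed (not sampled) observations: six ones and two zeros.
observations = [1, 1, 0, 1, 1, 1, 0, 1]

weights = list(prior)
for x in observations:
    weights = posterior(weights, envs, x)

print([str(w) for w in weights])
```

Exact fractions keep the weights summing to one without rounding error. After these observations, the weight of the environment with P(x=1) = 3/4 is the largest, which is the concentration of the posterior that mixture-based policies rely on.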
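
The two conditions in the Learn bullet can be checked on concrete discount sequences. The worked example below assumes the usual definition $\Gamma_k := \sum_{i=k}^{\infty} \gamma_i$, which the slide itself does not state, and compares a quadratic discount with a geometric one (both chosen here purely for illustration).

```latex
% Assumption: \Gamma_k := \sum_{i=k}^{\infty} \gamma_i (not defined on the slide).
\begin{align*}
\text{Quadratic, } \gamma_k = k^{-2}: \quad
  & \Gamma_k = \sum_{i=k}^{\infty} \frac{1}{i^2}
    \le \sum_{i=k}^{\infty} \Big( \frac{1}{i-1} - \frac{1}{i} \Big)
    = \frac{1}{k-1} < \infty \quad (k \ge 2), \\
  & \frac{\gamma_{k+1}}{\gamma_k} = \Big( \frac{k}{k+1} \Big)^{2} \to 1. \\[4pt]
\text{Geometric, } \gamma_k = \gamma^k,\ 0 < \gamma < 1: \quad
  & \Gamma_k = \frac{\gamma^k}{1-\gamma} < \infty, \\
  & \frac{\gamma_{k+1}}{\gamma_k} = \gamma \not\to 1.
\end{align*}
```

Under this assumption, the quadratic discount satisfies both conditions, while the geometric discount satisfies only the first. This matches the slide's emphasis on an unbounded effective horizon, since a geometric discount keeps the horizon bounded at roughly $1/(1-\gamma)$.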