28.01.2015 Views

Slides in PDF - of Marcus Hutter


Marcus Hutter - 82 - Universal Induction & Intelligence

Universal Rational Agents: Summary

• Setup: Agents acting in general probabilistic environments with reinforcement feedback.

• Assumptions: The true environment µ belongs to a known class of environments M, but is otherwise unknown.

• Results: The Bayes-optimal policy $p^\xi$ based on the Bayes-mixture $\xi = \sum_{\nu \in M} w_\nu \nu$ is Pareto-optimal, and self-optimizing if M admits self-optimizing policies.

• Application: The class of ergodic MDPs admits self-optimizing policies.

• New: Policy $p^\xi$ with unbounded effective horizon is the first purely Bayesian self-optimizing consistent policy for ergodic MDPs.

• Learn: The combined conditions $\Gamma_k < \infty$ and $\gamma_{k+1}/\gamma_k \to 1$ allow a consistent self-optimizing Bayes-optimal policy based on mixtures.
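
The two ingredients named above, the Bayes-mixture and the discount conditions, can be illustrated with a small sketch. Everything concrete in it is an assumption made for illustration, not taken from the slides: the class M is a hypothetical set of three Bernoulli environments, the prior weights are uniform, and Γ_k is taken to be the tail sum of the discounts γ_k. The sketch covers only the mixture's prediction and posterior update, not the policy $p^\xi$ itself.

```python
from fractions import Fraction

def mixture_predict(weights, thetas):
    """xi-probability that the next observation is 1: sum over nu of w_nu * nu(1)."""
    return sum(w * t for w, t in zip(weights, thetas))

def posterior_update(weights, thetas, obs):
    """Bayes rule for the mixture weights: w_nu <- w_nu * nu(obs) / xi(obs)."""
    likelihoods = [t if obs == 1 else 1 - t for t in thetas]
    evidence = sum(w * l for w, l in zip(weights, likelihoods))
    return [w * l / evidence for w, l in zip(weights, likelihoods)]

def discount_ratio(gamma, k):
    """Ratio gamma_{k+1} / gamma_k for a discount sequence given as a function."""
    return gamma(k + 1) / gamma(k)

# Hypothetical class M: three Bernoulli environments with P(obs = 1) = theta.
thetas = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
weights = [Fraction(1, 3)] * 3          # assumed uniform prior w_nu

prior_prediction = mixture_predict(weights, thetas)

# Condition the mixture on an illustrative observation sequence.
for obs in [1, 1, 0, 1]:
    weights = posterior_update(weights, thetas, obs)

posterior_prediction = mixture_predict(weights, thetas)

# Two summable discount sequences (illustrative choices):
geometric = lambda k: Fraction(9, 10) ** k   # ratio stays at 9/10, never tends to 1
quadratic = lambda k: Fraction(1, k * k)     # ratio k^2 / (k+1)^2 tends to 1

print("prior prediction:    ", prior_prediction)
print("posterior weights:   ", weights)
print("posterior prediction:", posterior_prediction)
print("geometric ratio at k=1000:", discount_ratio(geometric, 1000))
print("quadratic ratio at k=1000:", float(discount_ratio(quadratic, 1000)))
```

After four observations the posterior weight shifts toward the environment with theta = 3/4, which explains the data best. Both discount sequences have finite tail sums, but only the quadratic one has a ratio that approaches 1; geometric discounting keeps a fixed ratio and hence a bounded effective horizon.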
