
Marcus Hutter - 99 - Universal Induction & Intelligence

Markov Decision Processes (MDPs)

a computationally tractable class of problems

• MDP Assumption: State s_t := o_t and r_t are probabilistic functions of o_{t-1} and a_{t-1} only.

• Further Assumption: State = observation space S is finite and small.

Example MDP

[Figure: four states s1, s2, s3, s4 connected by transitions in a cycle, with associated rewards r1, r2, r3, r4.]
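To make the example concrete, below is a minimal tabular sketch of such a four-state MDP in Python. Only the state names s1-s4 and the reward labels r1-r4 come from the figure; the action set {a, b}, the transition probabilities, and the numeric reward values (1.0-4.0 standing in for r1-r4) are hypothetical, since the slide does not specify them. The table P directly expresses the MDP assumption: the next state and reward depend only on the previous state and action.

```python
import random

# Hypothetical tabular MDP with the four states of the figure.
# P[(state, action)] lists (probability, next_state, reward) triples,
# so next state and reward depend on the previous state and action only.
states  = ["s1", "s2", "s3", "s4"]
actions = ["a", "b"]                      # hypothetical action set
P = {
    ("s1", "a"): [(0.8, "s3", 2.0), (0.2, "s1", 1.0)],
    ("s1", "b"): [(1.0, "s2", 3.0)],
    ("s2", "a"): [(1.0, "s4", 4.0)],
    ("s2", "b"): [(0.5, "s2", 3.0), (0.5, "s1", 1.0)],
    ("s3", "a"): [(1.0, "s2", 3.0)],
    ("s3", "b"): [(0.9, "s4", 4.0), (0.1, "s3", 2.0)],
    ("s4", "a"): [(1.0, "s1", 1.0)],
    ("s4", "b"): [(0.7, "s3", 2.0), (0.3, "s4", 4.0)],
}

def step(state, action):
    """Sample (next_state, reward) from the transition table P."""
    x, acc = random.random(), 0.0
    for prob, next_state, reward in P[(state, action)]:
        acc += prob
        if x < acc:
            return next_state, reward
    return next_state, reward             # fall back to the last outcome
```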

• Goal: Maximize long-term expected reward.

• Learning: Probability distribution is unknown but can be learned (see the sketch after this list).

• Exploration: Optimal exploration is intractable, but there are polynomial approximations.

• Problem: Real problems are not of this simple form.
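To illustrate the goal, learning, and exploration points together, here is a sketch of tabular Q-learning with epsilon-greedy exploration, reusing the hypothetical P, states, actions, and step from the sketch above: the agent aims to maximize discounted long-term reward, learns purely from sampled transitions without ever reading the transition probabilities, and explores by occasionally acting at random.

```python
import random

def q_learning(episodes=5000, horizon=50, alpha=0.1, gamma=0.95, eps=0.1):
    """Tabular Q-learning with epsilon-greedy exploration.

    Only sampled transitions from step() are used, never P itself,
    so the method still applies when the transition probabilities
    are unknown and must be learned from experience.
    """
    Q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(episodes):
        s = random.choice(states)
        for _ in range(horizon):
            # epsilon-greedy: explore with probability eps, else act greedily
            if random.random() < eps:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda a_: Q[(s, a_)])
            s_next, r = step(s, a)
            # TD update towards reward plus discounted value of next state
            best_next = max(Q[(s_next, a_)] for a_ in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s_next
    return Q

Q = q_learning()
greedy_policy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in states}
print(greedy_policy)
```

Epsilon-greedy is used here only because it is simple; it is not one of the polynomial near-optimal exploration schemes the slide alludes to, and the hyperparameters above are arbitrary.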
