Slides in PDF - of Marcus Hutter

Marcus Hutter - 100 - Universal Induction & Intelligence

Map Real Problem to MDP

Map history $h_t := o_1 a_1 r_1 \ldots o_{t-1}$ to state $s_t := \Phi(h_t)$, for example:

• Games: full information with static opponent: $\Phi(h_t) = o_t$.
• Classical physics: position + velocity of objects = position at two time-slices: $s_t = \Phi(h_t) = o_t o_{t-1}$ is (2nd order) Markov.
• I.i.d. processes of unknown probability (e.g. clinical trials ≃ bandits): the frequency of observations $\Phi(h_n) = \left(\sum_{t=1}^{n} \delta_{o_t o}\right)_{o \in O}$ is a sufficient statistic.
• Identity: $\Phi(h) = h$ is always sufficient, but not learnable.

Find/Learn Map Automatically

$\Phi^{best} := \arg\min_{\Phi} \mathrm{Cost}(\Phi | h_t)$

• What is the best map/MDP (i.e. what is the right Cost criterion)?
• Is the best MDP good enough (i.e. is reduction always possible)?
• How to find the map Φ (i.e. minimize Cost) efficiently?
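The example maps listed on this slide can be written down as small functions. The following is a minimal Python sketch, not code from the slides: it assumes a history is stored as a list of (observation, action, reward) triples, oldest first, and the function names (`phi_last_obs`, `phi_two_obs`, `phi_counts`, `phi_identity`) are hypothetical. The Cost criterion is not defined on the slide, so it is left out.

```python
from collections import Counter

# Assumption for illustration: a history is a list of
# (observation, action, reward) triples, oldest first.

def phi_last_obs(history):
    """Phi(h_t) = o_t: the most recent observation is the state."""
    return history[-1][0]

def phi_two_obs(history):
    """Phi(h_t) = o_t o_{t-1}: the last two observations (2nd-order Markov)."""
    return (history[-1][0], history[-2][0])

def phi_counts(history, alphabet):
    """Phi(h_n) = (sum_t delta_{o_t o})_{o in O}: how often each observation occurred."""
    counts = Counter(o for o, _a, _r in history)
    return tuple(counts[o] for o in alphabet)

def phi_identity(history):
    """Phi(h) = h: the whole history is the state, so no state ever repeats."""
    return tuple(history)

history = [("x", 0, 0.0), ("y", 1, 1.0), ("x", 0, 0.0)]
print(phi_last_obs(history))             # x
print(phi_two_obs(history))              # ('x', 'y')
print(phi_counts(history, ("x", "y")))   # (2, 1)
```

The identity map shows why sufficiency alone is not enough: every history maps to a distinct state, so nothing can be estimated from repeated visits.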
ΦMDP: Computational Flow

[Figure: flow diagram forming a loop. Environment → (reward r, observation o) → History h → (Cost(Φ|h) minimization) → Feature Vec. Φ̂ → (frequency estimate) → Transition Pr. T̂, Reward est. R̂ → (exploration bonus) → T̂^e, R̂^e → (Bellman) → Value (Q̂) → (implicit) → Best Policy p̂ → (action a) → Environment.]
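Two of the steps named in this flow, the frequency estimate and the Bellman step, can be illustrated with textbook methods. The sketch below is an assumption-laden illustration, not the implementation behind the slides: it takes a hypothetical list of (s, a, r, s_next) tuples already produced by some map Φ, estimates T̂ and R̂ by relative frequencies, and runs standard Q-value iteration with an assumed discount `gamma`. The exploration bonus and the Cost minimization are omitted because the slide does not specify them.

```python
from collections import defaultdict

def frequency_estimates(transitions):
    """Estimate T_hat[(s, a)][s_next] and R_hat[(s, a)] by relative frequencies."""
    counts = defaultdict(lambda: defaultdict(int))
    reward_sum = defaultdict(float)
    for s, a, r, s_next in transitions:
        counts[(s, a)][s_next] += 1
        reward_sum[(s, a)] += r
    T_hat, R_hat = {}, {}
    for sa, nexts in counts.items():
        n = sum(nexts.values())
        T_hat[sa] = {s2: c / n for s2, c in nexts.items()}
        R_hat[sa] = reward_sum[sa] / n
    return T_hat, R_hat

def q_iteration(T_hat, R_hat, gamma=0.9, sweeps=200):
    """Repeatedly apply the Bellman optimality update on the estimated MDP."""
    Q = {sa: 0.0 for sa in T_hat}
    for _ in range(sweeps):
        # V[s] = best Q-value over the actions observed in state s;
        # states never seen as a source default to value 0.
        V = defaultdict(float)
        for (s, _a), q in Q.items():
            V[s] = max(V[s], q) if s in V else q
        Q = {sa: R_hat[sa] + gamma * sum(p * V[s2] for s2, p in T_hat[sa].items())
             for sa in T_hat}
    return Q

def greedy_policy(Q):
    """Pick, for each state, the observed action with the largest Q-value."""
    best = {}
    for (s, a), q in Q.items():
        if s not in best or q > Q[(s, best[s])]:
            best[s] = a
    return best

# Toy data (made up for illustration): staying in state 1 pays reward 1.
transitions = [
    (0, "go", 0.0, 1), (1, "stay", 1.0, 1), (1, "stay", 1.0, 1),
    (0, "stay", 0.0, 0), (1, "go", 0.0, 0),
]
T_hat, R_hat = frequency_estimates(transitions)
Q = q_iteration(T_hat, R_hat)
print(greedy_policy(Q))
```

On this toy data the greedy policy moves from state 0 to state 1 and then stays there, which is the behaviour the rewards suggest.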
