12.07.2015

s1 s2 s3 s4 s5 - of Marcus Hutter


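Analysis

◮ Key component is bounding $|V^A_{\hat M}(s_{1:t}) - V^A_M(s_{1:t})| < \epsilon$
◮ Require each state to be visited sufficiently often
◮ States that are expected to be visited often need better estimates

Error, where $w(s)$ is the discounted future state distribution, $\sigma^2(s) := \operatorname{Var} V(s'|s)$, $L := \log 1/\delta$ and $m := n(s)/w(s)$:

\begin{align*}
V^A_{\hat M}(s_t) - V^A_M(s_t) &= \gamma \sum_s w(s)\,(p_s - \hat p_s) \cdot \hat V \\
&\le \sum_s w(s) \sqrt{\frac{|S|\,\sigma^2(s)\,L}{n(s)}} && \text{(Bernstein's)} \\
&= \sum_s \sqrt{\frac{L|S|\,w(s)\,\sigma^2(s)}{m}} \\
&\le \sqrt{\frac{L|S|^2 \sum_s w(s)\,\sigma^2(s)}{m}} \\
&\lesssim \sqrt{\frac{L|S|^2}{m(1-\gamma)^2}}
\end{align*}

The last step uses $\sum_s w(s)\,\sigma^2(s) \approx \operatorname{Var} \sum_{k=t}^{\infty} \gamma^{k-t} r_k \le \frac{1}{(1-\gamma)^2}$.

Therefore $n(s) \approx \frac{w(s)\,L|S|^2}{\epsilon^2 (1-\gamma)^2}$ visits to state $s$ are needed.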
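The final bound can be turned into a rough per-state visit budget. The sketch below is only an illustration under stated assumptions: the function names `required_visits` and `value_error_bound` are hypothetical (not from the slides), all constants hidden by the "≈" are dropped, and $w(s)$ is taken as a given number rather than estimated.

```python
import math


def _check(num_states, gamma, delta):
    # Basic sanity checks on the parameters shared by both helpers.
    if num_states <= 0:
        raise ValueError("num_states must be positive")
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie in (0, 1)")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")


def value_error_bound(m, num_states, gamma, delta):
    """Approximate error bound sqrt(L |S|^2 / (m (1 - gamma)^2)), L = log(1/delta).

    Constants are dropped, so this is an order-of-magnitude estimate only.
    """
    _check(num_states, gamma, delta)
    if m <= 0:
        raise ValueError("m must be positive")
    L = math.log(1.0 / delta)
    return math.sqrt(L * num_states**2 / (m * (1.0 - gamma) ** 2))


def required_visits(w_s, num_states, epsilon, gamma, delta):
    """Approximate visits n(s) = w(s) L |S|^2 / (epsilon^2 (1 - gamma)^2).

    w_s is the (assumed known) discounted future state weight of state s.
    """
    _check(num_states, gamma, delta)
    if w_s < 0 or epsilon <= 0:
        raise ValueError("w_s must be non-negative and epsilon positive")
    L = math.log(1.0 / delta)
    return w_s * L * num_states**2 / (epsilon**2 * (1.0 - gamma) ** 2)


if __name__ == "__main__":
    # Hypothetical numbers chosen purely for illustration.
    n = required_visits(w_s=0.5, num_states=10, epsilon=0.1, gamma=0.9, delta=0.05)
    print(f"approximate visits needed: {n:.0f}")
```

Plugging $m = n(s)/w(s)$ back into `value_error_bound` recovers $\epsilon$, which is a quick consistency check of the derivation. The budget scales linearly in $w(s)$, matching the remark that states expected to be visited often need better estimates.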
