10.07.2015 Views

Online Learning, Regret Minimization, and Game Theory - MLSS 08

Online Learning, Regret Minimization, and Game Theory - MLSS 08

Online Learning, Regret Minimization, and Game Theory - MLSS 08

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

History <strong>and</strong> development (abridged) [Hannan’57, Blackwell’56]: Alg. with regret O((N/T) 1/2 ). Re-phrasing, need only T = O(N/ε 2 ) steps to get time-average regret down to ε. (will call this quantity T ε ) Optimal dependence on T (or ε). . <strong>Game</strong>-theoristsviewed #rows N as constant, not so important as T, sopretty much done.Why optimal in T?dest•Say world flips fair coin each day.•Any alg, in T days, has expected cost T/2.•But E[min(# heads,#tails)] = T/2 – O(T 1/2 ).•So, per-day gap is O(1/T 1/2 ).AlgorithmWorld – life - fate1 00 1

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!