The Epoch-Greedy Algorithm for Contextual Multi-armed Bandits
Minimizing Regret
• No exploration: regret grows linearly in T
• All exploration (no exploitation): regret grows linearly in T
• The minimum lies somewhere between those two extremes
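The trade-off in the bullets above can be illustrated with a toy calculation. This is only a sketch under assumptions that are not in the slides: n is taken to mean the number of exploration steps out of T, each exploration step is charged worst-case regret 1, and each exploitation step is charged min(1, c / sqrt(n)), a generic learning-rate shape chosen for illustration. The names `toy_regret` and `best_split` are hypothetical.

```python
import math


def toy_regret(n, T, c=1.0):
    """Toy regret model (an assumption, not the Epoch-Greedy bound itself).

    n exploration steps cost 1 each; the remaining T - n exploitation
    steps cost min(1, c / sqrt(n)) each. With n == 0 nothing has been
    learned, so every exploitation step is charged the full cost of 1.
    """
    if not 0 <= n <= T:
        raise ValueError("n must lie in [0, T]")
    per_step_exploit = 1.0 if n == 0 else min(1.0, c / math.sqrt(n))
    return n + (T - n) * per_step_exploit


def best_split(T, c=1.0):
    """Brute-force search for the n in [0, T] that minimizes toy_regret."""
    candidates = ((n, toy_regret(n, T, c)) for n in range(T + 1))
    return min(candidates, key=lambda pair: pair[1])


T = 1000
n_star, r_star = best_split(T)
print("no exploration :", toy_regret(0, T))
print("all exploration:", toy_regret(T, T))
print("best n, regret :", n_star, r_star)
```

Under this model both extremes give regret T, while an interior choice of n gives strictly less, which is the shape the slide describes.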
[Figure: three plots of Regret against T, with n marked on the axis]