20.03.2015 Views

The Epoch-Greedy Algorithm for Contextual Multi-armed Bandits

The Epoch-Greedy Algorithm for Contextual Multi-armed Bandits

The Epoch-Greedy Algorithm for Contextual Multi-armed Bandits

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

<strong>Epoch</strong>-<strong>Greedy</strong> <strong>Algorithm</strong><br />

• Do a single step of exploration<br />

– Begin creating an unbiased vector of inputs to<br />

create the hypotheses<br />

– Observe context in<strong>for</strong>mation<br />

• Add the learned in<strong>for</strong>mation to past<br />

exploration and create a new hypotheses<br />

– This uses the contextual data and exploration<br />

• For a set number of steps exploit the<br />

hypotheses arm

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!