15.12.2012 Views

Bayesian Programming and Learning for Multi-Player Video Games ...


7.6.3 Possible uses

We recall that we used this model for opening prediction, as a proxy for timing attacks and
aggressiveness. It can also be used:

• for build tree suggestion when wanting to achieve a particular opening. In particular, one
does not have to encode all the openings into a finite state machine: simply train this
model and then query $P(BT \mid time, opening, \lambda = 1)$ to get a distribution over the build trees
that are generally used to achieve this opening.

• as a commentary assistant AI. In the StarCraft and StarCraft 2 communities, there are
many progamer tournaments with commentary, and we could provide a tool for
commentators to estimate the probabilities of different openings or technology paths. As
in commented poker matches, where the probabilities of different hands are drawn on
screen for the spectators, we could display the probabilities of openings. In such a setup
we could use more features, as the observers and commentators can see everything that
happens (upgrades, units), whereas we limited ourselves to “key” buildings in the work presented
here.
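
The build tree suggestion in the first use above amounts to a conditional query over the trained model. A minimal sketch of that query, assuming a plain count-based estimate in place of the full model (the coherence variable λ is left out), hypothetical replay records of the form `(time_bucket, opening, build_tree)`, and made-up opening and building names:

```python
from collections import Counter, defaultdict

def train(records):
    """Count the build trees observed for each (time bucket, opening) pair."""
    counts = defaultdict(Counter)
    for time_bucket, opening, build_tree in records:
        counts[(time_bucket, opening)][frozenset(build_tree)] += 1
    return counts

def build_tree_distribution(counts, time_bucket, opening):
    """Estimate P(BT | time, opening) by normalising the counts.

    Returns an empty dict when the (time, opening) pair was never observed.
    """
    seen = counts.get((time_bucket, opening))
    if not seen:
        return {}
    total = sum(seen.values())
    return {bt: n / total for bt, n in seen.items()}

# Hypothetical training data: names and time buckets are illustrative only.
records = [
    (3, "two_gates", {"pylon", "gateway"}),
    (3, "two_gates", {"pylon", "gateway"}),
    (3, "two_gates", {"pylon", "gateway", "assimilator"}),
    (3, "fast_expand", {"pylon", "nexus"}),
]
dist = build_tree_distribution(train(records), 3, "two_gates")
```

Sorting `dist` by probability gives a ranked list of build trees to aim for, which is the kind of output a commentary overlay could also display.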

Possible improvements<br />

First, our prediction model can be upgraded to explicitly store transitions between $t$ and $t+1$ (or
$t-1$ and $t$) for openings ($Op$) and for build trees ($BT$). The problem is that $P(BT^{t+1} \mid BT^{t})$
will be very sparse, so to learn something efficiently (instead of a sparse probability table)
we have to consider smoothing over the values of $BT$, perhaps with the distances mentioned
in section 7.5.2. If we can learn $P(BT^{t+1} \mid BT^{t})$, it would perhaps improve the results of
$P(Opening \mid Observations)$, and it would almost surely improve $P(BuildTree^{t+1} \mid Observations)$,
which is important for late game predictions.
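
One way to picture the smoothing of the sparse transition table is to let each observed transition also lend weight to nearby build trees. The sketch below is an illustration under assumptions, not the method of section 7.5.2: it uses the size of the symmetric difference between building sets as the distance and an exponential kernel with a hypothetical `bandwidth` parameter.

```python
import math
from collections import Counter, defaultdict

def distance(bt_a, bt_b):
    """Number of buildings present in one build tree but not in the other."""
    return len(bt_a ^ bt_b)

def smoothed_transitions(pairs, states, bandwidth=1.0):
    """Kernel-smoothed estimate of P(BT^{t+1} | BT^t) over the given states.

    pairs:  observed (previous build tree, next build tree) transitions.
    states: all build trees (frozensets) to include in the table.
    """
    raw = defaultdict(Counter)
    for prev, nxt in pairs:
        raw[prev][nxt] += 1

    table = {}
    for prev in states:
        weights = {}
        for nxt in states:
            w = 0.0
            # Every observed transition contributes, discounted by how far
            # its endpoints are from (prev, nxt).
            for p2, row in raw.items():
                for n2, count in row.items():
                    d = distance(prev, p2) + distance(nxt, n2)
                    w += count * math.exp(-d / bandwidth)
            weights[nxt] = w
        z = sum(weights.values())
        if z == 0.0:  # no observations at all: fall back to uniform
            table[prev] = {nxt: 1.0 / len(states) for nxt in states}
        else:
            table[prev] = {nxt: w / z for nxt, w in weights.items()}
    return table

# Hypothetical build trees and transitions, for illustration only.
a = frozenset({"pylon"})
b = frozenset({"pylon", "gateway"})
c = frozenset({"pylon", "gateway", "core"})
table = smoothed_transitions([(a, b), (a, b), (b, c)], [a, b, c])
```

Every row of `table` is dense and sums to one, so transitions never seen in the replays still receive a small probability that shrinks with distance from the observed ones.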

Incorporating $P(Op^{t-1})$ priors per match-up (from Table 7.1) would lead to better results,
but it would seem like overfitting to us, particularly because we train our robot on games played
by humans whereas we have to play against robots in competitions.
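
The effect of such a prior can be seen with a single Bayes update. The sketch below is only an illustration: the opening names and all numbers are made up and are not taken from Table 7.1.

```python
def posterior(likelihood, prior):
    """Combine P(Observations | Op) with a prior P(Op) and normalise."""
    unnorm = {op: likelihood[op] * prior[op] for op in likelihood}
    z = sum(unnorm.values())
    return {op: p / z for op, p in unnorm.items()}

# Hypothetical values, for illustration only.
likelihood = {"two_gates": 0.4, "fast_expand": 0.6}
uniform_prior = {"two_gates": 0.5, "fast_expand": 0.5}
matchup_prior = {"two_gates": 0.8, "fast_expand": 0.2}

with_uniform = posterior(likelihood, uniform_prior)
with_matchup = posterior(likelihood, matchup_prior)
```

With the uniform prior the observations alone decide the ranking, while the skewed prior flips it towards the opening that is more common in the training replays. That is exactly the behaviour that risks overfitting when opponents do not play like the humans in those replays.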

Clearly, some match-ups are handled better than others, whether in the replay labeling part, in
the prediction part, or in both. Replays could be labeled by humans, and we would then do supervised
learning. Or they could be labeled by a combination of rules (as in [Weber and Mateas, 2009])
and statistical analysis (as in the method presented here). Finally, the replays could be labeled
with match-up dependent openings (as opening usage differs by match-up, see
Table 7.1), instead of the race dependent openings used currently. The labels could show either the two
parts of the opening (early and late developments) or the game time at which the label is the
most relevant, as openings are often bimodal (“fast expand into mutas”, “corsairs into reaver”,
etc.).

Finally, a hard problem is detecting the “fake” builds of very highly skilled players. Indeed,
some progamers have build orders whose purpose is to fool the opponent into thinking that they
are performing opening A while they are doing B. For instance, they could “take early gas”, leading
the opponent to think they are going for tech units, then not gather gas and perform an early rush
instead. We think that this can be handled by our model by changing P(Opening|LastOpening)
