Bayesian Programming and Learning for Multi-Player Video Games ...
7.6.3 Possible uses
We recall that we used this model for opening prediction, as a proxy for timing attacks and
aggressiveness. It can also be used:
• for build tree suggestion when wanting to achieve a particular opening. In particular, one
does not have to encode all the openings into a finite state machine: simply train this
model and then ask P(BT | time, opening, λ = 1) to get a distribution over the build trees
that are generally used to achieve this opening.
• as a commentary assistant AI. In the StarCraft and StarCraft 2 communities, there are
many progamer tournaments with live commentary, and we could provide a tool for
commentators to estimate the probabilities of different openings or technology paths. As
in commented poker matches, where the probabilities of different hands are shown on
screen for the spectators, we could display the probabilities of openings. In such a setup
we could use more features, as the observers and commentators can see everything that
happens (upgrades, units), whereas we limited ourselves to “key” buildings in the work presented
here.
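The build tree suggestion query can be illustrated with a minimal sketch. It replaces the Bayesian model with a plain count table over (opening, time bucket), so it is a simplified stand-in for P(BT | time, opening), not the model of this chapter. The opening names, building names, sample data and the 60-second bucketing are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def learn_build_tree_table(samples, bucket_seconds=60):
    """Count build trees per (opening, time bucket).

    `samples` is an iterable of (opening, time_seconds, build_tree) tuples,
    where build_tree is a set of building names (hypothetical labels).
    """
    table = defaultdict(Counter)
    for opening, time_s, build_tree in samples:
        key = (opening, time_s // bucket_seconds)
        table[key][frozenset(build_tree)] += 1
    return table


def suggest_build_trees(table, opening, time_s, bucket_seconds=60):
    """Return [(build_tree, probability), ...], most likely first."""
    counts = table.get((opening, time_s // bucket_seconds), Counter())
    total = sum(counts.values())
    if total == 0:
        return []  # nothing observed for this opening at this time
    return [(bt, n / total) for bt, n in counts.most_common()]


# Made-up training samples: (opening label, game time in seconds, build tree).
samples = [
    ("two_gates", 200, {"Pylon", "Gateway"}),
    ("two_gates", 210, {"Pylon", "Gateway", "Cybernetics_Core"}),
    ("two_gates", 230, {"Pylon", "Gateway", "Cybernetics_Core"}),
    ("fast_dt", 220, {"Pylon", "Gateway", "Assimilator"}),
]

table = learn_build_tree_table(samples)
suggestions = suggest_build_trees(table, "two_gates", 215)
for build_tree, prob in suggestions:
    print(sorted(build_tree), round(prob, 3))
```

A bot could then pick the most probable build tree as its next target, or sample from the distribution to stay less predictable.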
Possible improvements<br />
First, our prediction model can be upgraded to explicitly store transitions between t and t+1 (or
t − 1 and t) for openings (Op) and for build trees* (BT). The problem is that P(BT^{t+1} | BT^t)
will be very sparse, so to learn something useful (instead of a sparse probability table)
we have to consider smoothing over the values of BT, perhaps with the distances mentioned
in section 7.5.2. If we can learn P(BT^{t+1} | BT^t), it would perhaps improve the results of
P(Opening | Observations), and it would almost surely improve P(BuildTree^{t+1} | Observations),
which is important for late game predictions.
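One way such smoothing could look is sketched below, under stated assumptions: build trees are sets of building names, the distance is the size of their symmetric difference (a stand-in, not necessarily one of the distances of section 7.5.2), and each observed transition spreads weight to nearby (source, target) pairs through an exponential kernel. The function names, the bandwidth parameter and the data are hypothetical.

```python
import math
from collections import Counter


def distance(bt_a, bt_b):
    """Assumed distance: number of buildings present in exactly one tree."""
    return len(bt_a ^ bt_b)


def smoothed_transitions(pairs, states, bandwidth=1.0):
    """Estimate a transition table over build trees with kernel smoothing.

    `pairs` is a list of observed (build_tree_t, build_tree_t_plus_1)
    transitions; `states` lists every build tree value to cover. Each
    observed transition (a, b) adds weight kernel(a, src) * kernel(b, dst)
    to the cell (src, dst), so unseen transitions get nonzero probability.
    """
    counts = Counter(pairs)
    if not counts:
        raise ValueError("need at least one observed transition")

    def kernel(x, y):
        return math.exp(-distance(x, y) / bandwidth)

    table = {}
    for src in states:
        row = {
            dst: sum(n * kernel(a, src) * kernel(b, dst)
                     for (a, b), n in counts.items())
            for dst in states
        }
        total = sum(row.values())
        table[src] = {dst: w / total for dst, w in row.items()}
    return table


# Made-up build trees and transitions.
A = frozenset({"Pylon"})
B = frozenset({"Pylon", "Gateway"})
C = frozenset({"Pylon", "Gateway", "Cybernetics_Core"})

transitions = smoothed_transitions([(A, B), (A, B), (B, C)], [A, B, C])
print(transitions[A][B] > transitions[A][C])
```

Here C never appears as a source in the data, yet it still gets a full row because nearby observed transitions contribute to it. A smaller bandwidth keeps the table closer to the raw counts; a larger one smooths more aggressively.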
Incorporating P(Op^{t−1}) priors per match-up (from Table 7.1) would lead to better results,
but it would seem like overfitting to us, particularly because we train our robot on games played
by humans whereas we have to play against robots in competitions.
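To make the effect of such a prior concrete, here is a minimal sketch of Bayes' rule applied with a uniform prior and with a match-up specific prior. All numbers are invented for illustration and are not taken from Table 7.1; the opening names and the function name are hypothetical.

```python
def posterior_with_prior(likelihood, prior):
    """Combine P(observations | opening) with P(opening) via Bayes' rule."""
    unnormalized = {op: likelihood[op] * prior.get(op, 0.0)
                    for op in likelihood}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("prior and likelihood share no support")
    return {op: w / total for op, w in unnormalized.items()}


# Invented likelihoods P(observations | opening) at some point in a game.
likelihood = {"two_gates": 0.2, "fast_dt": 0.1}

# Uniform prior versus an invented match-up specific prior.
uniform_prior = {"two_gates": 0.5, "fast_dt": 0.5}
matchup_prior = {"two_gates": 0.8, "fast_dt": 0.2}

with_uniform = posterior_with_prior(likelihood, uniform_prior)
with_matchup = posterior_with_prior(likelihood, matchup_prior)
print(with_uniform)
print(with_matchup)
```

The match-up prior sharpens the posterior toward the opening that is common in the training replays, which is exactly what helps on human games and may hurt against bots whose opening frequencies differ.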
Clearly, some match-ups are handled better, either in the replay labeling part and/or in
the prediction part. Replays could be labeled by humans, and we would then do supervised
learning. Or they could be labeled by a combination of rules (as in [Weber and Mateas, 2009])
and statistical analysis (as in the method presented here). Finally, the replays could be labeled
with match-up dependent openings (as opening usage differs by match-up*, see
Table 7.1), instead of the race dependent openings used currently. The labels could show either the two
parts of the opening (early and late developments) or the game time at which the label is the
most relevant, as openings are often bimodal (“fast expand into mutas”, “corsairs into reaver”,
etc.).
Finally, a hard problem is detecting the “fake” builds of very highly skilled players. Indeed,
some progamers have build orders whose purpose is to fool the opponent into thinking that they
are performing opening A while they are doing B. For instance, they could “take early gas”, leading
the opponent to think they are going to build tech units, then not gather gas and perform an early rush
instead. We think that this can be handled by our model by changing P(Opening|LastOpening)