Bayesian Programming and Learning for Multi-Player Video Games
parameters. By refining P(H|HP, opponent), we are learning the opponent's inclination for particular types of tactics according to what is available to them, or, for our own play, the effectiveness of our choices of attack types.
6.7.2 Possible improvements

There are several main research directions for possible improvements:
• improving the underlying heuristics: the heuristics presented here are quite simple, but they may be changed, removed, or supplemented for another RTS or FPS, or for better performance. In particular, our "defense against invisible" heuristic could take detector positioning/coverage into account. Our heuristic on tactical values could also be reworked to take terrain into account (chokes and elevation in StarCraft). We now detail two particular improvements which could increase performance significantly:
– Estimating T1:n (the tactical value for the defender) when we attack, or TA1:n (the tactical value for the attacker) when we defend, is the trickiest part of all because these values may change quickly. For that we use a units filter which simply decays the probability mass of seen units. An improvement would be to use a particle filter (Weber et al. [2011]), possibly with a learned motion model, or a filtering model adapted to (and taking advantage of) regions, as presented in section 8.1.5.
– By looking at Table 6.2, we can see that our consistently bad prediction across types is for air attacks. It is an attack type which is particularly important to predict, because defending requires ranged units (which can attack flying units) and because positioning happens so quickly (air units are more mobile). Perhaps we did not have enough data, as our model fares well in ZvZ, for which we have many more air attacks, though those may also be more stereotyped. Clearly, our heuristics are missing information about air defense positioning and coverage of the territory (this is a downside of region discretization). Air raids work by exploiting small weaknesses in static defense, and they are not constrained (as ground attacks are) to pass through concentrating chokes.
• improving the dynamics of the model: there is room to consider the prior probabilities of attacks in regions given past attacks, and/or the evolution in time of the T, TA, B, E values (derivatives).
• The discretization that we used may show its limits; however, if we want to use continuous values, we need to set up a more complicated learning and inference process (Markov chain Monte Carlo (MCMC)* sampling).
• improving the model itself: finally, one of the strongest assumptions of our model (a drawback particularly for prediction) is that the attacking player is always considered to attack in their most probable region. While this would hold if the model were complete (with finer army position inputs and a model of what the player thinks), we believe such an assumption of completeness is far-fetched. Instead, we should express that incompleteness in the model itself and have a "player decision" variable D ∼ Multinomial(P(A1:n, H1:n), player).
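The units filter mentioned in the first improvement (decaying the probability mass of seen units) can be sketched as follows. This is a minimal illustration, not the thesis implementation: the per-region mass representation and the half-life value are assumptions made here for concreteness.

```python
def decay_units_filter(region_mass, dt, half_life=30.0):
    """Decay belief in previously seen enemy units over time.

    region_mass: dict mapping region id -> estimated unit mass last seen there
    dt: seconds elapsed since the last observation
    half_life: hypothetical half-life (seconds) of our belief; after this long
               without observation, half the mass is considered unknown.
    """
    decay = 0.5 ** (dt / half_life)
    return {region: mass * decay for region, mass in region_mass.items()}

# Example: mass seen 30 seconds ago (one half-life) has decayed to half.
belief = decay_units_filter({"natural": 10.0, "third": 4.0}, dt=30.0)
```

A particle filter with a learned motion model would improve on this by moving mass between regions according to plausible army trajectories, instead of merely shrinking it in place.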
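Regarding the move to continuous values and MCMC sampling, a minimal Metropolis sampler illustrates the kind of inference machinery this would require (a generic sketch: the target density, step size, and chain length here are placeholders, not tied to the tactical model's actual variables).

```python
import math
import random

def metropolis(log_density, x0, steps=1000, step_size=0.5, rng=random):
    """Minimal Metropolis sampler for a one-dimensional unnormalized density,
    of the kind needed for inference over continuous (undiscretized) values."""
    samples, x = [], x0
    logp = log_density(x)
    for _ in range(steps):
        proposal = x + rng.gauss(0.0, step_size)
        logp_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if math.log(rng.random()) < logp_new - logp:
            x, logp = proposal, logp_new
        samples.append(x)
    return samples

# Example: sample from a standard normal (log density -x^2/2 up to a constant).
draws = metropolis(lambda x: -0.5 * x * x, x0=0.0, steps=5000,
                   rng=random.Random(0))
```

This is far more expensive than the table-based learning that discretization allows, which is the trade-off the bullet above points to.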
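The effect of the proposed "player decision" variable can be illustrated by sampling the attacked region from the posterior instead of always taking the most probable one. This sketch assumes an already-computed marginal posterior over regions; the Multinomial parametrization and the conditioning on the player are not shown.

```python
import random

def sample_attack_region(posterior, rng=random):
    """Sample a region to attack from a posterior over regions, rather than
    deterministically taking argmax (which assumes model completeness).

    posterior: dict mapping region id -> probability of attack (sums to 1).
    """
    regions = list(posterior)
    weights = [posterior[r] for r in regions]
    return rng.choices(regions, weights=weights, k=1)[0]

# An argmax predictor always answers "main"; sampling sometimes picks the
# less probable regions, reflecting the attacker's own decision noise.
posterior = {"main": 0.6, "natural": 0.3, "third": 0.1}
pick = sample_attack_region(posterior)
```

For prediction this amounts to scoring all regions by probability instead of committing to one, which is exactly the incompleteness the model would then represent explicitly.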