Bayesian Programming and Learning for Multi-Player Video Games ...
• AD1:n ∈ {no, low, med, high}: same for air defense.
• ID1:n ∈ {no detector, one detector, several}: invisible defense, equating to the number of detectors.
• TT ∈ {∅, building1, building2, building1 ∧ building2, . . . }: all the possible technological trees for the given race. For instance, {pylon, gate} and {pylon, gate, core} are two different tech trees, see chapter 7.
• HP ∈ {ground, ground∧air, ground∧invis, ground∧air∧invis, ground∧drop, ground∧air∧drop, ground∧invis∧drop, ground∧air∧invis∧drop}: the possible types of attacks, directly mapped from TT information. This variable serves the purpose of extracting all that we need to know from TT, and thus reduces the complexity of a part of the model from n mappings from TT to Hi to one mapping from TT to HP and n mappings from HP to Hi. Without this variable, learning the co-occurrences of TT and Hi would be sparse in the dataset. In prediction, with this variable, we make use of what we can infer about the opponent's strategy (Synnaeve and Bessière [2011b], Synnaeve and Bessière [2011]); in decision-making, we know our own possibilities (we know our tech tree as well as the units we own).
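The TT → HP reduction described in the last bullet can be sketched as a deterministic mapping from a set of buildings to the set of feasible attack types. The building and tech names below are illustrative assumptions, not the thesis's actual tables:

```python
# Hedged sketch of the TT -> HP mapping: a tech tree (set of buildings/techs)
# determines which attack types are possible. The enabling-condition names
# ("starport", "cloak_tech", "dropship_tech") are hypothetical placeholders.
def how_possible(tech_tree):
    """Map a tech tree (TT, a set of strings) to the attack types (HP)."""
    hp = {"ground"}                      # ground attacks are always possible
    if "starport" in tech_tree:          # assumed air-enabling building
        hp.add("air")
    if "cloak_tech" in tech_tree:        # assumed invisibility-enabling tech
        hp.add("invis")
    if "dropship_tech" in tech_tree:     # assumed drop-enabling tech
        hp.add("drop")
    return frozenset(hp)
```

With this intermediate variable, the n sparse mappings TT → Hi collapse into one mapping TT → HP plus n mappings over the much smaller HP domain (8 values).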
We can consider a more complex version of this tactical model, taking soft evidence into account (variables on which we have a probability distribution), which is presented in appendix B.2.1.
Decomposition

P(A1:n, E1:n, T1:n, TA1:n, B1:n, H1:n, GD1:n, AD1:n, ID1:n, HP, TT)   (6.1-6.2)
= [∏_{i=1}^{n} P(Ai) P(Ei, Ti, TAi, Bi|Ai) P(ADi, GDi, IDi|Hi) P(Hi|HP)] P(HP|TT) P(TT)   (6.3-6.4)
This decomposition is also shown in Figure 6.4. We can see that we have in fact two models: one for A1:n and one for H1:n.
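The decomposition above factors the joint into per-region terms and the two tech-tree terms. A minimal sketch, assuming each factor has already been evaluated to a number (argument names are illustrative):

```python
from math import prod

# Sketch of the joint of equations (6.1-6.4): a product over regions i = 1..n
# of P(Ai) P(Ei,Ti,TAi,Bi|Ai) P(ADi,GDi,IDi|Hi) P(Hi|HP), multiplied by the
# shared factors P(HP|TT) P(TT). Each list holds one factor value per region.
def joint_probability(p_A, p_ETAB_given_A, p_ADGDID_given_H, p_H_given_HP,
                      p_HP_given_TT, p_TT):
    per_region = prod(pa * pe * pd * ph
                      for pa, pe, pd, ph
                      in zip(p_A, p_ETAB_given_A, p_ADGDID_given_H,
                             p_H_given_HP))
    return per_region * p_HP_given_TT * p_TT
```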
Forms and learning
We will explain the forms for a given/fixed region number i:
• P(A) is the prior on the fact that the player attacks in this region; in our evaluation we set it to n_battles/(n_battles + n_not_battles).
• P(E, T, TA, B|A) is a co-occurrence table of the economical, tactical (both for the defender and the attacker), and belonging scores where an attack happens. We just use Laplace's law of succession ("add one" smoothing) Jaynes [2003] and count the co-occurrences in the games of the dataset (see section 6.4), thus almost performing maximum likelihood learning of the table.
P(E = e, T = t, TA = ta, B = b|A = True) =
    (1 + n_battles(e, t, ta, b)) / (|E||T||TA||B| + Σ_{E,T,TA,B} n_battles(E, T, TA, B))
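The smoothed table above can be built directly from battle counts. A minimal sketch, with toy discretized domains in place of the thesis's actual score bins:

```python
from collections import Counter
from itertools import product

# Laplace ("add one") smoothed table for P(E, T, TA, B | A = True):
# each cell gets (1 + count) / (|E||T||TA||B| + total number of battles),
# so unseen configurations keep a small nonzero probability.
def laplace_table(battles, domains):
    """battles: list of (e, t, ta, b) tuples observed where an attack happened.
    domains: dict mapping each variable name to its list of possible values."""
    counts = Counter(battles)
    table_size = 1
    for dom in domains.values():
        table_size *= len(dom)            # |E| * |T| * |TA| * |B|
    denom = table_size + len(battles)     # smoothing mass + sum of all counts
    return {cell: (1 + counts[cell]) / denom
            for cell in product(*domains.values())}
```

By construction the entries sum to 1, and with zero observations the table degenerates to the uniform distribution, which is why the text calls this "almost" maximum likelihood learning.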