Bayesian Programming and Learning for Multi-Player Video Games ...
Bayesian Programming and Learning for Multi-Player Video Games ...
Bayesian Programming and Learning for Multi-Player Video Games ...
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
• Obj i∈�1...n� ∈ {T rue, F alse}: adequacy of direction i with the objective (given by a higher<br />
rank model). In our StarCraft AI, we use the scalar product between the direction i<br />
<strong>and</strong> the objective vector (output of the pathfinding) with a minimum value (0.3 in move<br />
mode <strong>for</strong> instance) so that the probability to go in a given direction is proportional to its<br />
alignment with the objective.<br />
– For flee(), the objective is set in the direction which flees the potential damage gradient<br />
(corresponding to the unit type: ground or air).<br />
– For fightMove(), the objective is set (by the units group) either to retreat, to fight<br />
freely or to march aggressively towards the goal (see 5.3.3).<br />
– For the scout behavior, the objective (�o) is set to the next pathfinder waypoint.<br />
• Dmg i∈�1...n� ∈ {no, low, medium, high}: potential damage value in direction i, relative to<br />
the unit base health points, in direction i. In our StarCraft AI, this is directly drawn from<br />
two constantly updated potential damage maps (air, ground).<br />
• A i∈�1...n� ∈ {free, small, big}: occupation of the direction i by an allied unit. The model<br />
can effectively use many values (other than “occupied/free”) because directions may be<br />
multi-scale (<strong>for</strong> instance we indexed the scale on the size of the unit) <strong>and</strong>, in the end,<br />
small <strong>and</strong>/or fast units have a much smaller footprint, collision wise, than big <strong>and</strong>/or slow.<br />
In our AI, instead of direct positions of allied units, we used their (linear) interpolation<br />
dist(unit, � di)<br />
unit.speed (time it takes the unit to go to � di) frames later (to avoid squeezing/expansion).<br />
• E i∈�1...n� ∈ {free, small, big}: occupation of the direction i by an enemy unit. As above.<br />
• Occ i∈�1...n� ∈ {free, building, staticterrain}: Occupied, repulsive effect of buildings <strong>and</strong><br />
terrain (cliffs, water, walls).<br />
There is basically one set of (sensory) variables per perception in addition to the Diri values.<br />
Decomposition<br />
The joint distribution (JD) over these variables is a specific kind of fusion called inverse programming<br />
[Le Hy et al., 2004]. The sensory variables are considered conditionally independent<br />
knowing the actions, contrary to st<strong>and</strong>ard naive <strong>Bayesian</strong> fusion, in which the sensory variables<br />
are considered independent knowing the phenomenon.<br />
P(Dir1:n, Obj1:n, Dmg1:n, A1:n, E1:n, Occ1:n) (5.1)<br />
= JD =<br />
n�<br />
P(Diri)P(Obji|Diri) (5.2)<br />
i=1<br />
P(Dmgi|Diri)P(Ai|Diri) (5.3)<br />
P(Ei|Diri)P(Occi|Diri) (5.4)<br />
We assume that the i directions are independent depending on the action because dependency<br />
is already encoded in (all) sensory inputs.<br />
80