15.12.2012 Views

Bayesian Programming and Learning for Multi-Player Video Games ...

Bayesian Programming and Learning for Multi-Player Video Games ...

Bayesian Programming and Learning for Multi-Player Video Games ...

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

• Obj i∈�1...n� ∈ {T rue, F alse}: adequacy of direction i with the objective (given by a higher<br />

rank model). In our StarCraft AI, we use the scalar product between the direction i<br />

<strong>and</strong> the objective vector (output of the pathfinding) with a minimum value (0.3 in move<br />

mode <strong>for</strong> instance) so that the probability to go in a given direction is proportional to its<br />

alignment with the objective.<br />

– For flee(), the objective is set in the direction which flees the potential damage gradient<br />

(corresponding to the unit type: ground or air).<br />

– For fightMove(), the objective is set (by the units group) either to retreat, to fight<br />

freely or to march aggressively towards the goal (see 5.3.3).<br />

– For the scout behavior, the objective (�o) is set to the next pathfinder waypoint.<br />

• Dmg i∈�1...n� ∈ {no, low, medium, high}: potential damage value in direction i, relative to<br />

the unit base health points, in direction i. In our StarCraft AI, this is directly drawn from<br />

two constantly updated potential damage maps (air, ground).<br />

• A i∈�1...n� ∈ {free, small, big}: occupation of the direction i by an allied unit. The model<br />

can effectively use many values (other than “occupied/free”) because directions may be<br />

multi-scale (<strong>for</strong> instance we indexed the scale on the size of the unit) <strong>and</strong>, in the end,<br />

small <strong>and</strong>/or fast units have a much smaller footprint, collision wise, than big <strong>and</strong>/or slow.<br />

In our AI, instead of direct positions of allied units, we used their (linear) interpolation<br />

dist(unit, � di)<br />

unit.speed (time it takes the unit to go to � di) frames later (to avoid squeezing/expansion).<br />

• E i∈�1...n� ∈ {free, small, big}: occupation of the direction i by an enemy unit. As above.<br />

• Occ i∈�1...n� ∈ {free, building, staticterrain}: Occupied, repulsive effect of buildings <strong>and</strong><br />

terrain (cliffs, water, walls).<br />

There is basically one set of (sensory) variables per perception in addition to the Diri values.<br />

Decomposition<br />

The joint distribution (JD) over these variables is a specific kind of fusion called inverse programming<br />

[Le Hy et al., 2004]. The sensory variables are considered conditionally independent<br />

knowing the actions, contrary to st<strong>and</strong>ard naive <strong>Bayesian</strong> fusion, in which the sensory variables<br />

are considered independent knowing the phenomenon.<br />

P(Dir1:n, Obj1:n, Dmg1:n, A1:n, E1:n, Occ1:n) (5.1)<br />

= JD =<br />

n�<br />

P(Diri)P(Obji|Diri) (5.2)<br />

i=1<br />

P(Dmgi|Diri)P(Ai|Diri) (5.3)<br />

P(Ei|Diri)P(Occi|Diri) (5.4)<br />

We assume that the i directions are independent depending on the action because dependency<br />

is already encoded in (all) sensory inputs.<br />

80

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!