
[Figure 7.14 (image): the two diagrams carry the node labels μ, σ, K′, Q, U, V, EU, ETT, EC^t, EC^{t+1}, C_c, C_t, C_m, C_f, TT, λ, the label "Army mixture (GMM)", and the annotations "what we see", "what we infer", "what we want to know", "what we want to do (tactics)", and "what we have".]

Figure 7.14: Left: the full Gaussian mixture model for $Q$ armies ($Q$ battles) for the enemy, with $K'$ mixture components. Right: graphical (plate notation) representation of the army composition Bayesian model; gray variables are known, while for gray hatched variables we have distributions over their values. $C_t$ can also be known (if a decision was taken at the tactical model level).
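The left-hand model of Figure 7.14 amounts to clustering observed enemy armies, represented as unit-type proportion vectors, into $K'$ Gaussian mixture components. As a rough illustration of that step (a sketch under assumed representations, not the thesis implementation), the snippet below fits such a mixture with scikit-learn; the data, `K_PRIME` and all variable names are placeholders.

```python
# Sketch: clustering enemy army compositions with a Gaussian mixture
# (left panel of Figure 7.14). Each army is assumed to be a vector of
# unit-type proportions; all data and names below are illustrative.
import numpy as np
from sklearn.mixture import GaussianMixture

K_PRIME = 8                                   # number of mixture components (K')
rng = np.random.default_rng(0)

# Placeholder data: Q observed enemy armies over N_UNIT_TYPES unit types.
Q, N_UNIT_TYPES = 500, 12
raw = rng.random((Q, N_UNIT_TYPES))
enemy_armies = raw / raw.sum(axis=1, keepdims=True)   # rows sum to 1

gmm = GaussianMixture(n_components=K_PRIME, covariance_type="diag",
                      random_state=0).fit(enemy_armies)

# Soft assignment of a new observation to the K' clusters, i.e. a
# distribution over EC given the observed enemy units.
p_ec = gmm.predict_proba(enemy_armies[:1])    # shape (1, K')
print(p_ec.round(3))
```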

• $P(C^{t+1}_c | EC^{t+1} = ec) = \mathrm{Categorical}(K, p_{ec})$: $P(C^{t+1}_c | EC^{t+1})$ is a probability table of dimensions $K \times K'$, which is learned from battles with a soft Laplace's law of succession on victories.
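As an illustration of how such a table might be estimated, here is a minimal sketch that builds the $K \times K'$ table from victory counts with add-one (Laplace) smoothing. The battle records, sizes and names are made up, and this is only one plausible reading of the "soft Laplace's law of succession on victories", not the actual implementation.

```python
# Sketch: estimating P(C_c | EC) as a K x K' table from battle outcomes,
# with Laplace's law of succession (add-one smoothing). All data are
# illustrative placeholders.
import numpy as np

K, K_PRIME = 10, 8          # our clusters / enemy clusters (assumed sizes)

# Each record: (our cluster c, enemy cluster ec, did we win this battle?)
battles = [(3, 5, True), (3, 5, False), (7, 5, True), (1, 2, True)]

counts = np.ones((K, K_PRIME))        # Laplace prior: every cell starts at 1
for c, ec, we_won in battles:
    if we_won:                        # only victories reinforce the counter table
        counts[c, ec] += 1

# Normalise each column so that P(C_c = c | EC = ec) sums to 1 over c.
p_c_given_ec = counts / counts.sum(axis=0, keepdims=True)
print(p_c_given_ec[:, 5].round(3))    # which of our clusters beat enemy cluster 5
```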

• $P(C^{t+1}_t) = \mathrm{Categorical}(K, p_{tac})$ (a histogram over the $K$ cluster values) coming from the tactical model. Note that it can be degenerate: $P(C^{t+1}_t = \text{shuttle}) = 1.0$. It serves the purpose of merging tactical needs with strategic ones.

• $P(C^{t+1}_m | C^{t+1}_t, C^{t+1}_c) = \alpha P(C^{t+1}_t) + (1 - \alpha) P(C^{t+1}_c)$, with $\alpha \in [0, 1]$ the aggressiveness/initiative parameter, which can be fixed, learned, or variable ($P(\alpha | \text{situation})$). If $\alpha = 1$ we only produce units wanted by the tactical model; if $\alpha = 0$ we only produce units that adapt our army to the opponent's army composition.
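This mixing is just a convex combination of two categorical distributions over the $K$ clusters; a minimal sketch with made-up numbers (and $K = 4$) follows.

```python
# Sketch: P(C_m) = alpha * P(C_t) + (1 - alpha) * P(C_c).
# The distributions below are made up for illustration.
import numpy as np

p_tactical = np.array([0.0, 1.0, 0.0, 0.0])   # degenerate: tactics demand cluster 1
p_counter  = np.array([0.1, 0.2, 0.6, 0.1])   # adaptation to the enemy composition

def merge(p_tac, p_cnt, alpha):
    """Convex combination of the tactical and counter distributions."""
    return alpha * p_tac + (1.0 - alpha) * p_cnt

print(merge(p_tactical, p_counter, alpha=1.0))   # pure tactical demand
print(merge(p_tactical, p_counter, alpha=0.0))   # pure counter adaptation
print(merge(p_tactical, p_counter, alpha=0.5))   # balanced initiative
```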

• $P(\lambda | C^{t+1}_f, C^{t+1}_m)$ is a functional Dirac that restricts $C_m$ values to the ones that can co-exist with our tech tree ($C_f$):

$$P(\lambda = 1 | C_f, C_m = c_m) = \begin{cases} 1 & \text{if } P(C_f = c_m) \neq 0 \\ 0 & \text{otherwise} \end{cases}$$

This was simply implemented as a function. It is not strictly necessary (one could have $P(C_f | TT, C_m)$, which would do the same for $P(C_f) = 0.0$), but it allows us to keep the same table form for $P(C_f | TT)$ as for $P(EC | ETT)$, should we wish to use the learned co-occurrence tables (with respect to the right race).
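The effect of conditioning on $\lambda = 1$ is that any mass of the merged distribution on clusters our tech tree cannot produce is discarded, and the remainder is renormalised. A minimal sketch of that filtering, with illustrative numbers; `p_cf_given_tt` stands in for a functional-Dirac $P(C_f | TT)$ whose zero entries mark forbidden clusters.

```python
# Sketch: the coherence variable lambda as a filter on C_m values.
# Clusters with P(C_f = c) == 0 (forbidden by the tech tree) get zero
# probability; the rest is renormalised. Numbers are illustrative.
import numpy as np

p_cm          = np.array([0.5, 0.2, 0.2, 0.1])   # merged distribution P(C_m)
p_cf_given_tt = np.array([1.0, 0.0, 1.0, 1.0])   # 0 where the tech tree forbids c

def coherence_filter(p_m, p_f):
    """Keep only c_m values with P(C_f = c_m) != 0, then renormalise."""
    masked = np.where(p_f != 0.0, p_m, 0.0)
    return masked / masked.sum()

print(coherence_filter(p_cm, p_cf_given_tt))     # -> [0.625, 0., 0.25, 0.125]
```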

• $P(C^{t+1}_f | TT)$ is either a learned probability table, as for $P(EC^{t+1} | ETT^t)$, or a functional Dirac distribution which tells which $C_f$ values ($c_f$) are compatible with the current tech tree $TT = tt$. We used this second option. A given $c$ is compatible with the tech tree


