07.03.2014 Views

The MP-MIX algorithm: Dynamic Search Strategy Selection in Multi ...



[Figures 17, 18, and 19: bar charts, one per match-up. x-axis: Search Depth (1 to 5); y-axis: Search Strategies % (0% to 100%); legend: par, max, off.]

Fig. 17. Percentage of search strategies {MP-Mix vs. MaxN}

Fig. 18. Percentage of search strategies {MP-Mix vs. Paranoid}

Fig. 19. Percentage of search strategies {MP-Mix vs. 3 RND}

evaluate and compare the individual performance of the different search algorithms against three RND opponents. Recall that a RND player is a player whose preferences were randomized to reflect a wider range of social behavior than that of simple rational, self-maximizing players, as is often observed in human playing strategies. Moreover, running against three RND players allows us to explore one level deeper (that is, to depth 6) and observe the performance at this depth.

The results are depicted in Figure 16. Each column represents the winning percentage of the respective player against three RND players. We can see that the MIX player attains a higher winning percentage than the other two algorithms (except at depth 3, where MAXN's winning percentage, 45.2%, was slightly higher than MIX's 44.8%). However, these improvements do not appear to be as significant as in the direct match-up experiments, probably due to the limited competence of the opponents. We see a significant improvement at depth 6, where the MIX player attained a 51% winning percentage, compared to the 37% and 35.2% of MAXN and PAR, respectively.

Experiment 5: Analyzing MP-Mix's behavior

Another interesting question concerns the quantity of the different searches that the MP-Mix algorithm performed. In other words, how does the number of Paranoid, MaxN and Offensive search procedures change with different search depths and opponent types? Figures 17, 18, and 19 show the percentages of each of the different node propagation strategies in the Quoridor game, for the experiments depicted in Figures 14, 15 and 16, respectively. These figures offer a few interesting insights. First, it seems that the MaxN procedure was the most frequently used in all three settings, as it accounts for about 45% of the searches (with some variance across depths and opponent types). This seems intuitive, as this is the "default" search procedure embedded in the meta-decision algorithm. Second, it also seems that beyond a search depth of 2, the remaining 55% are split equally between the Paranoid and Offensive node propagation strategies. Finally, all the figures demonstrate a small but monotonic increase in the percentage of Paranoid searches as the search depth increases. Naturally, these graphs will change with different heuristic functions and threshold values. However, they do give us some sense of the number of different searches that were conducted by the MP-Mix agent in this domain.
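Percentages of this kind could be tallied as in the following minimal sketch, which assumes that the agent logs one (depth, strategy) record per search. The function name `strategy_shares`, the log format, and the sample records are hypothetical and are not taken from the experiments; only the labels par, max and off follow the figure legends.

```python
from collections import Counter, defaultdict

# Legend labels used in Figures 17-19: Paranoid, MaxN, Offensive.
STRATEGIES = ("par", "max", "off")


def strategy_shares(log):
    """Return {depth: {strategy: percentage}} from (depth, strategy) records."""
    counts = defaultdict(Counter)
    for depth, strategy in log:
        if strategy not in STRATEGIES:
            raise ValueError(f"unknown strategy: {strategy!r}")
        counts[depth][strategy] += 1

    shares = {}
    for depth, counter in sorted(counts.items()):
        total = sum(counter.values())
        # Strategies that never occurred at this depth get a share of 0.0.
        shares[depth] = {s: 100.0 * counter[s] / total for s in STRATEGIES}
    return shares


# Hypothetical log, invented for illustration only (not data from the paper).
example_log = [
    (1, "max"), (1, "max"), (1, "off"), (1, "par"),
    (2, "max"), (2, "par"), (2, "par"), (2, "off"),
]

shares = strategy_shares(example_log)
for depth, row in shares.items():
    print(depth, row)
```

Each row sums to 100%, so one row corresponds to one stacked bar in a plot of this type.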

V. OPPONENT IMPACT

The experimental results clearly indicate that MP-Mix improved the players' performance. However, we can see that the improvements in Risk and Quoridor were much more impressive than in the Hearts domain. As such, an emerging question is: under what conditions will the MP-Mix algorithm be more effective and advantageous? To try to answer this question, we present the Opponent Impact (OI) factor, which measures the impact that a single player has on the outcome of the other players. The Opponent Impact factor is a meta-property that characterizes a game according to its players' ability to take actions that will reduce the evaluation of their opponents. This property is related both to the game and to the evaluation function that is used.

A. Intuition and Definition

The intuition behind the opponent impact is that certain games were designed in such a way that it is reasonably easy to impede an opponent's efforts to win the game, while in other games those possibilities are fairly limited. For instance, in Quoridor, as long as the player owns wall pieces, he can explicitly impede an opponent by placing them in his path. In contrast, in the game of Hearts, giving points to a specific opponent often requires the fulfillment of several preconditions that are often attainable only through implicit cooperative strategic behavior among players.
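The intuition can be illustrated with a toy sketch. It is not the formal Opponent Impact definition; it only checks, for a handful of states, whether the acting player has some action that lowers an opponent's evaluation. Everything in it is an assumption made for illustration: the helper names `impeding_actions` and `impact_ratio`, the two-field state (walls left, opponent's distance to goal), the two actions, and the evaluation function.

```python
def impeding_actions(state, opponent, actions, apply_action, evaluate):
    """Return the actions that lower the opponent's evaluation of the state."""
    baseline = evaluate(state, opponent)
    return [a for a in actions
            if evaluate(apply_action(state, a), opponent) < baseline]


def impact_ratio(states, opponent, legal_actions, apply_action, evaluate):
    """Fraction of the given states in which at least one impeding action exists."""
    if not states:
        return 0.0
    hits = sum(
        1 for s in states
        if impeding_actions(s, opponent, legal_actions(s), apply_action, evaluate)
    )
    return hits / len(states)


# Toy Quoridor-like abstraction (hypothetical):
# a state is (walls_left, opponent_distance_to_goal).
def legal_actions(state):
    walls_left, _ = state
    return ["move", "wall"] if walls_left > 0 else ["move"]


def apply_action(state, action):
    walls_left, opp_distance = state
    if action == "wall":
        # Placing a wall lengthens the opponent's path by one step.
        return (walls_left - 1, opp_distance + 1)
    return state  # moving one's own pawn leaves the opponent's distance unchanged


def evaluate(state, opponent):
    _, opp_distance = state
    return -opp_distance  # a shorter path to the goal is better for the opponent


sample_states = [(2, 5), (1, 3), (0, 4), (0, 2)]
ratio = impact_ratio(sample_states, "opp", legal_actions, apply_action, evaluate)
print(ratio)
```

In this toy model the player can impede the opponent only while wall pieces remain, which mirrors the Quoridor example above. Changing `evaluate` changes the result, in line with the remark that the property depends on the evaluation function as well as on the game.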

Before presenting our proposed definition of the Opponent Impact, let us first take a step back and discuss the notion
