16.01.2013 Views

Developing an Artificial Intelligence for Whist - Fongboy.com


Developing an Artificial Intelligence for Whist

Jason Fong

University of California, Los Angeles

Computer Science

jfong@cs.ucla.edu

Abstract

This paper details the development of a simple artificial intelligence for playing the card game Whist. A brief introduction to the game is given, followed by the design decisions that were made in building the AI. The results have some possible applications to other similar games.

1 Introduction

Whist is an old card game that has been falling into obscurity due to the popularity of Bridge. However, Whist is still an interesting game to study because of its differences from Bridge. Whist is a true multiplayer game with more than 2 players. Also, bids must be met exactly, so there is the extra complication of avoiding going over a bid.

There are many variations of Whist, and currently there is no widely accepted official version of the game. The version studied in this paper is similar to the variation known as "Oh Hell!" [1].

Prior work in this area includes the Ginsberg Intelligent Bridge Player (GIB) [2]. Although Bridge is significantly different from Whist, the two games share a trick-taking structure, so some of the techniques used in GIB are also useful for Whist.

2 Rules of Whist

Since Whist is a game with many variations, the rules of the variation studied in this paper need some explanation.

2.1 General Game Structure

This variation of Whist is played by five players with no partnerships. A set number of rounds is played in each game. The players take turns dealing, with the dealer position passing in clockwise order. The number of cards dealt to each player varies with each round. In the first round, each player is dealt ten cards. In each subsequent round, one fewer card is dealt to each player. This continues down to one card and then repeats from one card back up to ten cards. After all players' cards are dealt, the next card in the deck is turned face up. The suit of this card determines the trump suit for the round. Each player receives a score at the end of each round. The winner is the player with the highest cumulative score at the end of the 20 rounds.

2.2 Turn Structure

The rules of play in each turn are similar to those of other trick-taking games such as Bridge and Spades. In a turn, each player plays one card, going in clockwise order. The first player in a turn may play any card in his hand. This card determines the lead suit, and the subsequent players must play a card of the same suit if possible. If a player does not have a card of the lead suit, then any card may be played.

The player who played the highest card of the lead suit wins the trick. However, if any card of the trump suit was played, then the highest card of the trump suit takes precedence and wins the trick. The winner of a trick plays the first card of the next trick.
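The follow-suit rule and the trick-winner rule described above can be sketched in a few lines of Python. This is only an illustration, not code from the paper: the `(rank, suit)` card encoding, the ace-high ranking (2 to 14), and the function names `legal_moves` and `trick_winner` are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

Card = Tuple[int, str]  # (rank, suit); ranks 2..14 with ace high are an assumption


def legal_moves(hand: List[Card], lead_suit: Optional[str]) -> List[Card]:
    """Return the cards a player may play, following the lead suit if possible."""
    if lead_suit is None:  # this player leads the trick, so any card is allowed
        return list(hand)
    same_suit = [card for card in hand if card[1] == lead_suit]
    return same_suit if same_suit else list(hand)


def trick_winner(plays: List[Card], trump: str) -> int:
    """Return the position (in play order) of the card that wins the trick."""
    lead_suit = plays[0][1]
    trump_plays = [i for i, card in enumerate(plays) if card[1] == trump]
    if trump_plays:  # any trump beats every non-trump card
        contenders = trump_plays
    else:
        contenders = [i for i, card in enumerate(plays) if card[1] == lead_suit]
    return max(contenders, key=lambda i: plays[i][0])
```

For instance, with spades as trump, a two of spades played by a player who is void in hearts beats an ace of hearts that was led.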

2.3 Bidding

After the cards for a round are dealt and before play begins, each player looks at his cards and makes a bid declaring how many tricks he will take in the round. The goal of each round is to take exactly the number of tricks that were bid. Taking too few tricks and taking too many tricks are both considered a failure to meet the bid.

The sum of all bids is not permitted to equal the number of cards dealt to each player in the current round. Since the number of tricks played in a round equals the number of cards dealt to each player, this ensures that at least one player will not be able to meet his bid.

Bidding starts with the player following the dealer and proceeds in clockwise order. Since the dealer is the last to bid, his bid may be restricted, because it cannot make the sum of the bids equal the number of cards dealt. After the dealer bids, he plays the first card of the first trick.
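The restriction on the dealer's bid can be made concrete with a small sketch. The function name `allowed_bids` is hypothetical, and the sketch assumes that a bid may be any whole number from zero up to the number of cards dealt, which the rules above imply but do not state explicitly.

```python
from typing import List


def allowed_bids(cards_dealt: int, earlier_bids: List[int], is_dealer: bool) -> List[int]:
    """List the bids a player may make in the current round."""
    bids = list(range(cards_dealt + 1))  # assumed range: 0 .. cards dealt
    if is_dealer:
        # The dealer bids last and may not bring the total to the number of cards dealt.
        forbidden = cards_dealt - sum(earlier_bids)
        bids = [bid for bid in bids if bid != forbidden]
    return bids
```

When the earlier bids already exceed the number of cards dealt, the forbidden value is negative and the dealer's choice is unrestricted.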

2.4 Scoring

At the end of a round, each player's score is calculated and added to his cumulative score. A player receives one point for each trick taken. If the number of tricks taken is exactly equal to the number bid, then the player receives an additional ten points. There is no extra bonus on top of the ten points for making a bid of zero tricks.
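As a quick illustration of the scoring rule, the round score can be written as a short function. The name `round_score` is an assumption made for this sketch.

```python
def round_score(bid: int, tricks_taken: int) -> int:
    """Score for one round: one point per trick, plus ten for an exact bid."""
    score = tricks_taken
    if tricks_taken == bid:
        score += 10  # bonus for meeting the bid exactly, including a bid of zero
    return score
```

A player who bids three and takes three tricks scores 13, while a player who bids two and takes four scores only 4.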


From this scoring structure we see that making or missing a bid is the most important component of the final score. This implies that stopping other players from making their bids will factor into an effective strategy. The ability of an opponent to stop a player from making his bid also needs to be considered.

3 Designing a Search for Whist

We can design a Whist AI as a search on a tree. Each node represents a particular order of plays. Internal nodes of the tree are orders of plays for partial games, and leaf nodes are orders of plays for complete games. A search attempts to find the child of the root that is most likely to lead to a leaf with a favorable result for the player to move. This child represents the next move to make for the current player.

3.1 A Complete Search

At a particular point in a round there is a set of cards that are known and a set of cards that are unknown. The known set consists of the cards whose location is known: the cards in your hand, the trump-determining card, and the cards that have already been played. The unknown set consists of the cards whose location is unknown. These are the cards that are in the hands of the opponents.

A complete search would consider an opponent to be able to play any unknown card. This method searches all of the possible outcomes of the game, since it covers all of the possible configurations of the opponents' hands. However, this search is very expensive.

The worst case for this search occurs in the rounds with 10 cards dealt to each player. At the level where the first opponent is to move there will be a branching factor of 41. This is the number of unknown cards, since there are 10 known cards in the AI player's hand and one known card in the trump-determining card (52 − 10 − 1 = 41). In each subsequent opponent turn, the branching factor is reduced by one, since the number of unknown cards decreases by one whenever an opponent makes a move.

An approximate number of leaves in this search tree can be calculated as

\[ 41! \times 10! \approx 10^{56} \tag{1} \]

where 41! is the number of permutations for the order of play of the opponents' cards, and 10! is the number of permutations for the order of play of the AI player's cards. This results in a very large search tree, so it is desirable to reduce it if possible.

3.2 A Simplified Search

The size of the search tree can be reduced if the branching factor at the opponents' moves can be reduced. The branching factor is high because at each opponent's turn, it is assumed that any unseen card can be played. The number of possible moves can be significantly reduced if the cards in the opponents' hands can be seen.

In order to reduce the size of the search, we make a simplifying assumption: the AI player is omniscient and can see all of the cards in all of the hands. This significantly reduces the branching factor of the search and allows the search to reach a greater depth within a reasonable amount of time. In section 5.3 we will consider how to remove the omniscience assumption.

With this assumption in place, the number of leaves in the search tree can be calculated as

\[ (10!)^5 \approx 6 \times 10^{32} \tag{2} \]

where 10! is the number of permutations for the order of play of each player's cards, raised to the 5th power since there are five players. This analysis is not entirely correct, since at each turn players must follow the lead suit and cannot truly play any card in their hand. However, equation (2) simplifies the analysis and can be considered a loose upper bound on the number of leaves.
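The two leaf-count estimates in equations (1) and (2) are easy to check numerically. The following sketch only evaluates the factorial expressions; the variable names are chosen for this example.

```python
import math

# Equation (1): complete search, 41 unknown cards and 10 cards in the AI player's hand.
complete_leaves = math.factorial(41) * math.factorial(10)

# Equation (2): omniscient search, five hands of 10 cards each.
simplified_leaves = math.factorial(10) ** 5

print(f"complete search:   {complete_leaves:.2e} leaves")
print(f"simplified search: {simplified_leaves:.2e} leaves")
print(f"reduction factor:  {complete_leaves / simplified_leaves:.2e}")
```

The first value is on the order of 10^56 and the second is a little over 6 × 10^32, so the omniscience assumption shrinks the estimate by more than twenty orders of magnitude.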

The resulting search tree is significantly smaller than the tree in the complete search. However, it is still a very large tree and cannot be searched completely. In order to make a decision for the next move in a reasonable amount of time, we can cut off the search at a particular depth and use a heuristic evaluation function.

4 Designing a Heuristic Evaluation

In order to design a heuristic evaluation function, we need to determine the factors that contribute to the value of a particular game position. Since the goal of Whist is to end with the highest score, we need to consider factors that contribute to earning points.

The biggest contributor to a player's score is the 10 points that are earned when a bid is successfully met. Thus, the primary component of the heuristic evaluation should be an estimate of the likelihood of a player making his bid. Comparing an estimate of the number of tricks that will be taken with the player's bid gives an estimate of this likelihood.

4.1 Card Classifications

In order to estimate how many tricks will be taken, we can consider the cards held in a hand and classify them according to their ability to win or lose tricks. The ability to lose tricks needs to be considered since there will be situations where a player needs to avoid going over his bid.

Conveniently, the omniscience assumption also makes it easier to classify the cards in a hand. Since we can see all cards, we can easily determine whether a card is guaranteed to win or to lose a trick if it is played as the lead card. The card is analyzed as a lead card because considering it in the general case, for any play position, results in card classifications that are too narrowly defined. For instance, a guaranteed winner in the general case will usually only be the highest trump card. This is because even if a player has the highest card of a non-trump suit, it cannot win if that card's suit was not led. Basing the classification on lead cards is not too restrictive, since the classification also applies to the case where the card is played later in a trick but the lead suit matches the card's suit.

The cards are classified into five categories. The first two categories are the aforementioned guaranteed winners and guaranteed losers. The next two categories are likely winners and likely losers. These are similar to the guaranteed winners and losers, except that the opponents' intentions are taken into consideration. If an opponent is able to win a trick but is not forced to win it, he may decide not to take the trick if he is in danger of going over his bid. Similarly, if an opponent is able to avoid winning a trick but is not forced to do so, he may still decide to take the trick if he needs more tricks to make his bid.

The likely winner category is a looser form of the guaranteed winner category, where a card is considered a winner if no player who may intend to take a trick can beat the card. The likely loser category is similarly related to the guaranteed loser category. An opponent's intentions are determined by considering how many tricks he needs to make his bid and the number of "guaranteed" category cards he holds. The "likely" category of cards is not considered when determining intentions, since this would lead to an infinite recursion.

The fifth category contains the neutral cards. These include all the cards that are not part of the first four categories. These cards are not neutral in the sense that they do not contribute at all to the ability to make a bid. They are neutral in the sense that they are flexible and can be used for either winning a trick or losing a trick. They are flexible because they are neither dominating winner cards nor weak loser cards. Thus, depending on when the player decides to use these cards, they can be put to either use.
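One possible reading of the two "guaranteed" categories can be sketched as follows. The paper does not give its exact tests, so this is an interpretation rather than the author's implementation: a led card is treated as a guaranteed winner when no opponent has a legal play that beats it, and as a guaranteed loser when at least one opponent has only legal plays that beat it. The card encoding and the names `beats` and `classify_lead` are assumptions. The "likely" and neutral categories depend on opponents' intentions and are not covered here.

```python
from typing import List, Tuple

Card = Tuple[int, str]  # (rank, suit); higher rank wins within a suit (assumed encoding)


def beats(card: Card, lead: Card, trump: str) -> bool:
    """True if `card` would beat the led card `lead` in a trick."""
    if card[1] == lead[1]:
        return card[0] > lead[0]
    return card[1] == trump  # a trump beats a non-trump lead


def classify_lead(lead: Card, opponent_hands: List[List[Card]], trump: str) -> str:
    """Classify a card as if it were led, with every opponent hand visible."""
    someone_can_beat = False
    someone_must_beat = False
    for hand in opponent_hands:
        follow = [c for c in hand if c[1] == lead[1]]
        legal = follow if follow else list(hand)  # must follow suit when possible
        beaters = [c for c in legal if beats(c, lead, trump)]
        if beaters:
            someone_can_beat = True
            if len(beaters) == len(legal):  # every legal play beats the lead
                someone_must_beat = True
    if not someone_can_beat:
        return "guaranteed winner"
    if someone_must_beat:
        return "guaranteed loser"
    return "undetermined"
```

Cards that come back as "undetermined" are the ones the paper goes on to split into likely winners, likely losers, and neutral cards.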

4.2 Calculating a Heuristic Value

A heuristic value can be obtained by estimating the resulting score for the round. The first component of this is an estimate of the likelihood of making a bid. This can be obtained by using the card classifications to estimate the resulting number of tricks over or under the bid.

We begin with the number of tricks still needed to make the bid, that is, the bid minus the tricks already taken. The number of guaranteed winner cards is subtracted from this. Next, the number of likely winner cards multiplied by 0.75 is subtracted. Not all of the likely winner cards will win, so only a fraction of those cards is counted as tricks won.

If the resulting value is positive, then there is probably a need to take additional tricks to make the bid. In this case, the number of neutral cards multiplied by 0.25 is subtracted.

If the resulting value is negative, then there is probably no need to take additional tricks, since there is a danger of going over the bid. Some small portion of the neutral cards still needs to be considered, since opponents are likely to try to force the taking of additional tricks when one is in danger of going over a bid. In this case, the number of neutral cards multiplied by 0.1 is subtracted.

This results in the following equations:

\[ t'_r = b - t - c_{gw} - 0.75\,c_{lw} \tag{3} \]

\[ t_r = \begin{cases} t'_r - 0.25\,c_n & \text{if } t'_r > 0 \\ t'_r - 0.1\,c_n & \text{if } t'_r < 0 \end{cases} \tag{4} \]

where $b$ is the bid, $t$ is the number of tricks already taken, $c_{gw}$ is the number of guaranteed winners, $c_{lw}$ is the number of likely winners, $c_n$ is the number of neutral cards, $t'_r$ is the number of tricks remaining to make the bid after the guaranteed winners and some portion of the likely winners are taken, and $t_r$ is the number of tricks remaining to be taken after considering all card categories.

Equations (3) and (4) do not use the number of guaranteed loser and likely loser cards. This is because tricks lost do not change the number of tricks needed to make a bid. The purpose of the guaranteed loser and likely loser categories is to take cards away from the other categories that do factor into the calculation.
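Equations (3) and (4) translate directly into code. In this sketch, t'_r = b − t − c_gw − 0.75·c_lw is computed first, and then 0.25·c_n or 0.1·c_n is subtracted depending on its sign. The function and parameter names are assumptions, and the case t'_r = 0, which equation (4) does not cover, is left unchanged here as a further assumption.

```python
def tricks_remaining(bid: int, tricks_taken: int, guaranteed_winners: int,
                     likely_winners: int, neutral_cards: int) -> float:
    """Estimate t_r, the tricks still needed (negative means over the bid)."""
    # Equation (3): count guaranteed winners fully and likely winners at 0.75.
    t_prime = bid - tricks_taken - guaranteed_winners - 0.75 * likely_winners
    # Equation (4): neutral cards count more when tricks are still needed.
    if t_prime > 0:
        return t_prime - 0.25 * neutral_cards
    if t_prime < 0:
        return t_prime - 0.1 * neutral_cards
    return float(t_prime)  # t'_r == 0 is not specified by equation (4); assumed unchanged
```

For example, a player who bid three, has taken one trick, and holds one guaranteed winner and four neutral cards gets t'_r = 1 and t_r = 0, so he is estimated to be on course for his bid.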

The expected value of points earned for making a bid is estimated as:

\[ p_b = 10 \cdot \frac{c_h - t_r}{c_h} \tag{5} \]

where $c_h$ is the number of cards remaining in the hand. This formulation results in 10 points for meeting a bid exactly, and reduces the value with each trick over or under the bid. The number of tricks over or under is weighed against the cards remaining to play, since there is more flexibility in play when there are more cards to choose from. This flexibility makes it more likely that an unfavorable outcome can be avoided. Thus, the penalty to the estimated value of the bid points is reduced when there are more cards remaining to play.

In addition to the estimated points from making a bid, there are some additional minor components to consider. One point is added for each trick taken. One point is also added for each opponent who is likely to miss a bid. An opponent is considered likely to miss a bid if the opponent's $t_r$ value is less than -2. The choice of this value is somewhat arbitrary. It was chosen because it is usually hard to recover from a position where one is likely to take two tricks over the bid.

Taking these extra considerations into account, the final heuristic value can be obtained as follows:

\[ p_t = b - t_r \tag{6} \]

\[ h = p_b + p_t + m \tag{7} \]

where $p_t$ is the estimated number of points from tricks taken, $m$ is the estimated number of opponents that will miss their bids, and $h$ is the final heuristic value.
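Equations (5) to (7) can be combined into a single function, sketched below. One point needs care: equation (5) as printed uses t_r directly, but the prose says the value is reduced for each trick over or under the bid, so this sketch uses the absolute value of t_r. That choice is an assumption, not something the paper states, as are the function and parameter names.

```python
from typing import Iterable


def heuristic_value(bid: int, t_r: float, cards_in_hand: int,
                    opponent_t_r: Iterable[float]) -> float:
    """Combine the bid estimate, trick points, and opponents' likely misses."""
    # Equation (5), with |t_r| assumed so that being over or under both reduce the value.
    p_b = 10 * (cards_in_hand - abs(t_r)) / cards_in_hand
    # Equation (6): estimated points from tricks taken.
    p_t = bid - t_r
    # One point per opponent whose t_r is below the -2 threshold given in the text.
    m = sum(1 for value in opponent_t_r if value < -2)
    # Equation (7).
    return p_b + p_t + m
```

The function expects at least one card in hand, since equation (5) divides by the number of cards remaining.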

4.3 Choosing Card Category Weights

The weights of the card categories were obtained by testing various weights in a few sample games. The values chosen for consideration were based on what I thought was reasonable for the outcomes of each card type. However, these choices are still fairly arbitrary.

A more rigorous approach would create multiple instances of the Whist AI with different category weights and allow the instances to compete against each other. This would provide more empirical evidence as to which weights are more effective. However, this may not give results that actually work well against human players, since play against AI opponents may be biased toward idiosyncrasies of the AI's style of play. This approach toward obtaining the card category weights was not taken due to time constraints.

5 Determining the Next Move

The next move is determined by performing a search on the game tree and obtaining values for each of the children of the root. The children of the root node represent each of the possible cards that can be played for the next move.

5.1 Search Methodology

The search is performed using the Max^n algorithm [3]. This is similar to a minimax search, but adapted for handling more than two players. In a game with n players, the evaluation of each node gives an n-tuple. For the five-player game studied here, the n-tuple is a five-element array of values where each element corresponds to the evaluation of the node from the viewpoint of one player. Each player's goal is to maximize the value of his element of the tuple.

When a node is evaluated, each of the children's values is considered, and the entire tuple of the child with the highest evaluation for the current player is backed up to the current node. The child values are obtained from the previously described heuristic evaluation function if the search did not reach the leaves. If the leaves are reached, then the child value is obtained by calculating the actual score for the current player. One point is given for each trick taken and ten points are given for making the player's bid. In order to adjust for the desire to make opponents miss their bids, an additional point is given for each opponent who fails to make his bid.

After the Max^n search is performed to a fixed depth, the root node will have backed-up values for its children. These values represent the estimated value of each of the possible moves that can be made. The card that should be played next is the card that is played in the child node with the highest backed-up value.
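The backing-up rule of Max^n can be illustrated on a small hand-built tree. This is a generic sketch of the algorithm on a toy three-player tree, not the Whist search itself: the dictionary node layout, the helper names `leaf` and `inner`, and the `evaluate` callback are all assumptions made for the example.

```python
def leaf(*value):
    """A terminal node holding one value per player."""
    return {"player": None, "children": {}, "value": value}


def inner(player, **children):
    """An internal node where `player` chooses among named moves."""
    return {"player": player, "children": children, "value": None}


def maxn(node, depth, evaluate):
    """Return the value tuple backed up to `node`."""
    if not node["children"] or depth == 0:
        return evaluate(node)  # leaf score, or heuristic at the depth cutoff
    player = node["player"]
    best = None
    for child in node["children"].values():
        value = maxn(child, depth - 1, evaluate)
        # Keep the whole tuple of the child that is best for the player to move.
        if best is None or value[player] > best[player]:
            best = value
    return best


def best_move(root, depth, evaluate):
    """Pick the root move whose backed-up tuple is best for the root player."""
    player = root["player"]
    return max(root["children"],
               key=lambda move: maxn(root["children"][move], depth - 1, evaluate)[player])


tree = inner(0,
             a=inner(1, x=leaf(5, 2, 1), y=leaf(1, 6, 0)),
             b=inner(1, x=leaf(3, 3, 3), y=leaf(4, 1, 2)))
print(best_move(tree, 10, lambda node: node["value"]))
```

In this toy tree player 1 would answer move `a` with `y`, leaving player 0 with only 1 point, so player 0 prefers move `b`, where player 1's best reply still leaves him 3.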

5.2 Search Optimization Considerations

As a round of Whist progresses, the branching factor of the search tree decreases, because as cards are played, the players have fewer cards in their hands to choose from. Thus, for a given amount of time, we can usually search deeper when there are fewer cards in the players' hands. The AI takes advantage of this by adapting the search depth to the number of cards in the players' hands.

During the last trick each player has only one card in his hand. This results in forced moves, so there is no need to search at the bottom five levels of the search tree. When the search reaches a node of height six, the last five moves are played and the final game state is evaluated at the leaf node. This value is then backed up to the height six node.

In some situations, a player may have only one legal move. This can be the case when a player has only one card of the lead suit. When this occurs, the search need not be performed since there is no choice of move.

5.3 Removing the Omniscience Assumption

The preceding analysis of heuristics and search methodology has been made under the assumption of an AI player with an omniscient view of the game. In a real game of Whist we cannot assume that we can cheat and see all cards.

In order to remove the omniscience assumption we can take a probabilistic approach and consider which card is most likely to lead to a favorable result. While a round is in progress we can remember all of the cards that have been played. We also know the cards in our hand and the card that was used to determine the trump suit. This leaves a set of cards whose locations are currently unknown.

A Monte-Carlo method can then be used to arrive at an educated guess as to which card will lead to a favorable result. The set of unknown cards is used to randomly create the opponents' hands. The previously described search is used to determine the best card to play in this game configuration. This is repeated many times with different random deals of the opponents' cards. A count is kept of how many times each of the AI player's cards was chosen as the best next move. When this process is complete, the move to make is the card that was chosen the most times as the best next move.
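The sampling-and-voting loop described above can be sketched as follows. The search itself is passed in as a callback, so `search`, `deal_unknown`, and `monte_carlo_move` are hypothetical names, and the fixed random seed is only there to make the sketch repeatable.

```python
import random
from collections import Counter


def deal_unknown(unknown_cards, hand_sizes, rng):
    """Randomly split the unknown cards into opponent hands of the given sizes."""
    cards = list(unknown_cards)
    rng.shuffle(cards)
    hands, start = [], 0
    for size in hand_sizes:
        hands.append(cards[start:start + size])
        start += size
    return hands  # any cards left over stay undealt


def monte_carlo_move(my_hand, unknown_cards, hand_sizes, search, iterations, seed=0):
    """Return the card chosen most often as best over many random deals."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(iterations):
        opponent_hands = deal_unknown(unknown_cards, hand_sizes, rng)
        # `search` stands in for the omniscient search over one sampled deal.
        votes[search(my_hand, opponent_hands)] += 1
    return votes.most_common(1)[0][0]
```

A caller would supply a `search(my_hand, opponent_hands)` function that runs the fixed-depth search on the sampled deal and returns the card it prefers.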

5.4 Search Depth vs. Deal Variations

The quality of the move chosen is affected by the depth of each search and the number of iterations in the Monte-Carlo simulation. Thinking time can be allocated to either of these processes. In order to improve the quality of the move chosen, we need to consider how to allocate the thinking time.

When there are many cards remaining in each hand, the number of iterations is favored. In this case there are many more possible configurations of the opponents' hands, so more samples are needed to arrive at a reasonable result. Searching deeper is likely to be very expensive due to the high branching factor when there are many cards in each player's hand. Also, the heuristic evaluations are likely to be fairly inaccurate since there are many turns left in the game. Thus, running more iterations is likely to be more valuable than searching deeper.

When there are few cards remaining in each hand, the search depth is favored. In this case there are fewer possible configurations of the opponents' hands, so fewer samples give coverage similar to the case with more cards and more samples. Searching deeper will result in the search reaching the leaves more often. These positions are more valuable since the information is based on a final game state and not a heuristic estimate. Also, even when a leaf is not reached, the heuristic evaluations are more accurate since there are few turns left in the game.

With these considerations in mind, the search depth and the number of iterations of the Monte-Carlo simulation are adjusted based on the number of cards remaining in each hand.

6 Evaluation Methodology

Since the variation of Whist studied in this paper is not the official standard of Whist, there is no known existing program that plays this game. Thus, a somewhat subjective method is needed to evaluate the effectiveness of the Whist AI.

The evaluation was performed by having the AI play games against a focus group of four human players. This group consisted of players who are proficient at playing the studied variation of Whist. Comments were taken from the human players concerning blunders or brilliant moves made by the AI. The frequency of the AI winning rounds was also observed. The reasoning behind the AI's moves was observed by analyzing a report generated by the AI at each move. This report contained evaluations of each possible next move and evaluations of each opponent's position. This gives some insight into the reasoning behind the AI's moves and whether it is making decisions based on faulty inferences.

7 Results

Play against the focus group suggests that the AI plays at the level of an average player. The AI plays acceptably well, but does not dominate the game and win consistently. However, it also does not fall hopelessly behind in score.

The test trials were possibly affected by opponent bias against the AI player. It is possible that the human players were focused on defeating the AI player and stopping its bids. In this case the AI player may have been hindered by not being paranoid enough with respect to the opponents' actions.

8 Possible Improvements<br />

There are a number of different aspects that c<strong>an</strong> be improved<br />

to increase the AI player's strength.<br />

8.1 Heuristic Improvements<br />

The heuristic evaluation can possibly be improved in a number<br />

of ways. The weights of the various card categories can<br />

be adjusted as previously described. The neutral card category<br />

could be divided into neutral winners and<br />

neutral losers. The split can be determined by considering such<br />

things as a card's rank relative to the other remaining cards of<br />

the same suit, or how many cards of the same suit have already<br />

been played.<br />
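One way the winner/loser split could be sketched is shown below. This is only an illustration, not the paper's implementation: the function name `split_neutral`, the integer rank encoding (2 to 14, ace high), and the `threshold` value are all assumptions. It also treats every card of the suit that is neither held nor played as still in play, which ignores cards left undealt.

```python
def split_neutral(card_rank, own_ranks, played_ranks, threshold=0.5):
    """Label a neutral card by the share of unseen same-suit cards it beats.

    Ranks are integers 2..14 (ace high). `threshold` is an assumed tuning value.
    """
    # Cards of this suit that are neither in our hand nor already played.
    unseen = set(range(2, 15)) - set(own_ranks) - set(played_ranks)
    if not unseen:
        # No other card of this suit can appear, so nothing in suit beats it.
        return "neutral_winner"
    beaten = sum(1 for rank in unseen if rank < card_rank)
    if beaten / len(unseen) >= threshold:
        return "neutral_winner"
    return "neutral_loser"
```

For example, a ten becomes a neutral winner once the ace, king, queen, and jack of its suit have been played, while a five with nothing played in its suit stays a neutral loser.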

Protected high cards can also be considered. If a player<br />

has low cards of the same suit as a high card, the high card<br />

is not as dangerous when trying to avoid going over a bid.<br />

This is because the player can use the low card when forced<br />

to play the suit and try to get rid of the high card at a later<br />

point when the suit is not led.<br />
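A minimal sketch of how protection might be detected follows. The hand representation as `(suit, rank)` tuples, the function name `unprotected_high_cards`, and the `high_cutoff` of 11 (jack and above) are assumptions made for illustration; the paper does not define them.

```python
def unprotected_high_cards(hand, high_cutoff=11):
    """Return high cards held in suits where the hand has no low card.

    `hand` is a list of (suit, rank) tuples with ranks 2..14 (ace high).
    `high_cutoff` is an assumed definition of a "high" card.
    """
    # Suits in which at least one low card is available as cover.
    low_suits = {suit for suit, rank in hand if rank < high_cutoff}
    return [(suit, rank) for suit, rank in hand
            if rank >= high_cutoff and suit not in low_suits]
```

A heuristic could then weight only the cards returned here as dangerous when the player is trying to avoid extra tricks.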

Improvements in the heuristic can lead to more intelligent<br />

choices of moves.<br />

8.2 Intelligent Dealing of Opponent H<strong>an</strong>ds<br />

The opponent hands can possibly be dealt with more intelligence<br />

than just a random deal. The cards can be dealt to<br />

match the bids that were made by the opponents. However,<br />

care must be taken, because an opponent's<br />

tricks remaining can be misleading if one of his winner or<br />

loser cards gave the opposite result, that is, a winner lost<br />

a trick or a loser took one. The number of tricks remaining<br />

is then incorrect for that opponent<br />

player.<br />

It is possible to adjust for this by observing the cards<br />

played and attempting to predict when an opponent gets the<br />

wrong result for a card played. However, this is quite a<br />

complicated endeavor that time restrictions made impractical.<br />
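The basic idea of dealing hands that match the opponents' bids could be sketched with rejection sampling, as below. This was not implemented in the paper; it is an illustration under stated assumptions. The trick estimator `estimate_tricks` is a deliberately crude, hypothetical stand-in, and `tolerance` and `max_tries` are assumed parameters. The sketch does not attempt the adjustment for cards that gave the wrong result.

```python
import random


def estimate_tricks(hand, trump):
    """Crude, assumed estimate: count aces, kings, and trumps ranked 10 or higher."""
    return sum(1 for suit, rank in hand
               if rank >= 13 or (suit == trump and rank >= 10))


def random_deal(cards, hand_sizes, rng):
    """Shuffle the unseen cards and cut them into hands of the given sizes."""
    shuffled = list(cards)
    rng.shuffle(shuffled)
    hands, start = [], 0
    for size in hand_sizes:
        hands.append(shuffled[start:start + size])
        start += size
    return hands


def deal_matching_bids(unseen, hand_sizes, tricks_needed, trump, rng,
                       tolerance=1, max_tries=1000):
    """Redeal until every hand's estimate is within `tolerance` of its target."""
    for _ in range(max_tries):
        hands = random_deal(unseen, hand_sizes, rng)
        if all(abs(estimate_tricks(hand, trump) - need) <= tolerance
               for hand, need in zip(hands, tricks_needed)):
            return hands
    # No consistent deal found: fall back to a plain random deal.
    return random_deal(unseen, hand_sizes, rng)
```

Here `tricks_needed` holds, for each opponent, the bid minus the tricks already taken. The fallback keeps the simulation running when no sampled deal satisfies all targets.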

8.3 Automated Bidding<br />

The issue of automated bidding was considered. A possible<br />

algorithm is to treat each possible bid as a separate<br />

search and obtain the backed up value at the root of each of<br />

these searches. The best bid is then the bid made in the<br />

search with the best backed up value.<br />
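The selection step described above can be sketched as follows. The callable `search_value` is a hypothetical placeholder for running the game tree search with a fixed bid and returning the backed up root value; the name `choose_bid` is likewise an assumption.

```python
def choose_bid(possible_bids, search_value):
    """Return (bid, value) for the bid whose search has the best root value.

    `search_value(bid)` is a hypothetical callable that runs one search with
    the AI's bid fixed and returns the backed up value at the root.
    """
    best_bid, best_value = None, float("-inf")
    for bid in possible_bids:
        value = search_value(bid)
        # Strict comparison: on ties the earlier bid is kept.
        if value > best_value:
            best_bid, best_value = bid, value
    return best_bid, best_value
```

With a toy evaluator such as `lambda bid: -abs(bid - 2)`, calling `choose_bid(range(5), ...)` selects a bid of 2. The quality of the chosen bid depends entirely on what the evaluation function takes into account.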

This approach was attempted but the results were not<br />

very effective. The AI tended to make bids that were not<br />

very reasonable for the cards in its hand. This is the result of<br />

not considering other important factors when making a bid.<br />

These other factors could include low cards protecting high<br />

cards, long suits, short suits, and the ability to give or take the lead.<br />

These are considerations that affect an intelligent bid but are<br />

not considered in the heuristic evaluation function.<br />

In order to focus on the AI's card play ability, I chose to<br />

manually enter the AI player's bid. This removes the<br />

concern of bid quality affecting the card play results. This is<br />

not too unreasonable for an evaluation method based on<br />

play with humans, since there is no need to quickly make<br />

many bids.<br />

9 Conclusions<br />

A somewhat respectable AI Whist player was produced. It is<br />

not a dominating player, but it plays reasonably well and<br />

leaves much room for future improvements. The card classification<br />

system appeared to be effective for Whist. This<br />

suggests possible applications in other trick taking games.<br />

The method of peeking at opponent cards combined with the<br />

Monte-Carlo simulation appeared to be effective for Whist.<br />

This suggests possible applications in other imperfect information<br />

card games.<br />

References<br />

[1] Wikipedia, "Oh Hell,"<br />

http://en.wikipedia.org/wiki/Oh_Hell<br />

[2] Ginsberg, M. L., "GIB research page,"<br />

http://www.cirl.uoregon.edu/ginsberg/gibresearch.html<br />

[3] Luckhardt, C. A., and Irani, K. B., "An algorithmic solution<br />

of N-person games," Proceedings of the National<br />

Conference on Artificial Intelligence (AAAI-86), Philadelphia,<br />

PA, August, 1986, pp. 158-162
