Developing an Artificial Intelligence for Whist - Fongboy.com
Developing an Artificial Intelligence for Whist<br />
Abstract<br />
This paper details the development of a simple artificial<br />
intelligence for playing the card game Whist.<br />
A brief introduction to the game is given, followed by<br />
the decisions that were<br />
made in the design of the AI. The results have some<br />
possible applications toward other similar games.<br />
1 Introduction<br />
Whist is an old card game that has been falling into obscurity<br />
due to the popularity of the game Bridge. However,<br />
Whist is still an interesting game to study because of its<br />
differences from Bridge. Whist is a true multiplayer game<br />
with more than two players. Also, bids must be met<br />
exactly, which adds the complication of avoiding going<br />
over a bid.<br />
There are many variations of the game Whist and currently<br />
there is no widely accepted official version of the<br />
game. The version studied in this paper is similar to the<br />
variation known as "Oh Hell!" [1].<br />
Prior work in this area includes the Ginsberg Intelligent<br />
Bridge Player (GIB) [2]. Although Bridge is significantly<br />
different from Whist, the trick-taking structure of Bridge<br />
makes some of the techniques used in GIB<br />
useful for Whist as well.<br />
2 Rules of <strong>Whist</strong><br />
Since Whist is a game with many variations, the rules of<br />
the variation studied in this paper need some<br />
explanation.<br />
2.1 General Game Structure<br />
This variation of Whist is played by five players<br />
with no partnerships. A set number of rounds are played in<br />
each game. The players take turns dealing, with the dealer<br />
position passing in clockwise order. The number of cards<br />
dealt to each player varies with each round. In the first<br />
round, each player is dealt ten cards. In each subsequent<br />
round, one fewer card is dealt to each player. This continues<br />
down to one card and then repeats from one card back to ten<br />
cards. After all players' cards are dealt, the next card in the<br />
deck is turned face up. The suit of this card determines the<br />
trump suit for the round. Each player receives a score at the<br />
end of each round. The winner is the player with the highest<br />
cumulative score at the end of the 20 rounds.<br />
Jason Fong<br />
University of California, Los Angeles<br />
Computer Science<br />
jfong@cs.ucla.edu<br />
2.2 Turn Structure<br />
The rules of play in each turn are similar to other trick-taking<br />
games such as Bridge and Spades. In a turn, each player<br />
plays one card, going in clockwise order. The first player in<br />
a turn may play any card in his hand. This card determines<br />
the lead suit, and the subsequent players must play a card of<br />
the same suit if possible. If a player does not have a card of<br />
the same suit, then any card may be played.<br />
The player that played the highest card of the lead suit<br />
wins the trick. However, if any card of the trump<br />
suit was played, then the highest card of the trump suit<br />
takes precedence and wins the trick. The winner of a trick<br />
plays the first card of the next trick.<br />
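The follow-suit rule and the trick-winner rule above can be sketched in a few lines.<br />
This is a minimal illustration, not the paper's implementation: the card encoding<br />
(a `(rank, suit)` tuple with ranks 2 to 14, ace high) and the function names<br />
`legal_moves` and `trick_winner` are assumptions made for the example.<br />

```python
from typing import List, Optional, Tuple

# Assumed encoding: (rank, suit), rank 2..14 with ace high, suit in "CDHS".
Card = Tuple[int, str]


def legal_moves(hand: List[Card], lead_suit: Optional[str]) -> List[Card]:
    """Cards a player may play: follow the lead suit if possible, else anything."""
    if lead_suit is None:  # this player leads the trick
        return list(hand)
    same_suit = [card for card in hand if card[1] == lead_suit]
    return same_suit if same_suit else list(hand)


def trick_winner(plays: List[Card], trump: str) -> int:
    """Index (in play order) of the card that wins the trick."""
    lead_suit = plays[0][1]

    def strength(card: Card) -> Tuple[int, int]:
        rank, suit = card
        if suit == trump:
            return (2, rank)  # any trump beats any non-trump
        if suit == lead_suit:
            return (1, rank)  # otherwise the highest card of the lead suit wins
        return (0, rank)      # off-suit discards can never win

    return max(range(len(plays)), key=lambda i: strength(plays[i]))


trick = [(10, "H"), (14, "H"), (2, "S"), (13, "D"), (3, "H")]
print(trick_winner(trick, trump="S"))  # the 2 of spades trumps the hearts
print(trick_winner(trick, trump="C"))  # no trump played, the ace of hearts wins
```

Because `legal_moves` returns a single card when a player holds exactly one card of<br />
the lead suit, it can also be used to detect forced moves.<br />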
2.3 Bidding<br />
After the cards for a round are dealt and before play begins,<br />
each player looks at his cards and makes a bid declaring<br />
how many tricks he will take in the round. The goal of<br />
each round is to take the exact number of tricks that were<br />
bid. Taking too few or too many tricks is considered a<br />
failure to meet the bid.<br />
The sum of all bids is not permitted to equal the number<br />
of cards dealt to each player in the current round. The effect<br />
of this is to ensure that at least one player will not be able to<br />
meet his bid.<br />
Bidding starts with the person following the dealer and<br />
proceeds in clockwise order. Since the dealer is the last to<br />
bid, his bid may be restricted because it cannot make<br />
the sum of the bids equal the number of cards dealt. After the dealer<br />
bids, he plays the first card of the first trick.<br />
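The restriction on the dealer's bid can be made concrete with a short sketch.<br />
The function name `allowed_bids` is hypothetical, and the bid range of zero up to<br />
the number of cards dealt is an assumption (the rules above only state that a bid<br />
of zero is possible).<br />

```python
from typing import List


def allowed_bids(cards_dealt: int, earlier_bids: List[int], is_dealer: bool) -> List[int]:
    """Bids a player may make; only the dealer (last to bid) is restricted."""
    bids = list(range(cards_dealt + 1))  # assumed range: 0..cards_dealt
    if is_dealer:
        # The one bid that would make all bids sum to the number of cards dealt.
        forbidden = cards_dealt - sum(earlier_bids)
        bids = [bid for bid in bids if bid != forbidden]
    return bids


# Three cards dealt, earlier bids sum to 2: the dealer may not bid 1.
print(allowed_bids(3, [1, 0, 1, 0], is_dealer=True))
# Earlier bids already exceed the cards dealt: the dealer is unrestricted.
print(allowed_bids(3, [2, 2, 0, 0], is_dealer=True))
```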
2.4 Scoring<br />
At the end of a round, each player's score is calculated and<br />
added to his cumulative score. A player receives one point<br />
for each trick taken. If the number of tricks taken is exactly<br />
equal to the number bid, then the player receives an additional<br />
ten points. There is no extra bonus on top of the ten<br />
points for making a bid of zero tricks.
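<br />
As a small worked example of the scoring rule, the sketch below computes a player's<br />
score for one round. The function name `round_score` is chosen for illustration only.<br />

```python
def round_score(bid: int, tricks_taken: int) -> int:
    """One point per trick, plus ten points if the bid was met exactly."""
    bonus = 10 if tricks_taken == bid else 0
    return tricks_taken + bonus


print(round_score(bid=3, tricks_taken=3))  # 13: three tricks plus the bonus
print(round_score(bid=0, tricks_taken=0))  # 10: a zero bid earns only the bonus
print(round_score(bid=2, tricks_taken=3))  # 3: one trick over, no bonus
```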
From this scoring structure we see that making or missing<br />
a bid is the most important component of the final score.<br />
This implies that stopping other players from making their<br />
bids will factor into an effective strategy. Also, the ability of<br />
an opponent to stop a player from making his bid needs to<br />
be considered.<br />
3 Designing a Search for Whist<br />
We can design a Whist AI as a search on a tree. Each node<br />
represents a particular sequence of plays. Internal nodes of the tree<br />
are sequences of plays for partial games. Leaf nodes are sequences of<br />
plays for complete games. A search attempts to find<br />
the child of the root that is most likely to lead to a leaf with a favorable<br />
result for the player to move. This child represents the next<br />
move to make for the current player.<br />
3.1 A Complete Search<br />
At a particular point in a round there is a set of cards that<br />
are known and a set of cards that are unknown. The known<br />
set consists of the cards whose location is known. These are<br />
the cards in the AI player's hand, the trump-determining<br />
card, and the cards that have already been played. The unknown<br />
set consists of the cards whose location is unknown. These<br />
are the cards that are in the hands of the opposing players.<br />
A complete search would consider an opponent to be able<br />
to play any unknown card. This method<br />
searches all of the possible outcomes of the game since it covers<br />
all of the possible configurations of the opponents'<br />
hands. However, this search is very expensive.<br />
The worst case for this search occurs in the rounds with<br />
10 cards dealt to each player. At the level where the first<br />
opponent is to move there will be a branching factor of 41.<br />
This is the number of unknown cards, since of the 52 cards there are 10<br />
known cards in the AI player's hand and one known card in<br />
the trump-determining card. In each subsequent opponent<br />
turn, the branching factor will reduce by one since the number<br />
of unknown cards will decrease by one whenever an<br />
opponent makes a move.<br />
An approximate number of leaves in this search tree can<br />
be calculated as<br />
$$41! \times 10! \approx 10^{56} \tag{1}$$<br />
where 41! is the number of permutations for the order of<br />
play of the opponents' cards, and 10! is the number of permutations<br />
for the order of play of the AI player's cards. This results<br />
in a very large search tree, so it is desirable to reduce<br />
this if possible.<br />
3.2 A Simplified Search<br />
The size of the search tree can be reduced if the branching<br />
factor at the opponents' moves can be reduced. The<br />
branching factor is high because at each opponent's turn, it<br />
is assumed that any unseen card can be played. The number<br />
of possible moves can be significantly reduced if the cards<br />
in the opponents' hands can be seen.<br />
In order to reduce the size of the search, we make a<br />
simplifying assumption: the AI player is<br />
omniscient and can see all of the cards in all the hands. This<br />
significantly reduces the branching factor of the search<br />
and allows the search to reach a greater depth within a reasonable<br />
amount of time. In section 5.3 we will consider how<br />
to remove the omniscience assumption.<br />
With this assumption in place, the number of leaves in the<br />
search tree can be calculated as:<br />
$$(10!)^{5} \approx 6 \times 10^{32} \tag{2}$$<br />
where 10! is the number of permutations for the order of<br />
play of each player's cards, raised to the fifth power since<br />
there are five players. This analysis is not entirely correct<br />
since at each turn players must follow the lead suit and cannot<br />
truly play any card in their hand. However, the expression<br />
in (2) simplifies the analysis and can be considered a loose<br />
upper bound on the number of leaves.<br />
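The two leaf-count estimates are easy to check numerically. The short script below<br />
is only an arithmetic check of the estimates for the complete and the simplified<br />
search; the variable names are chosen for illustration.<br />

```python
import math

# Complete search: 41 unknown cards in any order, times the AI's own 10 cards.
complete_leaves = math.factorial(41) * math.factorial(10)

# Simplified (omniscient) search: each of five players orders his own 10 cards.
simplified_leaves = math.factorial(10) ** 5

print(f"complete:   {complete_leaves:.2e}")    # on the order of 10^56
print(f"simplified: {simplified_leaves:.2e}")  # roughly 6 x 10^32
print(f"reduction:  {complete_leaves / simplified_leaves:.2e}")
```

The omniscience assumption therefore shrinks the bound by more than twenty orders<br />
of magnitude, although the remaining tree is still far too large to search fully.<br />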
The resulting search tree is significantly smaller than the<br />
tree in the complete search. However, this is still a very<br />
large tree and cannot be searched completely. In order to<br />
make a decision for the next move in a reasonable amount<br />
of time, we can cut off the search at a particular depth and<br />
use a heuristic evaluation function.<br />
4 Designing a Heuristic Evaluation<br />
In order to design a heuristic evaluation function, we need to<br />
determine the factors that contribute to the value of a particular<br />
game position. Since the goal of Whist is to end with<br />
the highest score, we need to consider factors that contribute<br />
to earning points.<br />
The biggest contributor to a player's score is the 10 points<br />
that are earned when a bid is successfully met. Thus, the<br />
primary component of the heuristic evaluation<br />
should be an estimate of the likelihood of a player making<br />
his bid. An estimate of the number of tricks that will be taken,<br />
compared to the player's bid, gives an estimate<br />
of the likelihood of making the bid.<br />
4.1 Card Classifications<br />
In order to estimate how many tricks will be taken, we can<br />
consider the cards held in a hand and classify them according<br />
to their ability to win or lose tricks. The ability to lose<br />
tricks needs to be considered since there will be situations<br />
where a player needs to avoid going over his bid.<br />
Conveniently, the omniscience assumption also makes it<br />
easier to classify the cards in a hand. Since we can see all<br />
cards, we can easily determine if a card is guaranteed<br />
to win or to lose a trick if it is played as the lead card. The<br />
card is analyzed as the lead card because considering<br />
the card in the general case for any play position would<br />
result in card classifications that are too narrowly defined.<br />
For instance, a guaranteed winner in the general case will<br />
usually only be the highest trump card. This is because even<br />
if a player has the highest card of a non-trump suit, it cannot<br />
win if that card's suit was not led. Basing the classification<br />
on lead cards is not too restrictive since the classification<br />
also applies to the case where the card is played later in<br />
a trick, but the lead suit matches the card's suit.<br />
The cards are classified into five categories. The first two<br />
categories are the aforementioned guaranteed winners and
guaranteed losers. The next two categories are likely winners<br />
and likely losers. These are similar to the guaranteed<br />
winners and losers, except that the opponents' intentions are<br />
taken into consideration. If an opponent is able to win a<br />
trick but is not forced to win it, he may not actually decide<br />
to take the trick if he is in danger of going over his bid.<br />
Similarly, if an opponent is able to avoid winning a trick but<br />
is not forced to do so, he may still decide to take the trick if<br />
he needs more tricks to make his bid.<br />
The likely winner category is a looser form of the guaranteed<br />
winner category where a card is considered a winner if<br />
no player that may intend to take a trick can beat the card.<br />
The likely loser category is similarly related to the guaranteed<br />
loser category. An opponent's intentions are determined<br />
by considering how many tricks he needs to make his bid<br />
and how many cards he holds in the "guaranteed" categories. The<br />
"likely" categories are not considered here since this would<br />
lead to an infinite recursion.<br />
The fifth category contains the neutral cards. These include<br />
all the cards that are not part of the first four categories.<br />
These cards are not neutral in the sense that they do not<br />
contribute at all to the ability to successfully make a bid.<br />
They are neutral in the sense that they are flexible and can<br />
be used for either winning a trick or losing a trick. They are<br />
flexible because they are neither dominating winner cards nor<br />
weak loser cards. Thus, depending on when the player decides<br />
to use these cards, they can be put to either use.<br />
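The two "guaranteed" categories can be sketched under the omniscience assumption.<br />
The code below is one possible reading of the definitions, not the paper's code: a<br />
led card is treated as a guaranteed winner if no opponent has a legal reply that beats<br />
it, and as a guaranteed loser if some opponent can only reply with cards that beat it.<br />
The `(rank, suit)` encoding, the function names, and the label `"other"` (standing in<br />
for the likely and neutral categories, which need the opponents' bids) are assumptions.<br />

```python
from typing import List, Tuple

Card = Tuple[int, str]  # assumed encoding: (rank, suit), higher rank wins


def legal_replies(hand: List[Card], lead_suit: str) -> List[Card]:
    """An opponent must follow the lead suit if he can."""
    same_suit = [card for card in hand if card[1] == lead_suit]
    return same_suit if same_suit else list(hand)


def classify_lead(card: Card, opponent_hands: List[List[Card]], trump: str) -> str:
    """Classify `card` as if it were led, with all opponent hands visible."""
    rank, suit = card

    def beats(other: Card) -> bool:
        other_rank, other_suit = other
        if other_suit == suit:
            return other_rank > rank
        return other_suit == trump  # a trump beats a led non-trump card

    someone_can_beat = False
    someone_must_beat = False
    for hand in opponent_hands:
        replies = legal_replies(hand, suit)
        if not replies:
            continue
        winning = [reply for reply in replies if beats(reply)]
        if winning:
            someone_can_beat = True
        if len(winning) == len(replies):
            someone_must_beat = True

    if not someone_can_beat:
        return "guaranteed_winner"
    if someone_must_beat:
        return "guaranteed_loser"
    return "other"


opponents = [[(2, "H"), (3, "S")], [(5, "D")]]
print(classify_lead((14, "H"), opponents, trump="S"))
```

In this example the first opponent must follow with the 2 of hearts and the second<br />
holds neither hearts nor spades, so the ace of hearts cannot be beaten when led.<br />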
4.2 Calculating a Heuristic Value<br />
A heuristic value can be obtained by estimating the resulting<br />
score for the round. The first component of this is an estimate<br />
of the likelihood of making a bid. This can be obtained<br />
by using the card classifications to estimate the resulting<br />
number of tricks over or under the bid.<br />
We begin with the number of tricks left to make the bid. The number<br />
of guaranteed winner cards is subtracted from this. Next, the<br />
number of likely winner cards multiplied by 0.75 is subtracted.<br />
Not all of the likely winner cards will win, so only<br />
a fraction of those cards is counted as tricks<br />
won.<br />
If the resulting value is positive, then there is probably a<br />
need to take additional tricks to make the bid. In this case,<br />
the number of neutral cards multiplied by 0.25 is subtracted.<br />
If the resulting value is negative, then there is probably<br />
no need to take additional tricks since there is a danger of<br />
going over the bid. Some small portion of the neutral cards<br />
still needs to be considered since opposing players are<br />
likely to try to force the taking of additional tricks when one<br />
is in danger of going over a bid. In this case, the number of<br />
neutral cards multiplied by 0.1 is subtracted.<br />
This results in the following equations:<br />
$$t'_r = b - t - c_{gw} - 0.75\,c_{lw} \tag{3}$$<br />
$$t_r = \begin{cases} t'_r - 0.25\,c_n & \text{if } t'_r > 0 \\ t'_r - 0.1\,c_n & \text{if } t'_r < 0 \end{cases} \tag{4}$$<br />
where $b$ is the bid, $t$ is the number of tricks already taken,<br />
$c_{gw}$ is the number of guaranteed winners, $c_{lw}$ is the number<br />
of likely winners, $c_n$ is the number of neutral cards, $t'_r$ is the<br />
number of tricks remaining to take to make the bid after guaranteed<br />
winners and some portion of the likely winners are<br />
taken, and $t_r$ is the number of tricks remaining to be taken<br />
after considering all card categories.<br />
Equations (3) and (4) do not use the number of guaranteed<br />
loser and likely loser cards. This is<br />
because tricks lost do not change the number of tricks<br />
needed to make a bid. The purpose of the guaranteed loser<br />
and likely loser categories is to take away cards from the<br />
other categories that do factor into the calculation.<br />
The expected value of points earned for making a bid is<br />
estimated as:<br />
$$p_b = 10 \cdot \frac{c_h - t_r}{c_h} \tag{5}$$<br />
where $c_h$ is the number of cards remaining in the hand. This<br />
formulation results in 10 points for meeting a bid exactly,<br />
and reduces the value with each trick over or under the bid.<br />
The number of tricks over or under is weighed against the<br />
cards remaining to play since there is more flexibility in<br />
play when there are more cards to choose from. This flexibility<br />
makes it more likely that an unfavorable outcome can<br />
be avoided. Thus, the penalty to the estimated value of the<br />
bid points is reduced when there are more cards remaining<br />
to play.<br />
In addition to the estimated points from making a bid,<br />
there are some additional minor components to consider.<br />
One point is added for each trick taken. One point is also<br />
added for each opponent who is likely to miss a bid. An<br />
opponent is considered likely to miss a bid if the opponent's<br />
$t_r$ value is less than -2. The choice of this value is<br />
somewhat arbitrary. It was chosen since it is usually hard to recover<br />
from a position where one is likely to take two tricks over<br />
the bid.<br />
Taking these extra considerations into account, the final heuristic<br />
value is obtained as follows:<br />
$$p_t = b - t_r \tag{6}$$<br />
$$h = p_b + p_t + m \tag{7}$$<br />
where $p_t$ is the estimated number of points from tricks taken,<br />
$m$ is the estimated number of opponents that will miss their<br />
bids, and $h$ is the final heuristic value.<br />
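Equations (3) through (7) translate directly into code. The sketch below follows the<br />
equations as written; the function and parameter names are chosen for illustration.<br />
Two points are assumptions rather than part of the paper: the case where the<br />
intermediate value is exactly zero is not covered by equation (4) and is left<br />
unadjusted here, and the hand is assumed to contain at least one card. Note also that,<br />
taken literally, equation (5) gives more than 10 when the tricks-remaining value is<br />
negative; a variant using its absolute value would match the prose more closely, but<br />
that is not what the equation states.<br />

```python
def tricks_remaining(bid: int, taken: int, c_gw: int, c_lw: int, c_n: int) -> float:
    """Equations (3) and (4): estimated tricks still needed to make the bid."""
    t_prime = bid - taken - c_gw - 0.75 * c_lw  # equation (3)
    if t_prime > 0:
        return t_prime - 0.25 * c_n             # still short of the bid
    if t_prime < 0:
        return t_prime - 0.1 * c_n              # in danger of going over
    return t_prime  # assumption: exactly zero is left unadjusted


def heuristic(bid: int, taken: int, c_gw: int, c_lw: int, c_n: int,
              cards_in_hand: int, opponents_missing: int) -> float:
    """Equations (5) to (7): estimated value of a position for one player."""
    t_r = tricks_remaining(bid, taken, c_gw, c_lw, c_n)
    p_b = 10 * (cards_in_hand - t_r) / cards_in_hand  # equation (5), as written
    p_t = bid - t_r                                   # equation (6)
    return p_b + p_t + opponents_missing              # equation (7)


# Bid 3, one trick taken, one guaranteed winner, four neutral cards, six cards left.
print(heuristic(bid=3, taken=1, c_gw=1, c_lw=0, c_n=4,
                cards_in_hand=6, opponents_missing=1))
```

In the example the intermediate value is 1, the four neutral cards bring the estimate<br />
to 0, and the result is 10 + 3 + 1 = 14.<br />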
4.3 Choosing Card Category Weights<br />
The weights of the card categories were obtained by testing<br />
various weights in a few sample games. The values chosen<br />
for consideration were based on what I thought was reasonable<br />
for the outcomes of each card type. However, these<br />
choices are still fairly arbitrary.<br />
A more rigorous approach would create multiple instances<br />
of the Whist AI with different category weights and<br />
allow the instances to compete against each other. This<br />
would provide more empirical evidence as to which weights<br />
are more effective. However, this may not give results that<br />
actually work well against human players since play<br />
against AI opponents may be biased toward idiosyncrasies
of the AI's style of play. This approach toward obtaining the<br />
card category weights was not taken due to time constraints.<br />
5 Determining the Next Move<br />
The next move is determined by performing a search on the<br />
game tree and obtaining values for each of the children of<br />
the root. The children of the root node represent each of the<br />
possible cards that can be played for the next move.<br />
5.1 Search Methodology<br />
The search is performed using the Max^n algorithm [3].<br />
This is similar to a minimax search, but adapted for handling<br />
more than two players. In a game with n players, the<br />
evaluation of each node gives an n-tuple. In this five-player game, the n-tuple is a<br />
five-element array of values where each element corresponds<br />
to the evaluation of the node from the viewpoint of<br />
one player. Each player's goal is to maximize the value of<br />
his element of the tuple.<br />
When a node is evaluated, each of the children's values<br />
is considered and the entire tuple of the child with the<br />
highest evaluation for the current player is backed up to the<br />
current node. The child values are obtained from the previously<br />
described heuristic evaluation function if the search<br />
did not reach the leaves. If the leaves are reached, then the<br />
child value is obtained by calculating the actual score for the<br />
current player. One point is given for each trick taken and<br />
ten points are given for making the player's bid. In order to<br />
account for the desire to make opponents miss their bids, an<br />
additional point is given for each opposing player that fails<br />
to make his bid.<br />
After the Max^n search is performed to a fixed depth, the<br />
root node will have backed up values for its children. These<br />
values represent the estimated value of each of the possible<br />
moves that can be made. The card that should be played<br />
next is the card that is played in the child node with the<br />
highest backed up value.<br />
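The backing-up rule can be illustrated on a toy tree. This is a generic sketch of<br />
the Max^n idea, not the paper's search code: a node is either a tuple of per-player<br />
values (a leaf or a cut-off evaluation) or a list of child nodes, and players are<br />
assumed to move in fixed rotation. In actual play the winner of a trick leads the<br />
next one, so a real implementation would take the player to move from the game state.<br />

```python
from typing import List, Tuple, Union

Values = Tuple[float, ...]
Node = Union[Values, List["Node"]]  # leaf values, or a list of children


def maxn(node: Node, player: int, num_players: int) -> Values:
    """Back up the full value tuple of the child that is best for `player`."""
    if isinstance(node, tuple):  # leaf or heuristic cut-off
        return node
    best = None
    for child in node:
        # Assumption for this toy example: the next player moves in rotation.
        values = maxn(child, (player + 1) % num_players, num_players)
        if best is None or values[player] > best[player]:
            best = values
    return best


# Three players; player 0 moves at the root, player 1 at the next level.
tree = [
    [(1, 5, 2), (3, 2, 4)],  # player 1 would pick (1, 5, 2)
    [(4, 1, 1), (2, 3, 0)],  # player 1 would pick (2, 3, 0)
]
print(maxn(tree, player=0, num_players=3))  # (2, 3, 0)
```

Player 0 prefers the second branch because its backed-up tuple gives him 2 rather<br />
than 1, even though the first branch contains a leaf worth 3 to him.<br />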
5.2 Search Optimization Considerations<br />
As a round of Whist progresses, the branching factor of the<br />
search tree decreases: as cards are<br />
played, the players have fewer cards in their hands to choose<br />
from. Thus, for a given amount of time, we<br />
can usually search deeper when there are fewer cards in the<br />
players' hands. The AI takes advantage of this by adapting the<br />
search depth to the number of cards in the players' hands.<br />
During the last trick each player has only one card in his<br />
hand. This results in forced moves, so there is no need to<br />
search at the bottom five levels of the search tree. When the<br />
search reaches a node of height six, the last five moves are<br />
played and the final game state is evaluated at the leaf node.<br />
This value is then backed up to the height six node.<br />
In some situations, a player may have only one legal<br />
move. This can be the case when a player has only one card<br />
of the lead suit. When this occurs, the search need not be<br />
performed since there is no choice of move.<br />
5.3 Removing the Omniscience Assumption<br />
The preceding analysis of heuristics and search methodology<br />
has been made under the assumption of an AI player<br />
with an omniscient view of the game. In a real game of<br />
Whist we cannot assume that we can cheat and see all cards.<br />
In order to remove the omniscience assumption we can<br />
take a probabilistic approach and consider which card is<br />
most likely to lead to a favorable result. While a round is in<br />
progress we can remember all of the cards that have been<br />
played. We also know the cards in our hand and the card<br />
that was used to determine the trump suit. The remaining cards form the set<br />
of cards whose locations are currently unknown.<br />
A Monte-Carlo method can then be used to arrive at an<br />
educated guess as to which card will lead to a favorable<br />
result. The set of unknown cards is used to randomly create<br />
the opponents' hands. The previously described search is<br />
used to determine the best card to play in this game configuration.<br />
This is repeated many times with different random<br />
deals of the opponents' cards. A count is kept of how<br />
many times each of the AI player's cards was chosen as the<br />
best next move. When this process is complete, the move to<br />
make is the card that was chosen the most times as the<br />
best next move.<br />
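The sampling-and-voting loop described above might look as follows. This is a<br />
minimal sketch: `deal_opponents` and `monte_carlo_move` are hypothetical names, and<br />
`best_move` is a placeholder callable standing in for the omniscient search, which is<br />
not implemented here. The fixed random seed is only there to make runs repeatable.<br />

```python
import random
from collections import Counter
from typing import Callable, Hashable, List, Sequence


def deal_opponents(unknown: Sequence[Hashable], hand_sizes: List[int],
                   rng: random.Random) -> List[list]:
    """Randomly split the unknown cards into opponent hands of the given sizes."""
    cards = list(unknown)
    rng.shuffle(cards)
    hands, start = [], 0
    for size in hand_sizes:
        hands.append(cards[start:start + size])
        start += size
    return hands


def monte_carlo_move(my_hand: list, unknown: Sequence[Hashable], hand_sizes: List[int],
                     best_move: Callable[[list, List[list]], Hashable],
                     iterations: int, seed: int = 0) -> Hashable:
    """Vote over random deals; `iterations` must be at least 1."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(iterations):
        opponent_hands = deal_opponents(unknown, hand_sizes, rng)
        # `best_move` stands in for the fixed-depth search on this sampled deal.
        votes[best_move(my_hand, opponent_hands)] += 1
    return votes.most_common(1)[0][0]


# Placeholder "search" that always prefers the highest card in hand.
choice = monte_carlo_move([4, 9, 2], list(range(20, 40)), [3, 3, 3, 3],
                          best_move=lambda mine, opponents: max(mine),
                          iterations=25)
print(choice)  # 9
```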
5.4 Search Depth vs. Deal Variations<br />
The quality of the move chosen is affected by the depth of<br />
each search and the number of iterations in the Monte-Carlo<br />
simulation. Thinking time can be allocated to either of these<br />
processes. In order to improve the quality of the move chosen,<br />
we need to consider how to allocate the thinking time.<br />
When there are many cards remaining in each hand, the<br />
number of iterations is favored. In this case there are many<br />
more possible configurations of the opponent hands. This<br />
leads to a need for more samples to arrive at a reasonable<br />
result. Searching deeper is likely to be very expensive due to<br />
the high branching factor when there are many cards in each<br />
player's hand. Also, the heuristic evaluations are likely to be<br />
fairly inaccurate since there are many turns left in the game. Thus,<br />
running more iterations is likely to be more valuable than<br />
searching deeper.<br />
When there are few cards remaining in each hand, the<br />
search depth is favored. In this case there are fewer possible<br />
configurations of the opponent hands. This leads to fewer<br />
samples giving a similar coverage as the case with more<br />
cards and more samples. Searching deeper will result in the<br />
search reaching the leaves more often. These positions are<br />
more valuable since the information is based on a final<br />
game state and not a heuristic estimate. Also, even when a<br />
leaf is not reached, the heuristic evaluations are more accurate<br />
since there are few turns left in the game.<br />
With these considerations in mind, the search depth and<br />
iterations of the Monte-Carlo simulation are adjusted based<br />
on the number of cards remaining in each hand.<br />
6 Evaluation Methodology<br />
Since the variation of Whist studied in this paper is not a<br />
standard version of Whist, there is no known existing program<br />
that plays this game. Thus, a somewhat subjective<br />
method is needed to evaluate the effectiveness of the Whist<br />
AI.<br />
The evaluation was performed by having the AI play<br />
games against a focus group of four human players. This<br />
group consisted of players that are proficient at playing the<br />
studied variation of Whist. Comments were taken from the<br />
human players concerning blunders or brilliant moves made<br />
by the AI. The frequency of the AI winning rounds was also<br />
observed. The reasoning behind the AI's moves was examined<br />
by analyzing a report generated by the AI at each<br />
move. This report contained evaluations of each possible<br />
next move and evaluations of each opponent's position. This<br />
gives some insight into the reasoning behind the AI's moves<br />
and whether it is making decisions based on faulty<br />
inferences.<br />
7 Results<br />
Play against the focus group suggests that the AI plays at<br />
the level of an average player. The AI plays acceptably well,<br />
but does not dominate the game and win consistently. However,<br />
it also does not fall hopelessly behind in score.<br />
The test trials were possibly affected by opponent bias<br />
against the AI player. It is possible that the human players<br />
were focused on defeating the AI player and stopping its<br />
bids. In this case the AI player may have been hindered by<br />
not being paranoid enough with respect to the opposing<br />
players' actions.<br />
8 Possible Improvements<br />
There are a number of different aspects that can be improved<br />
to increase the AI player's strength.<br />
8.1 Heuristic Improvements<br />
The heuristic evaluation can possibly be improved in a number
of ways. The weights of the various card categories can
be adjusted as previously described. The neutral card category
could be divided into neutral winners and
neutral losers. This can be determined by considering such
things as a card's rank relative to the other remaining cards of
the same suit, or how many cards of the same suit have already
been played.
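As a rough illustration of the neutral winner / neutral loser split, the sketch below labels a neutral card by comparing its rank with the same-suit cards that have not yet been played. The function name, the 2..14 rank encoding, and the "more cards below than above" rule are all assumptions made for this example; the paper does not specify a rule.

```python
from typing import Iterable


def classify_neutral(rank: int, remaining_ranks: Iterable[int]) -> str:
    """Label a neutral card by its rank relative to unplayed same-suit cards.

    rank: rank of the card in question (2..14, ace high; assumed encoding).
    remaining_ranks: ranks of same-suit cards that are neither played nor
        in the AI's own hand, i.e. cards an opponent might still hold.
    """
    remaining = list(remaining_ranks)
    if not remaining:
        # Nothing left in the suit can beat this card.
        return "neutral winner"
    higher = sum(1 for r in remaining if r > rank)
    # Assumed rule: the card leans toward winning only if strictly fewer
    # than half of the remaining same-suit cards outrank it.
    return "neutral winner" if higher * 2 < len(remaining) else "neutral loser"
```

For instance, a ten with only one higher card among four remaining cards of its suit would be labeled a neutral winner, while a six facing two higher cards out of three would be labeled a neutral loser.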
Protected high cards can also be considered. If a player
has low cards of the same suit as a high card, the high card
is not as dangerous when trying to avoid going over a bid.
This is because the player can use the low card when forced
to play the suit and try to get rid of the high card at a later
point when the suit is not led.
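One possible way to detect protected high cards is sketched below. The rank thresholds and the (suit, rank) tuple representation are assumptions chosen for illustration, not values taken from the AI described in this paper.

```python
from collections import defaultdict

HIGH_RANK = 11  # assumed threshold: jack or better counts as a high card
LOW_RANK = 6    # assumed threshold: six or lower counts as a low card


def protected_high_cards(hand):
    """Return the high cards in `hand` that have a low card of the same suit.

    hand: iterable of (suit, rank) tuples, rank 2..14 with ace high.
    """
    by_suit = defaultdict(list)
    for suit, rank in hand:
        by_suit[suit].append(rank)

    protected = []
    for suit, ranks in by_suit.items():
        # A suit offers protection only if it holds at least one low card.
        if any(r <= LOW_RANK for r in ranks):
            protected.extend((suit, r) for r in ranks if r >= HIGH_RANK)
    return sorted(protected)
```

A heuristic could then reduce the "danger" weight of the cards returned here when the player is trying to avoid going over a bid.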
Improvements in the heuristic can lead to more intelligent
choices of moves.
8.2 Intelligent Dealing of Opponent H<strong>an</strong>ds<br />
The opponent hands can possibly be dealt with more intelligence
than just a random deal. The cards can be dealt to
match the bids that were made by the opponents. However,
care must be taken: if one of an opponent's winner or
loser cards gave the opposite result, the number of tricks
that opponent is assumed to have remaining will be incorrect,
and a deal matched to that number will be misleading.
It is possible to adjust for this by observing the cards
played and attempting to predict when an opponent gets the
wrong result for a card played. However, this is quite a
complicated endeavor that time restrictions made impractical.
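The basic idea of dealing to match bids could be sketched with simple rejection sampling, as below. This was not implemented in the work described here. The winner estimate (trumps plus queen-or-better cards), the function names, and the fallback to an unconstrained random deal are all assumptions, and the sketch does not attempt the adjustment for cards that gave the wrong result.

```python
import random


def estimated_winners(hand, trump, high_rank=12):
    """Crude winner count (an assumption): trumps plus queen-or-better cards."""
    return sum(1 for suit, rank in hand if suit == trump or rank >= high_rank)


def deal_matching_bids(unseen, hand_sizes, tricks_needed, trump,
                       rng=random, max_attempts=1000):
    """Deal `unseen` cards to opponents, preferring deals that fit their bids.

    unseen: list of (suit, rank) cards the AI cannot see.
    hand_sizes: number of cards each opponent holds.
    tricks_needed: tricks each opponent still needs to meet their bid.
    Returns one hand per opponent. If no sampled deal matches within
    max_attempts (assumed >= 1), the last random deal is returned.
    """
    cards = list(unseen)
    hands = []
    for _ in range(max_attempts):
        rng.shuffle(cards)
        hands, start = [], 0
        for size in hand_sizes:
            hands.append(cards[start:start + size])
            start += size
        # Accept the deal only if every opponent's estimate fits their bid.
        if all(estimated_winners(h, trump) == need
               for h, need in zip(hands, tricks_needed)):
            break
    return hands
```

Rejection sampling is easy to write but can waste many deals when the constraints are tight; a constructive deal that hands out likely winners first would be a natural alternative.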
8.3 Automated Bidding<br />
The issue of automated bidding was considered. A possible
algorithm is to consider each bid possibility as a separate
search and get the backed up value at the root for each of
these searches. The best bid is then the bid made in the
search with the best backed up value.
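In outline, this bid selection amounts to an argmax over candidate bids. The sketch below assumes a hypothetical callable, backed_up_value, that runs one search for a given bid and returns the value backed up to the root; the search itself is not shown.

```python
from typing import Callable


def choose_bid(hand_size: int, backed_up_value: Callable[[int], float]) -> int:
    """Return the bid in 0..hand_size whose search yields the best root value.

    backed_up_value(bid) is a hypothetical hook: it should run one full
    search assuming the AI has made `bid` and return the root's value.
    Ties are resolved in favor of the lowest bid.
    """
    return max(range(hand_size + 1), key=backed_up_value)
```

With a toy table of root values such as {0: -1.0, 1: 0.5, 2: 2.0, 3: 1.5} supplied through a lambda, the function would select a bid of 2.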
This approach was attempted, but it was not
very effective. The AI tended to make bids that were not
very reasonable for the cards in its hand. This is the result of
not considering other important factors when making a bid.
These other factors could include low cards protecting high
cards, long suits, short suits, and the ability to give or take the lead.
These are considerations that affect an intelligent bid but are
not considered in the heuristic evaluation function.
In order to focus on the AI's card play ability, I chose to
manually enter the AI player's bid. This helps to remove the
concern of bid quality affecting the card play results. This is
not too unreasonable for an evaluation method based on
play with humans, since there is no need to make
many bids quickly.
9 Conclusions<br />
A somewhat respectable AI Whist player was produced. It is
not a dominating player, but it plays reasonably well and
leaves much room for future improvements. The card classification
system appeared to be effective for Whist. This
suggests possible applications in other trick taking games.
The method of peeking at opponent cards combined with the
Monte-Carlo simulation appeared to be effective for Whist.
This suggests possible applications in other imperfect information
card games.
References<br />
[1] Wikipedia, "Oh Hell,"<br />
http://en.wikipedia.org/wiki/Oh_Hell<br />
[2] Ginsberg, M. L., "GIB research page,"<br />
http://www.cirl.uoregon.edu/ginsberg/gibresearch.html<br />
[3] Luckhardt, C. A., and Irani, K. B., "An algorithmic solution
of N-person games," Proceedings of the National
Conference on Artificial Intelligence (AAAI-86), Philadelphia,
PA, August, 1986, pp. 158-162