Developing an Artificial Intelligence for Whist - Fongboy.com

Developing an Artificial Intelligence for Whist

Jason Fong
University of California, Los Angeles
Computer Science
jfong@cs.ucla.edu

Abstract

This paper details the development of a simple artificial intelligence for playing the card game Whist. A brief introduction to the game is given, followed by the decisions that were made in designing the AI. The results have some possible applications to other similar games.

1 Introduction

Whist is an old card game that has been falling into obscurity due to the popularity of Bridge. However, Whist is still an interesting game to study because of its differences from Bridge. Whist is a true multiplayer game with more than two players. Also, bids must be met exactly, so there are extra complications in avoiding going over a bid.

There are many variations of Whist, and currently there is no widely accepted official version of the game. The version studied in this paper is similar to the variation known as "Oh Hell!" [1].

Prior work in this area includes the Ginsberg Intelligent Bridge Player (GIB) [2]. Although Bridge is significantly different from Whist, both are trick-taking games, so some of the techniques used in GIB are also useful for Whist.

2 Rules of Whist

Since Whist is a game with many variations, the rules of the variation studied in this paper need some explanation.

2.1 General Game Structure

This variation of Whist is played with five players and no partnerships. A set number of rounds is played in each game. Players take turns dealing, with the dealer position passing in clockwise order. The number of cards dealt to each player varies with each round. In the first round, each player is dealt ten cards. In each subsequent round, one fewer card is dealt to each player. This continues down to one card and then repeats from one card back up to ten cards, for a total of 20 rounds. After all players' cards are dealt, the next card in the deck is turned face up. The suit of this card determines the trump suit for the round. Each player receives a score at the end of each round. The winner is the player with the highest cumulative score at the end of the 20 rounds.
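
The round structure can be written down as a short sketch. The function name and the `max_cards` parameter are illustrative and not part of the original program. The one-card round appears twice in a row, which is what gives 20 rounds in total.

```python
def cards_per_round(max_cards: int = 10) -> list[int]:
    """Hand size for each round: max_cards down to 1, then 1 back up to max_cards."""
    down = list(range(max_cards, 0, -1))   # 10, 9, ..., 1
    return down + down[::-1]               # followed by 1, 2, ..., 10


schedule = cards_per_round()
print(len(schedule))   # number of rounds in a game
print(schedule)
```

With five players and ten cards each, 50 cards are dealt and one more is turned up for trump, so even the largest round fits in a 52-card deck.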

2.2 Turn Structure 

The rules of play in each turn are similar to other trick taking 

games such as Bridge and Spades. In a turn, each player 

plays one card, going in clockwise order. The first player in 

a turn may play any card in his hand. This card determines 

the lead suit and the subsequent players must play a card of 

the same suit if possible. If a player does not have a card of 

the same suit, then any card may be played. 

The player that played the highest card of the same suit as 

the lead suit wins the trick. However, if a card of the trump 

suit was played, then the highest card of the trump suit will 

take precedence and win the trick. The winner of a trick 

plays the first card of the next trick. 
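
The rule for deciding a trick can be sketched as follows. The `Card` type, the rank encoding (2 to 14 with the ace high) and the suit letters are assumptions made for the example; the rules above do not fix a ranking of the cards.

```python
from typing import NamedTuple


class Card(NamedTuple):
    rank: int   # assumed encoding: 2..14, with 14 standing for the ace
    suit: str   # assumed encoding: "C", "D", "H" or "S"


def trick_winner(plays: list[Card], trump: str) -> int:
    """Return the position (in play order) of the card that wins the trick."""
    lead_suit = plays[0].suit
    # Any trump played takes precedence over the lead suit.
    trumps = [i for i, card in enumerate(plays) if card.suit == trump]
    candidates = trumps or [i for i, card in enumerate(plays) if card.suit == lead_suit]
    return max(candidates, key=lambda i: plays[i].rank)


trick = [Card(10, "H"), Card(14, "H"), Card(2, "S"), Card(13, "D"), Card(3, "H")]
print(trick_winner(trick, trump="S"))   # the low spade is the only trump played
print(trick_winner(trick, trump="C"))   # no trump played, highest heart wins
```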

2.3 Bidding 

After the cards for a round are dealt and before play begins, 

each player looks at his cards and makes a bid declaring
how many tricks he will take in the round. The goal of
each round is to take the exact number of tricks that were
bid. Taking either too few or too many tricks counts as a
failure to meet the bid.

The sum of all bids is not permitted to equal the number 

of cards dealt to each player in the current round. The effect 

of this is to ensure that at least one player will not be able to 

meet their bid. 

Bidding starts with the person following the dealer and 

proceeds in clockwise order. Since the dealer is the last to 

bid, his bid may be restricted because his bid cannot make 

the sum of the bids equal the cards dealt. After the dealer 

bids, he plays the first card of the first trick. 
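
The restriction on the dealer's bid can be illustrated with a small sketch. It assumes that a bid may be any whole number from zero up to the number of cards dealt, which the rules above imply but do not state outright; the function and parameter names are illustrative.

```python
def allowed_bids(cards_dealt: int, earlier_bids: list[int], is_dealer: bool) -> list[int]:
    """Bids a player may make; only the dealer (who bids last) is restricted."""
    bids = list(range(cards_dealt + 1))   # assumed range: 0..cards_dealt
    if is_dealer:
        # The one bid that would make all bids sum to the number of cards dealt.
        forbidden = cards_dealt - sum(earlier_bids)
        bids = [b for b in bids if b != forbidden]
    return bids


# Five-card round, earlier bids sum to 4, so the dealer may not bid 1.
print(allowed_bids(5, [1, 1, 0, 2], is_dealer=True))
# Earlier bids already exceed 5, so the dealer is unrestricted.
print(allowed_bids(5, [3, 3, 0, 0], is_dealer=True))
```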

2.4 Scoring 

At the end of a round, each player's score is calculated and 

added to their cumulative score. A player receives one point 

for each trick taken. If the number of tricks taken is exactly 

equal to the number bid for, then the player receives an additional 

ten points. There is no extra bonus on top of the ten 

points for making a bid for zero tricks.
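
As a quick illustration of the scoring rule, a minimal sketch (the function name is illustrative):

```python
def round_score(bid: int, tricks_taken: int) -> int:
    """One point per trick taken, plus ten points if the bid is met exactly."""
    bonus = 10 if tricks_taken == bid else 0
    return tricks_taken + bonus


print(round_score(bid=3, tricks_taken=3))   # bid made
print(round_score(bid=3, tricks_taken=4))   # one trick over, no bonus
print(round_score(bid=0, tricks_taken=0))   # zero bid made, bonus only
```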

From this scoring structure we see that making or missing 

a bid is the most important component of the final score. 

This implies that stopping other players from making their 

bids will factor into an effective strategy. Also, the ability of 

an opponent to stop a player from making his bid needs to 

be considered. 

3 Designing a Search for Whist 

We can design a Whist AI as a search on a tree. Each node 

represents a particular play order. Internal nodes of the tree
are orders of play for partial games. Leaf nodes are orders of
play for complete games. A search would attempt to find

the child that is most likely to lead to a leaf with a favorable 

result for the player to move. This child represents the next 

move to make for the current player. 

3.1 A Complete Search 

At a particular point in a round there is a set of cards that are known and a set of cards that are unknown. The known set consists of the cards whose location is known. These are the cards in the AI player's hand, the trump-determining card, and the cards that have already been played. The unknown set consists of the cards whose location is unknown. These are the cards that are in the hands of the opponent players or still undealt in the deck.

A complete search would consider an opponent to be able 

to play any unknown card. This method will completely 

search all of the possible outcomes of the game since it covers 

all of the possible configurations of the opponents' 

hands. However, this search is very expensive.

The worst case for this search occurs in the rounds with

10 cards dealt to each player. At the level where the first 

opponent is to move there will be a branching factor of 41. 

This is the number of unknown cards since there are 10 

known cards in the AI player's hand and one known card in 

the trump-determining card. In each subsequent opponent 

turn, the branching factor will reduce by one since the number 

of unknown cards will decrease by one whenever an 

opponent makes a move. 

An approximate number of leaves in this search tree can 

be calculated as 

$$41! \times 10! \approx 10^{56} \qquad (1)$$

where 41! is the number of permutations for the order of 

play of the opponents' cards, and 10! is the number of permutations

for the order of play of the AI player's cards. This results 

in a very large search tree, so it is desirable to reduce 

this if possible. 

3.2 A Simplified Search 

The size of the search tree can be reduced if the branching 

factor at the opponents' moves could be reduced. The 

branching factor is high because at each opponent's turn, it 

is assumed that any unseen card can be played. The number 

of possible moves can be significantly reduced if the cards 

in the opponents' hands can be seen. 

In order to reduce the size of the search, we will make a

simplifying assumption and assume that the AI player is 

omniscient and can see all of the cards in all the hands. This 

will significantly reduce the branching factor of the search 

and allow the search to reach a greater depth within a reasonable 

amount of time. In section 5.3 we will consider how 

to remove the omniscience assumption. 

With this assumption in place, the number of leaves in the 

search tree can be calculated as: 

$$(10!)^5 \approx 6 \times 10^{32} \qquad (2)$$

where 10! is the number of permutations for the order of 

play of each player's cards, raised to the 5th power since

there are five players. This analysis is not entirely correct 

since at each turn players must follow the lead suit and cannot 

truly play any card in their hand. However, the equation 

in (2) simplifies this analysis and can be considered a loose 

upper bound on the number of leaves. 
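
The leaf counts in equations (1) and (2) are easy to check numerically. The following sketch only evaluates the two expressions; the variable names are illustrative.

```python
import math

# Equation (1): complete search over all unknown cards.
complete_leaves = math.factorial(41) * math.factorial(10)

# Equation (2): omniscient search, each of five players orders ten known cards.
omniscient_leaves = math.factorial(10) ** 5

print(f"complete search:   {complete_leaves:.1e}")
print(f"omniscient search: {omniscient_leaves:.1e}")
print(f"reduction factor:  {complete_leaves / omniscient_leaves:.1e}")
```

The omniscience assumption shrinks the bound by more than twenty orders of magnitude, although the remaining tree is still far too large to search exhaustively.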

The resulting search tree is significantly smaller than the 

tree in the complete search. However, this is still a very 

large tree and cannot be searched completely. In order to 

make a decision for the next move in a reasonable amount 

of time, we can cut off the search at a particular depth and 

use a heuristic evaluation function. 

4 Designing a Heuristic Evaluation 

In order to design a heuristic evaluation function, we need to 

determine the factors that contribute to the value of a particular 

game position. Since the goal of Whist is to end with 

the highest score, we need to consider factors that contribute 

to earning points. 

The biggest contributor to a player's score is the 10 points 

that are earned when a bid is successfully met. Thus, the 

primary component of the value of the heuristic evaluation 

should be an estimate of the likelihood of a player making 

his bid. An estimate of the number of tricks that will be taken 

compared to the player's bid value will give an estimate 

of the likelihood of making a bid. 

4.1 Card Classifications 

In order to estimate how many tricks will be taken, we can 

consider the cards held in a hand and classify them according 

to their ability to win or lose tricks. The ability to lose 

tricks needs to be considered since there will be situations 

where a player needs to avoid going over his bid. 

Conveniently, the omniscience assumption also makes it 

easier to classify the cards in a hand. Since we can see all 

cards, we can easily determine if a card will be guaranteed 

to win or to lose a trick if it is played as the lead card. The 

card is analyzed for when it is the lead card since considering 

the card in the general case for any play position will 

result in card classifications that are too narrowly defined. 

For instance, a guaranteed winner in the general case will 

usually only be the highest trump card. This is because even 

if a player has the highest card of a non-trump suit, it cannot 

win if that card's suit was not led. The classification being 

based on lead cards is not too restrictive since the classification 

also applies to the case where the card is played later in 

a trick, but the lead suit matches the card's suit. 

The cards are classified into five categories. The first two 

categories are the aforementioned guaranteed winners and

guaranteed losers. The next two categories are likely winners 

and likely losers. These are similar to the guaranteed 

winners and losers, except that the opponents' intentions are 

taken into consideration. If an opponent is able to win a 

trick but is not forced to win it, he may not actually decide 

to take the trick if he is in danger of going over his bid. 

Similarly, if an opponent is able to avoid winning a trick but 

is not forced to do so, he may still decide to take the trick if 

he needs more tricks to make his bid. 

The likely winner category is a looser form of the guaranteed 

winner category where a card is considered a winner if 

no player that may intend to take a trick can beat the card. 

The likely loser category is similarly related to the guaranteed 

loser category. An opponent's intentions are determined 

by considering how many tricks he needs to make his bid and how many cards he holds in the "guaranteed" categories. The "likely" categories are not used for this purpose, since they depend in turn on the other players' intentions and using them would lead to an infinite recursion.

The fifth category contains the neutral cards. These include 

all the cards that are not part of the first four categories. 

These cards are not neutral in the sense that they do not 

contribute at all to the ability to successfully make a bid. 

They are neutral in the sense that they are flexible and can 

be used for either winning a trick or losing a trick. They are

flexible because they are not dominating winner cards or 

weak loser cards. Thus, depending on when the player decides 

to use these cards, they can be put to either use. 
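
One possible reading of the two "guaranteed" categories is sketched below. It is an interpretation, not the original implementation: a led card is taken to be a guaranteed winner if no legal reply from any opponent can beat it, and a guaranteed loser if at least one opponent has only replies that beat it. The `Card` type, the rank encoding and the "undetermined" label (standing for the likely and neutral categories, which also need bid information) are assumptions.

```python
from typing import NamedTuple


class Card(NamedTuple):
    rank: int   # assumed encoding: higher rank beats lower rank
    suit: str


def legal_replies(hand: list[Card], lead_suit: str) -> list[Card]:
    """A player must follow the lead suit if possible, else may play anything."""
    same_suit = [card for card in hand if card.suit == lead_suit]
    return same_suit or list(hand)


def beats(reply: Card, led: Card, trump: str) -> bool:
    """True if the reply would beat the led card on its own."""
    if reply.suit == led.suit:
        return reply.rank > led.rank
    return reply.suit == trump   # an off-suit reply only wins if it is a trump


def classify_lead(led: Card, opponent_hands: list[list[Card]], trump: str) -> str:
    """Classify a card as if it were led, seeing all opponent hands."""
    replies = [legal_replies(hand, led.suit) for hand in opponent_hands if hand]
    if not any(beats(r, led, trump) for options in replies for r in options):
        return "guaranteed winner"
    if any(all(beats(r, led, trump) for r in options) for options in replies):
        return "guaranteed loser"
    return "undetermined"   # likely winner, likely loser or neutral


hands = [[Card(5, "H"), Card(9, "D")], [Card(3, "H")]]
print(classify_lead(Card(14, "H"), hands, trump="S"))
print(classify_lead(Card(2, "H"), hands, trump="S"))
```

Splitting the "undetermined" cards into likely winners, likely losers and neutral cards would additionally require each opponent's bid and tricks taken, as described above.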

4.2 Calculating a Heuristic Value 

A heuristic value can be obtained by estimating the resulting 

score for the round. The first component of this is an estimate 

of the likelihood of making a bid. This can be obtained 

by using the card classifications to estimate the resulting 

number of tricks over or under the bid. 

We begin with the tricks left to make the bid. The number 

of guaranteed winner cards is subtracted from this. Next, the 

number of likely winner cards multiplied by 0.75 is subtracted. 

Not all of the likely winner cards will win, so only 

some fraction of those cards will be considered as tricks 

won. 

If the resulting value is positive, then there is probably a 

need to take additional tricks to make the bid. In this case, 

the number of neutral cards multiplied by 0.25 is subtracted. 

If the previous result was negative, then there is probably 

not a need to take additional tricks since there is a danger of 

going over the bid. Some small portion of the neutral cards 

still needs to be considered since opponent players are

likely to try to force the taking of additional tricks when one 

is in danger of going over a bid. In this case, the number of 

neutral cards multiplied by 0.1 is subtracted. 

This results in the following equations: 

$$t'_r = b - t - c_{gw} - 0.75\,c_{lw} \qquad (3)$$

$$t_r = \begin{cases} t'_r - 0.25\,c_n & \text{if } t'_r > 0 \\ t'_r - 0.1\,c_n & \text{if } t'_r < 0 \end{cases} \qquad (4)$$

where $b$ is the bid, $t$ is the number of tricks already taken, $c_{gw}$ is the number of guaranteed winners, $c_{lw}$ is the number of likely winners, $c_n$ is the number of neutral cards, $t'_r$ is the number of tricks remaining to take to make a bid after guaranteed winners and some portion of the likely winners are taken, and $t_r$ is the number of tricks remaining to be taken after considering all card categories.

Equations (3) and (4) did not use the number of guaranteed 

loser and likely loser cards in the calculation. This is 

because the tricks lost will not change the number of tricks 

needed to make a bid. The purpose of the guaranteed loser 

and likely loser categories is to take away cards from the 

other categories that do factor into the calculation. 
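
The calculation of the tricks remaining can be sketched as a small function that follows the steps described above. The function and parameter names are illustrative. The text does not say what happens when the intermediate value is exactly zero, so the sketch leaves it unchanged in that case, which is an assumption.

```python
def tricks_remaining(bid: int, tricks_taken: int, guaranteed_winners: int,
                     likely_winners: int, neutral: int) -> float:
    """Estimated tricks still needed (positive) or expected overshoot (negative)."""
    # Guaranteed winners count fully, likely winners at a weight of 0.75.
    t_prime = bid - tricks_taken - guaranteed_winners - 0.75 * likely_winners
    if t_prime > 0:
        return t_prime - 0.25 * neutral   # still short: neutral cards help more
    if t_prime < 0:
        return t_prime - 0.1 * neutral    # already over: neutral cards add some risk
    return t_prime                        # assumption: no adjustment at exactly zero


# Bid 4 with 1 taken, holding 1 guaranteed winner, 1 likely winner, 2 neutral cards.
print(tricks_remaining(4, 1, 1, 1, 2))
# Bid 1 already made, but still holding 1 guaranteed winner and 2 neutral cards.
print(tricks_remaining(1, 1, 1, 0, 2))
```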

The expected value of points earned for making a bid is 

estimated as: 

$$p_b = 10 \cdot \frac{c_h - t_r}{c_h} \qquad (5)$$

where $c_h$ is the number of cards remaining in the hand. This

formulation results in 10 points for meeting a bid exactly, 

and reduces the value with each trick over or under the bid. 

The number of tricks over or under is weighed against the 

cards remaining to play since there is more flexibility in 

play when there are more cards to choose from. This flexibility 

makes it more likely that an unfavorable outcome can 

be avoided. Thus, the penalty to the estimated value of the 

bid points is reduced when there are more cards remaining 

to play. 

In addition to the estimated points from making a bid, 

there are some additional minor components to consider. 

One point is added for each trick taken. One point is also 

added for each opponent who is likely to miss a bid. An 

opponent is considered likely to miss a bid if the opponent's
$t_r$ value is less than -2. The choice of this value is a

bit arbitrary. It was chosen since it is usually hard to recover 

from a position where one is likely to take two tricks over 

the bid. 

Taking these extra considerations into account, the final heuristic value can be obtained as follows:

$$p_t = b - t_r \qquad (6)$$

$$h = p_b + p_t + m \qquad (7)$$

where $p_t$ is the estimated number of points from tricks taken, $m$ is the estimated number of opponents that will miss their bids, and $h$ is the final heuristic value.
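
Equations (5) to (7) can be combined into one sketch. One detail is an assumption: equation (5) as printed uses the tricks-remaining value directly, while the accompanying text says the bid points are reduced for tricks over as well as under the bid, so the sketch uses the absolute value. The function and parameter names are illustrative.

```python
def heuristic_value(bid: int, cards_in_hand: int, t_r: float,
                    opponent_t_r: list[float]) -> float:
    """Estimated round score for one player; requires cards_in_hand > 0."""
    # Equation (5), with abs() as an assumption drawn from the prose description.
    p_b = 10 * (cards_in_hand - abs(t_r)) / cards_in_hand
    # Equation (6): estimated points from tricks taken.
    p_t = bid - t_r
    # One point per opponent whose tricks-remaining value is below -2.
    m = sum(1 for value in opponent_t_r if value < -2)
    # Equation (7).
    return p_b + p_t + m


# Bid 3, five cards left, one trick short, two opponents badly over their bids.
print(heuristic_value(3, 5, 1.0, [-2.5, 0.0, 0.5, -3.0]))
```

When no cards remain, the search is at a leaf and the actual score is used instead, so the division by the number of cards in hand is never reached with a value of zero.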

4.3 Choosing Card Category Weights 

The weights of the card categories were obtained by testing 

various weights in a few sample games. The values chosen 

for consideration were based on what I thought was reasonable 

for the outcomes of each card type. However, these 

choices are still fairly arbitrary. 

A more rigorous approach would create multiple instances 

of the Whist AI with different category weights and 

allow the instances to compete against each other. This 

would provide more empirical evidence as to which weights 

are more effective. However, this may not give results that 

actually work well against human players since the play 

against AI opponents may be biased toward idiosyncrasies

of the AI's style of play. This approach toward obtaining the 

card category weights was not taken due to time constraints. 

5 Determining the Next Move 

The next move is determined by performing a search on the 

game tree and obtaining values for each of the children of 

the root. The children of the root node represent each of the 

possible cards that can be played for the next move. 

5.1 Search Methodology 

The search is performed using the Max^n algorithm [3].
This is similar to a minimax search, but adapted for handling
more than two players. In a game with n players, the
evaluation of each node gives an n-tuple. For this five-player
game the n-tuple is a five-element array of values, where each element corresponds

to the evaluation of the node from the viewpoint of 

each player. Each player's goal is to maximize the value of 

his element of the tuple. 

When a node is evaluated, each of the children's values 

are considered and the entire tuple of the child with the 

highest evaluation for the current player is backed up to the 

current node. The child values are obtained by the previously 

described heuristic evaluation function if the search 

did not reach the leaves. If the leaves are reached, then the 

child value is obtained by calculating the actual score for the 

current player. One point is given for each trick taken and 

ten points are given for making the player's bid. In order to 

adjust for the desire to make opponents miss their bids, an 

additional point is given for each opponent player that fails 

to make his bid. 

After the Max^n search is performed to a fixed depth, the

root node will have backed up values for its children. These 

values represent the estimated value of each of the possible 

moves that can be made. The card that should be played 

next is the card that is played in the child node with the 

highest backed up value. 
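
The backing-up rule can be shown on a toy tree. This is a generic sketch of the Max^n idea, not the original program: the `Node` type is hypothetical, the tree has three players instead of five to keep it small, and ties are broken in favor of the first child, which is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    player: int = 0                                  # index of the player to move
    children: list["Node"] = field(default_factory=list)
    value: tuple[float, ...] = ()                    # evaluation tuple at a leaf or cutoff


def maxn(node: Node, depth: int) -> tuple[float, ...]:
    """Back up the whole tuple of the child that is best for the player to move."""
    if depth == 0 or not node.children:
        return node.value                            # heuristic or actual score tuple
    best = None
    for child in node.children:
        candidate = maxn(child, depth - 1)
        if best is None or candidate[node.player] > best[node.player]:
            best = candidate
    return best


# Player 0 moves at the root, player 1 moves at the second level.
left = Node(player=1, children=[Node(value=(1, 5, 2)), Node(value=(4, 3, 0))])
right = Node(player=1, children=[Node(value=(3, 2, 1)), Node(value=(2, 4, 6))])
root = Node(player=0, children=[left, right])
print(maxn(root, depth=2))
```

Player 1 backs up (1, 5, 2) on the left and (2, 4, 6) on the right, each being the child with the larger second element, and player 0 then prefers the right branch because its first element is larger.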

5.2 Search Optimization Considerations 

As a round of Whist progresses, the branching factor of the 

search tree decreases. This is due to the fact that as cards are 

played, the players have fewer cards in their hands to choose

from. This reduction in possible moves reduces the branching 

factor of the tree. Thus, for a given amount of time, we 

can usually search deeper when there are fewer cards in the 

players' hands. This is taken advantage of by adapting the 

search depth to the number of cards in the players' hands. 

During the last trick each player has only one card in their 

hand. This results in forced moves, so there is no need to 

search at the bottom five levels of the search tree. When the 

search reaches a node of height six, the last five moves are 

played and the final game state is evaluated at the leaf node. 

This value is then backed up to the height six node. 

In some situations, a player may have only one legal 

move. This can be the case when a player has only one card 

of the lead suit. When this occurs, the search need not be 

performed since there is no choice of move. 
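
The forced-move shortcut can be sketched as follows. Cards are encoded as (rank, suit) tuples and `search` is a hypothetical callback standing in for the full game-tree search; both are assumptions for the example.

```python
from typing import Callable, Optional

Card = tuple[int, str]   # assumed encoding: (rank, suit)


def legal_moves(hand: list[Card], lead_suit: Optional[str]) -> list[Card]:
    """Cards that may be played; lead_suit is None when leading the trick."""
    if lead_suit is None:
        return list(hand)
    same_suit = [card for card in hand if card[1] == lead_suit]
    return same_suit or list(hand)


def choose_move(hand: list[Card], lead_suit: Optional[str],
                search: Callable[[list[Card]], Card]) -> Card:
    """Skip the search entirely when only one legal move exists."""
    moves = legal_moves(hand, lead_suit)
    if len(moves) == 1:
        return moves[0]
    return search(moves)


# Only one heart in hand, so the reply to a heart lead is forced.
print(choose_move([(5, "H"), (9, "D")], "H", search=lambda moves: moves[-1]))
```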

5.3 Removing the Omniscience Assumption 

The preceding analysis on heuristics and search methodology 

has been made under the assumption of an AI player 

with an omniscient view of the game. In a real game of 

Whist we cannot assume that we can cheat and see all cards. 

In order to remove the omniscience assumption we can 

take a probabilistic approach and consider which card is 

most likely to lead to a favorable result. While a round is in 

progress we can remember all of the cards that have been 

played. We also know the cards in our hand and the card 

that was used to determine the trump suit. This leads to a set 

of cards whose locations are currently unknown. 

A Monte-Carlo method can then be used to arrive at an 

educated guess as to which card will lead to a favorable 

result. The set of unknown cards are used to randomly create 

the opponents' hands. The previously described search is 

used to determine the best card to play in this game configuration. 

This is repeated many times with different random 

dealings of the opponent cards. A count is kept of how 

many times each of the AI player's cards were chosen as the 

best next move. When this process is complete, the move to 

make is the card that was chosen the most times as the next 

best move. 
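
The sampling loop described above can be sketched as follows. The `best_move` callback is a hypothetical stand-in for the omniscient search, and the iteration count and seed are arbitrary example values, not settings from the original program.

```python
import random
from collections import Counter


def deal_unknown(unknown, hand_sizes, rng):
    """Randomly split the unknown cards into opponent hands of the given sizes."""
    cards = list(unknown)
    rng.shuffle(cards)
    hands, start = [], 0
    for size in hand_sizes:
        hands.append(cards[start:start + size])
        start += size
    return hands   # cards left over stay undealt in the deck


def monte_carlo_move(my_moves, unknown, hand_sizes, best_move, iterations=100, seed=0):
    """Return the move chosen most often as best across random opponent deals."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(iterations):
        opponent_hands = deal_unknown(unknown, hand_sizes, rng)
        votes[best_move(my_moves, opponent_hands)] += 1
    return votes.most_common(1)[0][0]


def toy_best_move(moves, opponent_hands):
    """Toy policy: play the highest card if it beats every dealt card, else the lowest."""
    highest_seen = max(card for hand in opponent_hands for card in hand)
    return max(moves) if max(moves) > highest_seen else min(moves)


print(monte_carlo_move([3, 12], unknown=range(1, 11), hand_sizes=[2, 2],
                       best_move=toy_best_move))
```

In the toy example cards are plain integers, and the card 12 beats everything that can be dealt, so it collects every vote. In the real setting the votes would be spread over several cards, and the most frequent choice would be played.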

5.4 Search Depth vs. Deal Variations 

The quality of the move chosen is affected by the depth of 

each search and the number of iterations in the Monte-Carlo 

simulation. Thinking time can be allocated to either of these 

processes. In order to improve the quality of the move chosen, 

we need to consider how to allocate the thinking time. 

When there are many cards remaining in each hand, the 

number of iterations is favored. In this case there are many 

more possible configurations of the opponent hands. This 

leads to a need for more samples to arrive at a reasonable 

result. Searching deeper is likely to be very expensive due to 

the high branching factor when there are many cards in each 

player's hand. Also, the heuristic evaluations are likely to be 

fairly inaccurate since there are many turns left in the game. Thus,

searching more iterations is likely to be more valuable than 

searching deeper. 

When there are few cards remaining in each hand, the 

search depth is favored. In this case there are fewer possible 

configurations of the opponent hands. This leads to fewer 

samples giving a similar coverage as the case with more 

cards and more samples. Searching deeper will result in the 

search reaching the leaves more often. These positions are 

more valuable since the information is based on a final 

game state and not a heuristic estimate. Also, even when a 

leaf is not reached, the heuristic evaluations are more accurate 

since there are few turns left in the game. 

With these considerations in mind, the search depth and 

iterations of the Monte-Carlo simulation are adjusted based 

on the number of cards remaining in each hand. 

6 Evaluation Methodology 

Since the variation of Whist studied in this paper is not the
official standard of Whist, there is no known existing program
that plays this game. Thus, a somewhat subjective

method is needed to evaluate the effectiveness of the Whist 

AI. 

The evaluation was performed by having the AI play 

games against a focus group of four human players. This 

group consisted of players that are proficient at playing the 

studied variation of Whist. Comments were taken from the 

human players concerning blunders or brilliant moves made 

by the AI. The frequency of the AI winning rounds was also 

observed. The reasoning behind the AI's moves was observed 

by analyzing a report generated by the AI at each 

move. This report contained evaluations of each possible 

next move and evaluations of each opponent's position. This 

gives some insight into the reasoning behind the AI's moves 

and whether it is making some decisions based on faulty 

inferences. 

7 Results 

Play against the focus group suggests that the AI plays at 

the level of an average player. The AI plays acceptably well, 

but does not dominate the game and win consistently. However, 

it also does not fall hopelessly behind in score. 

The test trials were possibly affected by opponent bias

against the AI player. It is possible that the human players 

were focused on defeating the AI player and stopping its 

bids. In this case the AI player may have been hindered by 

not being paranoid enough with respect to the opponent 

players' actions. 

8 Possible Improvements 

There are a number of different aspects that can be improved 

to increase the AI player's strength. 

8.1 Heuristic Improvements 

The heuristic evaluation can possibly be improved in a number

of ways. The weights of the various card categories can 

be adjusted as previously described. The neutral card category 

could be divided into cards that are neutral winners and 

neutral losers. This can be determined by considering such 

things as a card's rank relative to other remaining cards of 

the same suit, or how many cards of the same suit have already 

been played. 

Protected high cards can also be considered. If a player 

has low cards of the same suit as a high card, the high card 

is not as dangerous when trying to avoid going over a bid. 

This is because the player can use the low card when forced 

to play the suit and try to get rid of the high card at a later 

point when the suit is not led. 

Improvements in the heuristic can lead to more intelligent 

choices of moves. 

8.2 Intelligent Dealing of Opponent Hands 

The opponent hands can possibly be dealt with more intelligence 

than just a random deal. The cards can be dealt to 

match the bids that were made by the opponents. However, care must be taken, because the estimate of an opponent's tricks remaining will be incorrect if one of his winner or loser cards gave the opposite of the expected result.

It is possible to adjust for this by observing the cards 

played and attempting to predict when an opponent gets the 

wrong result for a card played. However, this is quite a 

complicated endeavor that time restrictions made impractical. 

8.3 Automated Bidding 

The issue of automated bidding was considered. A possible 

algorithm is to consider each bid possibility as a separate 

search and get the backed up value at the root for each of 

these searches. The best bid is then the bid made in the 

search with the best backed up value. 

This approach was attempted but the results were not 

very effective. The AI tended to make bids that were not 

very reasonable for the cards in its hand. This is the result of 

not considering other important factors when making a bid. 

These other factors could include low cards protecting high
cards, long suits, short suits, and the ability to give or take the lead.

These are considerations that affect an intelligent bid but are 

not considered in the heuristic evaluation function. 

In order to focus on the AI's card play ability, I chose to 

manually enter the AI player's bid. This helps to remove the 

concern of bid quality affecting the card play results. This is 

not too unreasonable for an evaluation method based on 

play with humans since there is not a need to quickly make 

many bids. 

9 Conclusions 

A somewhat respectable AI Whist player was produced. It is 

not a dominating player, but it plays reasonably well and 

leaves much room for future improvements. The card classification 

system appeared to be effective for Whist. This 

suggests possible applications in other trick taking games. 

The method of peeking at opponent cards combined with the 

Monte-Carlo simulation appeared to be effective for Whist. 

This suggests possible applications in other imperfect information 

card games. 

References 

[1] Wikipedia, "Oh Hell," 

http://en.wikipedia.org/wiki/Oh_Hell 

[2] Ginsberg, M. L., "GIB research page," 

http://www.cirl.uoregon.edu/ginsberg/gibresearch.html 

[3] Luckhardt, C. A., and Irani, K. B., "An algorithmic solution of N-person games," Proceedings of the National Conference on Artificial Intelligence (AAAI-86), Philadelphia, PA, August 1986, pp. 158-162.
