6.254: Game Theory with Engineering Applications                       April 15, 2008
Lecturer: Asu Ozdaglar
Lecture 17: Repeated Games - II

In this lecture, we will continue our discussion of repeated games. In particular, we will cover:
• Repeated games with perfect monitoring - Nash and Perfect Folk Theorems
• Imperfect monitoring
  – Price trigger strategies
Recall that we use $\underline{v}_i$ to denote the minmax payoff of player $i$, i.e.,
$$\underline{v}_i = \min_{\alpha_{-i}} \max_{\alpha_i} g_i(\alpha_i, \alpha_{-i}).$$
The strategy profile $m^i_{-i}$ denotes the minmax strategy of the opponents of player $i$, and $m^i_i$ denotes the best response of player $i$ to $m^i_{-i}$, i.e., $g_i(m^i_i, m^i_{-i}) = \underline{v}_i$. The set $V^* \subset \mathbb{R}^I$ denotes the set of feasible and strictly individually rational payoffs.
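As a concrete illustration (not part of the original notes), the minmax value can be computed numerically. The sketch below computes a pure-strategy version for player 1 of a two-player game; the definition above allows the opponents to mix, so restricting to pure actions only gives an upper bound on the true minmax payoff. The payoff numbers are assumed for illustration.

```python
# Sketch: pure-strategy minmax value for player 1 in a two-player game.
# The true minmax allows mixed opponent strategies; this pure-action
# version is an upper bound (and coincides with it in simple examples).

def pure_minmax_p1(g1):
    """g1[a1][a2] = player 1's payoff.  Player 2 (the opponent) picks the
    column that minimizes player 1's best-response payoff."""
    return min(
        max(g1[a1][a2] for a1 in range(len(g1)))
        for a2 in range(len(g1[0]))
    )

# Standard prisoner's dilemma payoffs for player 1 (rows/cols: C, D),
# chosen purely for illustration.
g1_pd = [[3, 0],
         [4, 1]]
v1 = pure_minmax_p1(g1_pd)  # opponent plays D; player 1's best response gives 1
```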
1 Folk Theorems for Infinitely Repeated Games

The "folk theorems" for repeated games establish that if the players are patient enough, then any feasible and strictly individually rational payoff can be obtained as an equilibrium payoff. We start with the Nash folk theorem, which obtains feasible and strictly individually rational payoffs as Nash equilibria of the repeated game.

Theorem 1 (Nash folk theorem) For all $v = (v_1, \ldots, v_I) \in V^*$, there exists some $\bar{\delta} < 1$ such that for all $\delta \ge \bar{\delta}$, there is a Nash equilibrium of the infinitely repeated game $G^\infty(\delta)$ with payoffs $v$.
Proof: Assume first that there exists a pure action profile $a = (a_1, \ldots, a_I)$ such that $g_i(a) = v_i$ (otherwise, we need a public randomization argument; see Fudenberg and Tirole, Theorem 5.1).

Consider the following strategy for player $i$:

I. Play $a_i$ in period 0, and continue to play $a_i$ so long as either the realized action in the previous period was $a$, or the realized action in the previous period differed from $a$ in two or more components. If only one player deviated, go to II.

II. If player $j$ deviated, play $m^j_i$ thereafter.
We next check whether player $i$ can gain by deviating from this strategy profile. If $i$ follows the strategy, his payoff is $v_i$. If $i$ deviates from the strategy in some period $t$, then, denoting $\bar{v}_i = \max_a g_i(a)$, the most that player $i$ could get is
$$(1-\delta)\left[ v_i + \delta v_i + \cdots + \delta^{t-1} v_i + \delta^t \bar{v}_i + \delta^{t+1} \underline{v}_i + \delta^{t+2} \underline{v}_i + \cdots \right].$$
Hence, following the suggested strategy is optimal if
$$v_i \ge (1-\delta^t) v_i + \delta^t (1-\delta) \bar{v}_i + \delta^{t+1} \underline{v}_i,$$
which rearranges to
$$\delta^t \left[ v_i - (1-\delta) \bar{v}_i - \delta \underline{v}_i \right] \ge 0.$$
The expression in brackets is nonnegative for any $\delta \ge \bar{\delta}$, where
$$\bar{\delta} = \max_i \frac{\bar{v}_i - v_i}{\bar{v}_i - \underline{v}_i},$$
thus completing the proof. □
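The threshold $\bar{\delta}$ from the proof is easy to evaluate numerically. A minimal sketch, using a standard prisoner's dilemma with payoffs assumed for illustration (target payoff 3, best deviation payoff 4, minmax payoff 1 for each player):

```python
# delta_bar = max_i (vbar_i - v_i) / (vbar_i - vunderline_i): the patience
# threshold from the Nash folk theorem proof (illustrative numbers).

def folk_threshold(target, best_dev, minmax):
    """target: payoffs v_i to sustain; best_dev: vbar_i = max_a g_i(a);
    minmax: the minmax payoffs vunderline_i."""
    return max((b - v) / (b - m) for v, b, m in zip(target, best_dev, minmax))

# Prisoner's dilemma: sustain (3, 3); a deviator gets at most 4; minmax is 1.
delta_bar = folk_threshold(target=(3, 3), best_dev=(4, 4), minmax=(1, 1))
# Any delta >= 1/3 makes the grim trigger strategy a Nash equilibrium here.
```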
The Nash folk theorem states that essentially any payoff can be obtained as a Nash equilibrium when players are patient enough. However, the corresponding strategies involve unrelenting punishments, which may be very costly for the punisher to carry out (i.e., they represent non-credible threats). Hence, the strategies used are not subgame perfect. The next example illustrates this fact.
¹Note that the strategy constructed here is a grim trigger strategy with punishment by minmax.
Example: Consider the following stage game:

             L            R
    U      6, 6       0, −100
    D      7, 1       0, −100

The unique NE in this game is (D, L). It can also be seen that the minmax payoffs are given by
$$\underline{v}_1 = 0, \qquad \underline{v}_2 = 1,$$
and the minmax strategy against player 1 is for player 2 to play R.
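These values can be verified with a quick pure-strategy computation (an illustrative check, not part of the notes):

```python
# Payoffs of the example game (rows U, D for player 1; columns L, R for player 2).
g1 = [[6, 0],
      [7, 0]]         # player 1's payoffs
g2 = [[6, -100],
      [1, -100]]      # player 2's payoffs

# Player 1's minmax: player 2 picks the column minimizing player 1's best response.
v1 = min(max(g1[r][c] for r in range(2)) for c in range(2))  # column R gives 0
# Player 2's minmax: player 1 picks the row minimizing player 2's best response.
v2 = min(max(g2[r][c] for c in range(2)) for r in range(2))  # row D gives 1
```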
The Nash folk theorem says that (6, 6) is attainable as a Nash equilibrium payoff of the repeated game, but the strategies suggested in the proof require player 2 to play R in every period following a deviation. While this punishment hurts player 1, it hurts player 2 far more; it seems unreasonable to expect her to carry out the threat.

Our next step is to obtain the payoff (6, 6) in the above example, or more generally the set of feasible and strictly individually rational payoffs, as subgame perfect equilibrium payoffs of the repeated game.
Perfect Folk Theorems: The first perfect folk theorem shows that any payoff above the static Nash payoffs can be sustained as a subgame perfect equilibrium of the repeated game. This weaker theorem is sometimes referred to as the Nash-threats folk theorem.

Theorem 2 (Friedman) Let $\alpha^*$ be a static equilibrium of the stage game with payoffs $e$. Then for any feasible payoff $v$ with $v_i > e_i$ for all $i \in \mathcal{I}$, there exists some $\bar{\delta} < 1$ such that for all $\delta \ge \bar{\delta}$, there is a subgame perfect equilibrium of $G^\infty(\delta)$ with payoffs $v$.

The proof of this theorem involves constructing grim trigger strategies with punishment by the static Nash equilibrium in phase II, thus satisfying sequential rationality in subgames where some player has deviated.
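The limits of Theorem 2 show up in the earlier example: the unique static NE (D, L) gives payoffs e = (7, 1), so the cooperative payoff (6, 6) is not covered by Friedman's theorem even though it is feasible and strictly individually rational. A small check, using the payoffs from that example:

```python
# Friedman's theorem requires v_i > e_i for all i, where e is a static NE payoff.
e = (7, 1)        # static NE payoffs at (D, L) in the example game
v = (6, 6)        # target cooperative payoff
minmax = (0, 1)   # minmax payoffs from the example

friedman_applies = all(vi > ei for vi, ei in zip(v, e))   # fails: 6 < 7
strictly_ir = all(vi > mi for vi, mi in zip(v, minmax))   # holds: 6 > 0, 6 > 1
```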
This theorem raises the following questions:
• What happens if the static NE payoff vector is strictly greater than the minmax payoff vector (in some components)?
• Does the requirement of subgame perfection restrict the set of equilibrium payoffs?
The perfect folk theorem of Fudenberg and Maskin shows that it does not: for any feasible, strictly individually rational payoff vector, there is a range of discount factors for which that payoff vector can be obtained as a subgame perfect equilibrium.
Theorem 3 (Fudenberg and Maskin) Assume that the dimension of the set $V$ of feasible payoffs is equal to the number of players $I$. Then, for any $v \in V$ with $v_i > \underline{v}_i$ for all $i$, there exists a discount factor $\bar{\delta} < 1$ such that for all $\delta \ge \bar{\delta}$, there is a subgame perfect equilibrium of $G^\infty(\delta)$ with payoffs $v$.
See Fudenberg and Tirole, Chapter 5, Theorem 5.4.

Remarks:

1. The strategies constructed to prove the above theorem involve punishments and rewards.

2. The assumption on the dimension of $V$ ensures that each player $i$ can be singled out for punishment in the event of a deviation. This assumption could be weakened, but cannot be completely dispensed with (see Homework 4).

3. Folk theorem for finitely repeated games:
• If the stage game has a unique Nash equilibrium, then by backward induction, the unique perfect equilibrium is to play the static Nash equilibrium in every period.
• With multiple stage Nash equilibria, one can use rewards and punishments to construct other subgame perfect equilibria (see Homework 4).

4. Equilibrium concepts do very little to pin down the play by patient players!
2 Repeated Games with Imperfect Monitoring

In the previous section, we assumed that in the repeated play of the game, each player observes the actions of others at the end of each stage. Here, we consider repeated games in which players' actions may not be directly observable. We assume that players observe only an imperfect signal of the stage game actions.
Motivational Example: Cournot competition with noisy demand (Green and Porter, 1984). Firms set output levels $q_1^t, \ldots, q_I^t$ privately at time $t$. The level of demand is stochastic. Each firm's payoff depends on its own output and on the publicly observed market price. Firms do not observe each other's output levels. The market price depends on the uncertain demand and the total output; for example, a low market price could be due to either high total output or low demand.

We will focus on games with public information: at the end of each period, all players observe a public outcome, which is correlated with the vector of stage game actions. Each player's realized payoff depends only on his own action and the public outcome.
2.1 Model

We state the components of the model for a stage game with finite actions and a public outcome that can take finitely many values. The model extends to the general case with straightforward modifications.

• Let $A_1, \ldots, A_I$ be finite action sets.

• Let $y$ denote the publicly observed outcome, which lies in a finite set $Y$. Each action profile $a$ induces a probability distribution over $y \in Y$; let $\pi(y, a)$ denote the probability of outcome $y$ under action profile $a$.

• Player $i$'s realized payoff is given by $r_i(a_i, y)$. Note that this payoff depends on the actions of the other players only through their effect on the distribution of $y$.
• Player $i$'s expected stage payoff is given by
$$g_i(a) = \sum_{y \in Y} \pi(y, a)\, r_i(a_i, y).$$

• A mixed strategy is $\alpha_i \in \Delta(A_i)$. Payoffs are defined in the obvious way.

• The public information at the start of period $t$ is $h^t = (y^0, \ldots, y^{t-1})$.

• We consider public strategies for player $i$, i.e., a sequence of maps $s_i^t : h^t \to A_i$.

• Player $i$'s average discounted payoff when the action sequence $\{a^t\}$ is played is given by
$$(1-\delta) \sum_{t=0}^{\infty} \delta^t g_i(a^t).$$
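The expected stage payoff formula above can be sketched directly; the two-signal distribution and payoff numbers below are invented for illustration, not taken from the lecture:

```python
# g_i(a) = sum_y pi(y, a) * r_i(a_i, y): expected stage payoff from the
# signal distribution induced by the action profile a.

def expected_stage_payoff(pi_a, r_i, a_i):
    """pi_a: dict mapping each public outcome y to its probability under a;
    r_i(a_i, y): player i's realized payoff."""
    return sum(prob * r_i(a_i, y) for y, prob in pi_a.items())

# Hypothetical signal distribution under some profile a, and a payoff that
# happens to depend only on the signal.
pi_a = {"good": 0.7, "bad": 0.3}
r_i = lambda a_i, y: 3 if y == "good" else -1
g = expected_stage_payoff(pi_a, r_i, "C")  # 0.7*3 + 0.3*(-1) = 1.8
```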
2.2 Example: Noisy Prisoner's Dilemma

Actions: $(a_1, a_2)$, where $a_i \in \{C, D\}$.

Public signal: $p$, whose distribution depends on the action profile:
$$a_1 = a_2 = C \;\Rightarrow\; p = X, \qquad a_1 \ne a_2 \;\Rightarrow\; p = X - 2, \qquad a_1 = a_2 = D \;\Rightarrow\; p = X - 4,$$
where $X$ is a continuous random variable with cumulative distribution function $F(x)$ and $E[X] = 0$.

Payoffs:
$$r_1(C, p) = 1 + p, \quad r_1(D, p) = 4 + p, \qquad r_2(C, p) = 1 + p, \quad r_2(D, p) = 4 + p.$$

The payoff matrix for this game takes the following form:

              C                     D
    C   (1 + X, 1 + X)     (−1 + X, 2 + X)
    D   (2 + X, −1 + X)    (X, X)
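Since $E[X] = 0$, taking expectations of the matrix above collapses the game to a standard prisoner's dilemma in expected payoffs; a quick check of that reduction (illustrative code, not from the notes):

```python
# Expected payoffs in the noisy prisoner's dilemma: E[p] is 0, -2, or -4
# depending on the action profile, and r_i(C, p) = 1 + p, r_i(D, p) = 4 + p.

def expected_signal(a1, a2):
    if a1 == a2 == "C":
        return 0       # p = X, and E[X] = 0
    if a1 == a2 == "D":
        return -4      # p = X - 4
    return -2          # p = X - 2 when exactly one player defects

def expected_payoffs(a1, a2):
    p = expected_signal(a1, a2)
    r = lambda a: (1 + p) if a == "C" else (4 + p)
    return (r(a1), r(a2))
```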
2.2.1 Trigger-price Strategy

Consider the following trigger-price strategy for the noisy prisoner's dilemma:

• (I) Play (C, C) until $p \le p^*$, then go to Phase II.

• (II) Play (D, D) for $T$ periods, then go back to Phase I.

Notice that the strategy is stationary and symmetric, and that the punishment phase uses the static NE. We next show that we can choose $p^*$ and $T$ such that the proposed strategy profile is an SPE.
First, we find the continuation payoffs. In Phase I, if players do not deviate, their expected payoff $v$ satisfies
$$v = (1-\delta)(1 + 0) + \delta \left[ F(p^*) \delta^T v + \left(1 - F(p^*)\right) v \right].$$
From this equation, we obtain
$$v = \frac{1-\delta}{1 - \delta^{T+1} F(p^*) - \delta \left(1 - F(p^*)\right)}. \qquad (1)$$
If a player deviates, his payoff satisfies
$$v_d = (1-\delta)(2 + 0) + \delta \left[ F(p^* + 2) \delta^T v + \left(1 - F(p^* + 2)\right) v \right]. \qquad (2)$$
Note that deviating provides a higher immediate payoff, but increases the probability of entering Phase II. In order for the strategy to be an SPE, the expected gain from the deviation must not be positive.

Incentive Compatibility Constraint: $v \ge v_d$, i.e.,
$$v \ge (1-\delta)\, 2 + F(p^* + 2)\, \delta^{T+1} v + \left[1 - F(p^* + 2)\right] \delta v. \qquad (3)$$
Substituting for $v$ from (1), this reduces to
$$\frac{1}{1 - \delta^{T+1} F(p^*) - \delta \left(1 - F(p^*)\right)} \;\ge\; \frac{2}{1 - \delta^{T+1} F(p^* + 2) - \delta \left(1 - F(p^* + 2)\right)}. \qquad (4)$$
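Equations (1) and (4) are straightforward to evaluate for concrete parameters. The sketch below assumes (none of this is specified in the notes) that X is standard normal, with $\delta = 0.9$, $T = 5$ punishment periods, and trigger $p^* = -1$:

```python
import math

def Phi(x):
    """Standard normal CDF, used here as the distribution F of X (assumption)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

delta, T, p_star = 0.9, 5, -1.0

D1 = 1 - delta**(T + 1) * Phi(p_star) - delta * (1 - Phi(p_star))
D2 = 1 - delta**(T + 1) * Phi(p_star + 2) - delta * (1 - Phi(p_star + 2))

v = (1 - delta) / D1          # equilibrium payoff, equation (1)
ic_holds = 1 / D1 >= 2 / D2   # incentive compatibility, equation (4)
# Here v is roughly 0.63 < 1: punishment phases are occasionally triggered
# by bad demand shocks even when no one deviates, so the equilibrium payoff
# falls short of full cooperation.
```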
Any $T$ and $p^*$ that satisfy this constraint yield an SPE. The best possible trigger-price equilibrium is found by maximizing $v$ subject to the incentive compatibility constraint.
Question: Are there other equilibria with higher payoffs?

There is an explicit characterization of the set of equilibrium payoffs using a multi-agent dynamic-programming-type operator, due to Abreu, Pearce, and Stacchetti (1990). The set of equilibrium payoffs is a fixed point of this operator, and value iteration converges to the set of equilibrium payoffs.

For more on this topic, see Fudenberg and Tirole, Section 5.5, and Abreu, Pearce, and Stacchetti (1990).
References

[1] Abreu D., Pearce D., and Stacchetti E., "Toward a theory of discounted repeated games with imperfect monitoring," Econometrica, vol. 58, pp. 1041-1064, 1990.

[2] Green E. and Porter R., "Noncooperative collusion under imperfect price information," Econometrica, vol. 52, pp. 87-100, 1984.