
Macro Theory B

Dynamic Programming

Ofer Setty∗

The Eitan Berglas School of Economics
Tel Aviv University

March 7, 2013

1 Dynamic optimization with finite horizon

The economy has a social planner whose horizon is finite, T.

The planner's problem is:

$$\max_{\{c_t,\,k_{t+1}\}_{t=0}^{T}} \; U(c_0, c_1, \ldots, c_T) = \sum_{t=0}^{T} \beta^t u(c_t), \qquad 0 < \beta < 1,$$

subject to

$$c_t + k_{t+1} \le f(k_t) + (1-\delta)k_t,$$

$$k_0 \text{ given},$$

$$c_t \ge 0, \qquad t = 0, 1, \ldots, T,$$

$$k_{t+1} \ge 0, \qquad t = 0, 1, \ldots, T.$$

We assume:

∗ These notes are heavily based on the notes used by Zvi Hercowitz in Semester Bet 2010. I thank Zvi for allowing me to use those notes.


1. Time-separability of preferences, so, for example, there are no habits or durable goods.

2. u_1(c) > 0, u_11(c) < 0, lim_{c→∞} u_1(c) = 0, lim_{c→0} u_1(c) = ∞. The latter assumption implies that c will never be zero; hence the non-negativity constraint on consumption never binds and can be omitted.

3. f_1(k) > 0, f_11(k) < 0, lim_{k→∞} f_1(k) = 0, lim_{k→0} f_1(k) = ∞. The latter assumption implies that k_t > 0 for t = 1, ..., T, and hence the restriction k_{t+1} ≥ 0 for t = 0, ..., T − 1 can be omitted. However, we still need the restriction k_{T+1} ≥ 0; otherwise the planner would let k_{T+1} go to −∞. We expect k_{T+1} to be 0.

4. There are no financial assets, only physical investment. This corresponds to a planning problem in a closed economy.

The Lagrangian function is:

$$\mathcal{L} = \sum_{t=0}^{T} \beta^t \left\{ u(c_t) + \lambda_t \left[ f(k_t) + (1-\delta)k_t - c_t - k_{t+1} \right] \right\} + \mu_T \beta^T k_{T+1}.$$

See below for the interpretation of μ_T.

The first-order conditions are:

$$\frac{\partial \mathcal{L}}{\partial c_t} = \beta^t \left[ u_1(c_t) - \lambda_t \right] = 0, \qquad t = 0, 1, \ldots, T, \tag{1}$$

$$\frac{\partial \mathcal{L}}{\partial k_{t+1}} = -\beta^t \lambda_t + \beta^{t+1} \lambda_{t+1} \left[ f_1(k_{t+1}) + 1 - \delta \right] = 0, \qquad t = 0, 1, \ldots, T-1, \tag{2}$$

$$\frac{\partial \mathcal{L}}{\partial k_{T+1}} = \beta^T \left( -\lambda_T + \mu_T \right) = 0. \tag{3}$$

The optimality conditions can be formulated as Kuhn-Tucker conditions, a generalization of the Lagrange multiplier method to nonlinear constraints that may not bind.


The Kuhn-Tucker conditions are the conditions above plus:

$$\lambda_t \left[ f(k_t) + (1-\delta)k_t - c_t - k_{t+1} \right] = 0, \qquad t = 0, 1, \ldots, T,$$

$$\lambda_t \ge 0, \qquad t = 0, 1, \ldots, T,$$

$$\mu_T \beta^T k_{T+1} = 0,$$

$$\mu_T \ge 0.$$

Discussion of the solution: Given that u_1(c_t) > 0 for t = 0, 1, ..., T, equation (1) implies

$$\lambda_t > 0, \qquad t = 0, 1, \ldots, T.$$

This implies two things: (1) the resource constraints bind at all times, and (2) from (3), μ_T > 0, which in turn implies that k_{T+1} = 0. Substituting (1) into (2), the dynamic behavior is the solution to the system

$$0 = -u_1\big(f(k_t) + (1-\delta)k_t - k_{t+1}\big) + \beta u_1\big(f(k_{t+1}) + (1-\delta)k_{t+1} - k_{t+2}\big)\left[f_1(k_{t+1}) + 1 - \delta\right], \qquad t = 0, 1, \ldots, T-1,$$

$$k_0 \text{ given},$$

$$0 = \beta^T \mu_T k_{T+1}.$$

The first is called the "Euler equation," the second is the initial condition, and the third is the "transversality" condition, or final condition.


Interpretation of μ_T: In general the constraint would be

$$k_{T+1} - \chi \ge 0, \qquad \chi \ge 0.$$

Hence, μ_T represents the marginal utility of lowering χ, i.e., how much we lose if we are required to leave some capital behind.

The system is a second-order difference equation in k. These are T + 2 equations in T + 2 unknowns: k_0, k_1, ..., k_{T+1}. The two equations for the initial and terminal conditions directly solve for two of the unknowns, leaving the T Euler equations in T unknowns.
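To make this concrete, the system can also be solved numerically by shooting: guess k_1, iterate the Euler equation and the resource constraint forward, and adjust the guess until the terminal condition k_{T+1} = 0 holds. The sketch below specializes to u(c) = ln c, f(k) = k^α, δ = 1; the parameter values and function names are illustrative choices of mine, not from the notes.

```python
# Shooting on k_1 for the finite-horizon planner's problem in the special
# case u(c) = ln(c), f(k) = k**alpha, delta = 1 (an illustrative assumption).
# With u_1(c) = 1/c the Euler equation gives c_{t+1} = beta*alpha*k_{t+1}**(alpha-1)*c_t.

def terminal_capital(k1, k0, alpha, beta, T):
    """Iterate the Euler equation and resource constraint forward from a
    guess for k_1 and return the implied terminal capital k_{T+1}."""
    k_next = k1
    c = k0**alpha - k_next                 # resource constraint at t = 0
    for _ in range(T):
        c = beta * alpha * k_next**(alpha - 1.0) * c    # Euler equation
        k_next = k_next**alpha - c                      # resource constraint
        if k_next <= 0.0:
            return k_next                  # path collapsed: k_1 was too low
    return k_next

def solve_k1(k0, alpha, beta, T, iters=200):
    """Bisect on k_1 until the terminal condition k_{T+1} = 0 holds."""
    lo, hi = 0.0, k0**alpha                # feasible range for k_1
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if terminal_capital(mid, k0, alpha, beta, T) > 0.0:
            hi = mid                       # too much capital left over
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

For this special case the exact solution is k_1 = αβ(1 − (αβ)^T)/(1 − (αβ)^{T+1}) k_0^α, which the bisection reproduces.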

Interpretation of the Euler equation:

$$u_1(c_t) = \beta u_1(c_{t+1}) \left[ f_1(k_{t+1}) + 1 - \delta \right].$$

The marginal utility cost of investment, in terms of forgone utility from consumption, equals the discounted value of the physical return next period, translated into utility terms.

2 Dynamic optimization with infinite horizon

The main reason for considering the infinite horizon is intergenerational altruism: parents care about their children, and know that they will care about their children, and so on.

$$\max_{\{c_t,\,k_{t+1}\}_{t=0}^{\infty}} \; \sum_{t=0}^{\infty} \beta^t u(c_t), \qquad 0 < \beta < 1,$$

subject to

$$c_t + k_{t+1} \le f(k_t) + (1-\delta)k_t,$$

$$k_0 \text{ given},$$

$$\lim_{T \to \infty} \beta^T u_1(c_T) k_T = 0.$$


The last equation is the "transversality condition," a counterpart to the final condition in the finite-horizon case; there, the final condition was β^T u_1(c_T) k_{T+1} = 0.

The solution to the problem is the decision rule

$$k_{t+1} = g(k_t).$$

An optimal decision rule is a mapping from the state to decisions that maximizes the objective, i.e., the optimal choice for any possible value of the state. We need to find the g(·) that solves the optimization problem.

The Euler equation, as seen previously, is

$$-u_1\big(f(k) + (1-\delta)k - k'\big) + \beta u_1\big(f(k') + (1-\delta)k' - k''\big)\big(f_1(k') + 1 - \delta\big) = 0.$$

A prime indicates one period ahead, and a double prime two periods ahead.

As mentioned earlier, this is a second-order non-linear difference equation in k. Because of the infinite number of periods, the solution method used previously, solving a set of equations with an equal number of unknowns, is not applicable here. Instead, we solve the problem using a Dynamic Programming procedure.

2.1 The Bellman equation

The solution of the problem is the decision rule:

$$k' = g(k).$$

The logic of infinite-horizon Dynamic Programming:

First, because of the infinite horizon, the planning horizon is constant over time, and hence the decision rule is also constant over time. In particular, the decision rule this period is the same as the decision rule next period.

Second, this feature turns the dynamic problem, where the current decision depends on the future decision, into a functional equation in one functional unknown.

Define the "value" function V(·) as the maximized objective given the initial condition:

$$V(k_0) = \max \sum_{t=0}^{\infty} \beta^t u\big(f(k_t) + (1-\delta)k_t - k_{t+1}\big).$$

The Bellman (Richard E. Bellman, 1920–1984) functional equation is

$$V(k) = \max_{k'} \left[ u\big(f(k) + (1-\delta)k - k'\big) + \beta V(k') \right]. \tag{4}$$

V(k) is the solution to this equation. The first-order condition is:

$$-u_1\big(f(k) + (1-\delta)k - k'\big) + \beta V_1(k') = 0. \tag{5}$$

This equation implicitly defines k' = g(k). If we knew the function V we could solve for g. The only problem is that we will know V only once the problem is solved.

Note that V(k_0) is the value function and that V(k) = max_{k'} [u(f(k) + (1 − δ)k − k') + βV(k')] is the Bellman equation. Do not mix up the two.

2.2 The Envelope theorem and first-order conditions

Note that (5) should be identical to the original Euler equation. This can be shown as follows.

Applying the envelope theorem to (4), V_1(k) becomes

$$V_1(k) = u_1(c)\left[f_1(k) + (1-\delta)\right].$$

Iterating this forward one period and substituting into (5) yields:

$$-u_1(c) + \beta u_1(c')\left[f_1(k') + (1-\delta)\right] = 0,$$

which is the Euler equation in its original form.

2.3 Solution methods

2.3.1 Value function iteration

If we have no educated guess for the functional form of the value function, we can start from any function. Under some regularity conditions the Bellman operator is a contraction mapping, so the iteration is guaranteed to converge to the optimal value function.

Practically, to solve the functional equation we conjecture a form for the function V(k), say V^1(k), compute V^1_1(k'), and substitute into (5):

$$-u_1\big(f(k) + (1-\delta)k - k'\big) + \beta V_1^1(k') = 0.$$

This is one equation in k'. The solution is:

$$k' = g^1(k).$$

Substitute this into the Bellman equation to get the resulting value function:

$$V^2(k) = u\big(f(k) + (1-\delta)k - g^1(k)\big) + \beta V^1\big(g^1(k)\big).$$

If V^1 = V^2, we are done. Otherwise, numerical methods can be used to approximate V by successive iterations until convergence. Note that convergence of V involves convergence of g.
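The iteration just described can be implemented on a grid of capital values. The sketch below specializes to δ = 1, f(k) = k^α, u(c) = ln c; the grid bounds and parameter values are illustrative choices, not from the notes.

```python
import numpy as np

# Value function iteration on a capital grid for the special case
# delta = 1, f(k) = k**alpha, u(c) = ln(c).  Parameters and grid
# bounds are illustrative assumptions.
alpha, beta = 0.3, 0.9
grid = np.linspace(0.05, 0.5, 500)            # grid for both k and k'
V = np.zeros(grid.size)                        # initial guess V^1 = 0

# consumption c(k, k') for every pair; infeasible pairs get -inf utility
c = grid[:, None] ** alpha - grid[None, :]
util = np.where(c > 0.0, np.log(np.maximum(c, 1e-300)), -np.inf)

for _ in range(2000):
    V_new = np.max(util + beta * V[None, :], axis=1)   # Bellman operator
    if np.max(np.abs(V_new - V)) < 1e-10:              # sup-norm convergence
        break
    V = V_new

# decision rule: the maximizing k' at each grid point
policy = grid[np.argmax(util + beta * V[None, :], axis=1)]
```

With these parameters the computed rule is close, up to grid error, to the closed-form rule k' = αβk^α derived in the example that follows.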

7


This example, with δ = 1, f(k) = k^α and u(c) = ln c, can actually be solved by hand.

$$V(k) = \max_{k'} \left[\ln(k^\alpha - k') + \beta V(k')\right]. \tag{6}$$

Guess V^1(k) = 0. In this case it is optimal to choose k' = 0. Then:

$$V(k) = \ln(k^\alpha) = \alpha \ln(k). \tag{7}$$

Therefore V^2(k) = α ln(k), and

$$V(k) = \max_{k'} \left[\ln(k^\alpha - k') + \beta\alpha \ln(k')\right]. \tag{8}$$

The FOC is:

$$\frac{1}{k^\alpha - k'} = \frac{\beta\alpha}{k'},$$

so

$$k' = \beta\alpha k^\alpha - \beta\alpha k', \qquad k' = \frac{\beta\alpha}{1+\beta\alpha}\, k^\alpha.$$

Then

$$\begin{aligned}
V^3(k) &= \ln(k^\alpha - k') + \beta\alpha \ln(k') \\
&= \ln\left(k^\alpha - \frac{\beta\alpha}{1+\beta\alpha}\, k^\alpha\right) + \beta\alpha \ln\left(\frac{\beta\alpha}{1+\beta\alpha}\, k^\alpha\right) \\
&= \ln(k^\alpha) + \ln\left(1 - \frac{\beta\alpha}{1+\beta\alpha}\right) + \beta\alpha \ln(k^\alpha) + \beta\alpha \ln\left(\frac{\beta\alpha}{1+\beta\alpha}\right) \\
&= \alpha(1+\alpha\beta)\ln(k) + \ln\left(\frac{1}{1+\beta\alpha}\right) + \beta\alpha \ln\left(\frac{\beta\alpha}{1+\beta\alpha}\right). \end{aligned} \tag{9}$$

Continuing until convergence:

$$V(k) = \frac{\alpha}{1-\alpha\beta}\ln(k) + \frac{1}{1-\beta}\left[\frac{\alpha\beta}{1-\alpha\beta}\ln(\alpha\beta) + \ln(1-\alpha\beta)\right]. \tag{10}$$


2.3.2 Guess and verify

Since we already know the solution, we guess the functional form and find its parameters:

$$V(k) = E + F\ln(k),$$

$$V(k) = \max_{k'}\left[\ln(k^\alpha - k') + \beta\big(E + F\ln(k')\big)\right].$$

From the FOC:

$$k' = \frac{\beta F}{1+\beta F}\, k^\alpha.$$

Substituting into the Bellman equation and matching coefficients gives:

$$F = \frac{\alpha}{1-\beta\alpha}, \qquad E = \frac{\dfrac{\alpha\beta}{1-\beta\alpha}\ln(\alpha\beta) + \ln(1-\beta\alpha)}{1-\beta}.$$
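These coefficients can be checked numerically: with E and F as above, the maximized right-hand side of the Bellman equation should reproduce V(k) = E + F ln(k) at any k. A minimal sketch, with illustrative parameter values:

```python
import math

# Check that V(k) = E + F*ln(k) satisfies the Bellman equation in the
# delta = 1, f(k) = k**alpha, u(c) = ln(c) case (illustrative parameters).
alpha, beta = 0.3, 0.9
F = alpha / (1.0 - beta * alpha)
E = ((alpha * beta / (1.0 - beta * alpha)) * math.log(alpha * beta)
     + math.log(1.0 - beta * alpha)) / (1.0 - beta)

def bellman_rhs(k):
    """Maximized right-hand side, using the FOC solution
    k' = beta*F/(1 + beta*F) * k**alpha."""
    k_next = beta * F / (1.0 + beta * F) * k ** alpha
    return math.log(k ** alpha - k_next) + beta * (E + F * math.log(k_next))

# at any k > 0, bellman_rhs(k) equals E + F*ln(k) up to rounding
```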

2.3.3 Policy function iteration (Euler equation solution)

The Euler equation can be written as

$$u_1\big(f(k) + (1-\delta)k - k'\big) = \beta u_1\big(f(k') + (1-\delta)k' - g(k')\big)\left[f_1(k') + (1-\delta)\right],$$

which also implicitly defines k' = g(k). It can also be expressed as a functional equation in g(·), just as the Bellman equation is in V(·):

$$u_1\big(f(k) + (1-\delta)k - g(k)\big) = \beta u_1\big(f(k') + (1-\delta)k' - g(k')\big)\left[f_1(k') + (1-\delta)\right].$$

Practically, we conjecture a solution g^1(k) and solve

$$u_1\big(f(k) + (1-\delta)k - k'\big) = \beta u_1\big(f(k') + (1-\delta)k' - g^1(k')\big)\left[f_1(k') + (1-\delta)\right].$$

This is one equation in k'. The solution is:

$$k' = g^2(k).$$

If g^1 = g^2, we are done. Otherwise, numerical methods can be used to approximate g by successive iterations until convergence.
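In the δ = 1, f(k) = k^α, u(c) = ln c example this iteration can be carried out within the class of rules g(k) = s k^α: with u_1(c) = 1/c, solving the Euler equation for k' when g^n(k) = s_n k^α gives an updated rule of the same form with s_{n+1} = βα/(1 − s_n + βα). A sketch of this update, with illustrative parameter values (the parametric reduction is my own specialization, not from the notes):

```python
# Policy function iteration within the class g(k) = s*k**alpha, for the
# special case delta = 1, f(k) = k**alpha, u(c) = ln(c) (illustrative
# parameters).  Each Euler-equation update maps the savings rate s into
# s' = beta*alpha / (1 - s + beta*alpha).
alpha, beta = 0.3, 0.9
s = 0.0                                     # initial guess: g^1(k) = 0
for _ in range(200):
    s_new = beta * alpha / (1.0 - s + beta * alpha)   # Euler-equation update
    if abs(s_new - s) < 1e-14:              # successive rules agree: stop
        break
    s = s_new
s = s_new
```

The fixed point of this update is s = αβ, the known exact savings rate for this example.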

2.3.4 Euler-equation linearization

The Dynamic Programming procedures above yield closed-form solutions for the value function and the decision rule only in a few special cases (such as the log-utility example above). In general there is no exact solution, and the solution is an approximation. One type of approximation is discretizing the state space and iterating numerically. Another is approximating the equations themselves. Such an approximation is obtained by:

1. Linearizing the FOCs. This amounts to treating second derivatives as constants (third derivatives are assumed to be zero). This procedure is good over a range around the steady state for which linearity of the first derivatives is reasonable.

2. Applying the DP method to the linearized model.

The model is linearized around the steady state, so the first step is to compute the steady state. The Euler equation and the resource constraint in this case take the form:


$$\frac{1}{\beta} = f_1(k^*) + 1 - \delta,$$

$$c^* = f(k^*) - \delta k^*.$$

Notes

• In the steady state c_t = c_{t+1} = c^* and, similarly, k_t = k_{t+1} = k^*.

• The first equation is implied by the Euler equation: we know that if βR = 1 then consumption is constant; here we go in the opposite direction.

These are two equations in the two unknowns k^* and c^*. Let us define:

$$u_1 \equiv u_1(c^*), \qquad u_{11} \equiv u_{11}(c^*), \qquad f_1 \equiv f_1(k^*), \qquad f_{11} \equiv f_{11}(k^*).$$

We turn now to the linearization around the steady state of

$$u_1(c) - \beta u_1(c')\left[f_1(k') + (1-\delta)\right] = 0,$$

$$c = f(k) + (1-\delta)k - k',$$

$$c' = f(k') + (1-\delta)k' - k''.$$

Using a first-order Taylor expansion, the FOC can be written as:


$$u_{11}\,dc - \beta\left[u_{11}\,dc'\,(f_1 + 1 - \delta) + u_1 f_{11}\,dk'\right] = 0,$$

or, using β(f_1 + 1 − δ) = 1,

$$u_{11}\,dc - u_{11}\,dc' - \beta u_1 f_{11}\,dk' = 0.$$

Using the same methodology, the resource constraint is:

$$dc = (f_1 + 1 - \delta)\,dk - dk' = \frac{1}{\beta}\,dk - dk'.$$

The resource constraint iterated forward is:

$$dc' = (f_1 + 1 - \delta)\,dk' - dk'' = \frac{1}{\beta}\,dk' - dk''.$$

Substituting dc and dc' into the Euler equation yields:

$$\frac{u_{11}}{\beta}\,dk - u_{11}\left(1 + \frac{1}{\beta} + \beta\,\frac{u_1}{u_{11}}\,f_{11}\right)dk' + u_{11}\,dk'' = 0,$$

where again we used β(f_1 + 1 − δ) = 1.

Dividing through by u_{11} k^* (at the steady state k = k' = k'' = k^*), and defining x̂ ≡ dx/x:

$$\frac{1}{\beta}\,\hat{k} - \left(1 + \frac{1}{\beta} + \beta\,\frac{u_1}{u_{11}}\,f_{11}\right)\hat{k}' + \hat{k}'' = 0. \tag{11}$$

Like the original non-linear equation, this is a second-order difference equation in k. The procedure is to solve it by the method of undetermined coefficients, subject to the transversality condition.
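The method of undetermined coefficients can be sketched numerically: guessing k̂' = λk̂ (and hence k̂'' = λ²k̂) in the linearized Euler equation turns it into a quadratic in λ, and the transversality condition selects the stable root. A sketch for u(c) = ln c, f(k) = k^α with illustrative parameter values (with δ = 1 the exact rule k' = αβk^α implies the stable root is exactly α, a convenient check):

```python
import math

# Solve the linearized Euler equation by undetermined coefficients for
# u(c) = ln(c), f(k) = k**alpha (parameter values are illustrative).
alpha, beta, delta = 0.3, 0.9, 1.0

# steady state: 1/beta = f_1(k*) + 1 - delta,  c* = f(k*) - delta*k*
kstar = (alpha / (1.0 / beta - 1.0 + delta)) ** (1.0 / (1.0 - alpha))
cstar = kstar ** alpha - delta * kstar

u1, u11 = 1.0 / cstar, -1.0 / cstar ** 2          # derivatives of ln(c) at c*
f11 = alpha * (alpha - 1.0) * kstar ** (alpha - 2.0)

# guessing khat' = lam*khat yields lam**2 - b*lam + 1/beta = 0, with
# b the coefficient on khat' in the linearized Euler equation
b = 1.0 + 1.0 / beta + beta * (u1 / u11) * f11
lam = (b - math.sqrt(b * b - 4.0 / beta)) / 2.0   # stable root (|lam| < 1)
# approximate decision rule near the steady state: khat' = lam * khat
```

The two roots multiply to 1/β > 1, so exactly one lies inside the unit circle; that one is the policy coefficient.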
