Dynamic Contracts Under Loss Aversion - Universidad Adolfo IbaÃ±ez

**Dynamic** **Contracts** **Under** **Loss** **Aversion**Alejandro Jofre Sofia Moroni Andrea Repetto ∗September 25, 2012AbstractWe analyze a dynamic moral hazard principal-agent model with an agent whois loss averse and whose reference updates according to the previous period’sconsumption. **Under** full commitment and when the agent has no access to creditmarkets, the optimal payment scheme can have flat segments at the reference level.This property implies that there is a positive probability of observing constantwages over time, even though the scheme has memory. Moreover, the modelpredicts a “status quo bias” -a preference for consuming his full allocation ex postwheneverthe agent is allowed to borrow or save after the outcome is realized .This result in turn implies that unlike the canonical model, the optimal contractmay be implemented even when the agent has access to a saving technology.We also show that although the optimal contract scheme is renegotiation-proof,it cannot be implemented by a sequence of spot contracts. However, when theagent has access to the credit market and the principal can monitor his savings,the long-term optimal contract is spot contractible. Finally, to deal with nondifferentiabilities,we use subdifferential calculus to compute the optimal program.1 IntroductionIn this paper we modify the classical principal-agent model with moral hazard byassuming that the agent is loss averse to payments below the previous period’sconsumption. We analyze the optimal contracting problem both when the agenthas no access to credit markets and when he does have access to credit but theprincipal can monitor his savings.The dynamic moral hazard model under the classical risk aversion assumptions∗ **Universidad** de Chile, Yale University and **Universidad** **Adolfo** Ibanez, respectively. We havegreatly benefited from comments made by Hector Chade, Adam Kapor, Juan Pablo Torres-Martinez,Felipe Balmaceda and Nicolas Figueroa. We also thank seminar participants at University of Chile,**Universidad** **Adolfo** Ibanez, Central Bank of Chile, European University, International Conference onContinuous Optimization, Latin American Econometric Society Meetings, North American EconometricSociety Summer Meetings.1

has been extensively analyzed in the literature. The characteristics of this secondbest optimal scheme are developed in Rogerson (1985) and Chiappori et al.(1994). **Under** mild conditions, the restricted savings model predicts that theagent’s compensation is strictly increasing in the current period outcome. It alsopredicts that the contract exhibits memory; i.e., the agent’s compensation in oneperiod also depends on the outcomes of all prior periods. Furthermore, as theoptimal contract smoothes the expected (inverse) of marginal utility over time,the scheme offers compensations that are front-loaded. Intuitively, since the agentneeds to rely on the principal to transfer resources over time, the principal canreduce the cost of providing effective incentives by keeping the marginal utility ofconsumption low in earlier periods. In turn, this implies that if the agent wereex-post allowed to save or borrow, he would never choose to consume more thanthe current period’s compensation. If anything, he would choose to save. Finally,when the agent has no access to a savings technology, the long-term contractcannot be implemented by a sequence of spot contracts because of the memoryin-wagesproperty.**Under** monitored savings, the classical model predicts that the principal will controlsavings to provide intertemporal consumption smoothing, and will balanceinsurance and incentives through the payment scheme. Since the long-term contracthas now memory in consumption and not in wages, it is spot-implementable.Furthermore, the long-term contract is renegotiation proof.In this paper we review the robustness of these predictions to the introduction ofreference dependence and loss aversion in the agent’s utility function. **Loss** aversionwas first proposed by Kahneman and Tversky (1979) as an essential elementof their Prospect Theory. **Under** these preferences, the dislike that consumptionbelow the reference generates is greater than the elation produced by an equallysized gain. A large and growing body of literature in economics and in cognitivepsychology supports the hypothesis that references do affect individual decisions. 1In this paper, we study a dynamic set up in which the reference is updated endogenously.Similar to Bowman et al. (1999), Munro and Sugden (2003) andDittmann et al. (2010) we assume the agent’s reference is equal to the previousperiod’s consumption; that is, in addition to the standard consumption utility,he derives gain-loss utility from comparing current consumption with lagged consumption.2This new framework confirms many of the predictions of the classical dynamic1 Rabin (1998) and Della Vigna (2009) survey this literature.2 There is little evidence on how reference levels are determined. Our assumption is also similarto models of the endowment effect that posit that willingness to pay for a good depends on recentownership status. In a dynamic setup like ours, the alternative of assuming that the reference is basedupon expectations may lead to solutions in which the agent makes plans that will not be carried out(Koszegi and Rabin, 2009). It is worth emphasizing that our model is one of fully forward-lookingbehavior: both the agent and the principal take into account the effects of their behavior on futurereferences.2

moral hazard model. In particular, we find that the optimal long-term contractis non-decreasing in outcomes, displays memory in wages and provides consumptionsmoothing. Moreover, the full commitment optimum is ex-post efficient andtherefore it is renegotiation proof. We also show that in the monitorable access tocredit case the contract is renegotiation-proof, implementable via spot contractsand it coincides with the no-access optimum.Our main finding, however, is that the optimal contract we derive displays anumber of relevant new properties. Most important, these new features of thesecond best optimal scheme are better in line with observed contracts.First, optimal wage schedules may be insensitive to outcomes in an interval, offeringto pay the reference income for a set of performance measures. Intuitively, thecost of inducing effort by increasing payments right above the reference may behigh due to the sudden decrease in marginal utility. Similarly, although effectivein providing incentives, a reduction in payment just below the reference increasesthe cost of inducing participation, again due to the discontinuity in marginal utility.In other words, it is optimal to provide full insurance locally because lossaversion involves first-order risk aversion around the reference (Segal and Spivak,1990).Second, for all periods after the first, the optimal wage schedule pays the referencefor an interval of outcomes. Except for the last period, the flat segmentmay even extend for the whole support of the outcomes’ distribution. Thus incentivesmay be optimally provided not by rewards and punishments that arecontingent on the current period’s result but by the promise of future income.Third, the fact that the payment schemes exhibit flat segments in most periodsimplies, under weak assumptions, that there is a positive probability thattwo consecutive payments are equal; e.g., that wages exhibit time persistenceas reported by Dickens et al. (2007). In contrast, the classical model predictsvariability in observed wages whenever the outcomes’ distribution is continuous.Moreover, there is a positive probability that the agent perceives a fixed wageover T −1 periods; therefore all incentives must be deferred to the last period withschemes that are at least partly sensitive to outcomes in T , a prediction consistentwith the evidence in Baker, Jensen and Murphy (1988) and in Baker, Gibbs andHolmstrom (1994).Fourth, when the agent is allowed to save or borrow after the realization of theperformance measure, we find that the agent may either choose to save, borrow orconsume his wage. In the classical model the optimal payment scheme requires theagent to consume more than he would choose in order to facilitate the provisionof incentives in future periods. In our model, this effect is also present. However,there are other effects in play that may make the agent want to borrow or consumehis full allocation in some periods. On the one hand, saving and borrowingnot only modify the intertemporal allocation of consumption but also change the3

future references. On the other hand, changes in consumption may lead to currentor future losses, motivating the agent to choose to consume his full allocation.In other words, loss aversion implies that the marginal utility of savings maynot be equal to (minus) the marginal utility of borrowing. In fact, they may beboth simultaneously negative. Therefore, whenever the agent experiences a lossby either saving or borrowing, he will prefer to consume his full wage in a mannerconsistent with the “status quo bias” of Samuelson and Zeckhauser (1988). Moreover,whenever the agent is consuming his reference in one period or there is apositive probability of being paid the reference in the next, the smallest interestrate at which the agent would be willing to lend part of his income is strictlyhigher than the rate he would be willing to pay to borrow. That is, the modelpredicts a related phenomenon (Bateman et al., 1997): a discrepancy between thewillingness to accept and the willingness to pay.An important consequence of our analysis is that the optimal payment schemedoes not always require constrained savings. Many of the canonical model’s predictionshinge on the extreme assumption that the agent’s borrowing and savingare constrained, either because credit is not available to him or because the principalcan monitor his actions in the credit market. In particular, under these assumptions,the optimal long-term contract is renegotiation-proof. However, whenthe agent does have access to the credit market and his savings and borrowing areprivate information, the full commitment long-term optimum is not renegotiationproof. In fact, the renegotiation-proof long-term contract cannot provide incentivesto exert any effort above the minimum. Since it is unlikely that a court oflaw would prevent renegotiation towards a Pareto-improving agreement and sinceconstraining savings may be implausible in most contexts, the classical theorycannot explain the existence of long-term commitment contracts (Chiappori etal., 1994). Thus loss aversion and our assumed dynamic update of the referencemight give a rationale for the ubiquity of commitment contracts.Finally, we show that the sequence of optimal spot contracts is not memorylessas in the classical case and will in general not coincide with the full-commitmentoptimum.There exists a small but growing literature on moral hazard, optimal contractsand loss aversion. This literature intends to explain the gaps between the richcontracts predicted by theoretical developments and the fairly simple contractsthat are actually observed (Prendergast, 1999; Salanie, 2003). De Meza and Webb(2007) first introduced loss aversion to the static principal agent model. Whenthe reference is either exogenous or equal to the certainty equivalent of rewards,their model predicts that over some interval pay may be insensitive to performance.Bonuses may arise when the reference is the median reward. In the lattercase, the principal can provide insurance and incentives avoiding the loss area bylowering the median and by rewarding good performance at the same time.4

Optimal binary payment schemes also arise in Herweg et al. (2010). Their oneperiodprincipal agent model assumes a piece-wise linear gain-loss function andreference formation as in Koszegi and Rabin (2006, 2007); i.e., the agent derivesgain-loss utility by comparing the actual payment with his (lagged) rational expectationsabout rewards. In particular, Koszegi and Rabin’s formulation assumesthat a stochastic outcome is evaluated according to its expected utility, with eachpayment being compared to all outcomes in the support of the reference lottery.**Under** this setting, the optimal scheme is a lump sum bonus contract with thebonus paid whenever a certain level of performance is achieved. The intuitionis that under this reference formation process, a contract specifying many possiblewages induces the agent to make comparisons that may yield losses or gains.Given loss aversion, on net news hurt the agent, reducing his expected utility andincreasing the average payment needed to make him participate. A simple binarycontract induces effort and minimizes the cost of inducing the agent to accept thecontract. 3Iantchev (2011) provides a structural estimation of the one-period principal-agentmodel under moral hazard and a loss averse agent. The model allows for multipleprincipals and multiple agents. The reference point is determined by the expectedvalue of the payment under the equilibrium contract in the market, as defined byRayo and Becker (2007). **Under** these assumptions, optimal pay may also be insensitiveto performance in an interval. Because the agent is assumed risk-lovingin the loss space, the payment scheme displays a discontinuity whenever outputfalls below a threshold. 4To our knowledge, Macera (2012) is the only other paper that considers lossaversion in a dynamic setting of the principal-agent model. Macera (2012) extendsthe static model to a two period model in which the agent is loss averse tochanges in beliefs about present and future consumption as in Koszegi and Rabin(2009), an extension of the static environment assumptions of Koszegi and Rabin(2006, 2007). **Under** certain conditions on the relative strength of current andfuture gain-loss utility, the optimal contract offers a fixed wage in the first periodand an output contingent increasing wage scheme in the second period. That is,the principal defers incentives to the future.Our model shares many features of this growing body of literature that modifiesthe principal-agent model by assuming that the agent is loss averse. Ourmodel enriches the setting, however, by allowing for a T period relationship andby analyzing the properties of the optimal contract in addition to its complexity-3 Daido and Itoh (2006) also allow for loss aversion and the Koszegi and Rabin (2006, 2007) referenceformation. The model assumes a binary measure of performance. Unfortunately, little can be saidabout the form of the optimal contract under this binary output measure assumption.4 The discontinuous drop in payments when the observed outcome measure is low is also found inDittmann et al. (2010) who also assume risk-loving in the loss space. The paper calibrates a one periodprincipal agent model in order to explain CEO compensation. The calibration assumes, though, thatthe optimal scheme is piecewise linear.5

enegotiation proofness, spot contractibility, wage persistence and the role of theconstrained savings assumption. It also shows that predicted contracts may notbe as simple once dynamics is considered.In addition, since loss aversion induces a discontinuity in marginal utility, werigorously derive the optimality conditions using convex analysis tools. We showthat the program the principal faces has a concave objective function and thatthe feasible set is convex. Therefore, the optimum can be characterized by a “zerobelongs to the subgradient set” condition (Rockafellar, 1970 and 1974; Rockafellarand Wets, 1997). This methodology is useful in dealing with non-differentiabilitiesand the lack of validity of the usual first order conditions.The remainder of the paper is organized as follow. In Section 2 we present themodel. In Section 3 we derive the optimality conditions in the full-commitmentcase with no access to credit markets to characterize the shape of the optimalsecond best payment scheme. We also derive the main intra and intertemporalproperties of this optimal contract. In Section 4 we analyze the optimal schemeunder monitorable access to credit. In Section 5, and in order to illustrate ourfindings, we develop a numerical example of the solution of a two period model.Finally, we conclude in Section 6.2 The ModelThe model describes a repeated principal-agent problem analogous to the dynamicmoral hazard model of Rogerson (1985) and ?; e.g., we assume finite horizon, discountingand a risk neutral principal who can borrow and save at a fixed interestrate. Our model differs, however, in that we assume a loss averse agent.The relationship between the principal and the agent lasts T + 1 periods. In eachperiod i ∈ {0, . . . , T } the agent exerts an unobservable action a i ∈ {a L , a H } witha L < a H . The outcome in period i is denoted x i ∈ [x i , ¯x i ] with a differentiabledistribution function f i (x i |a i ), where a i denotes the action chosen. The distributionsof outcomes are independent across periods conditional on actions. Weassume that these distributions exhibit the Monotone Likelihood Ratio Property(MLRP); that is, if we denote fa i i(x i |a i ) = f i (x i |a H ) − f i (x i |a L ), then f a i i(x i |a i )f i (x i |a i )isnon-decreasing in x i . We let the wage schedule in period i depend on the outcomesrealized in the current and in all previous periods and denote it ω i (x 0 , x 1 , . . . , x i ).Let c i be the agent’s consumption in period i and R i > 0 the correspondingreference point. Also let ψ i (·) represent an increasing and convex cost function.Then the agent’s utility isŨ(c i , R i ) − ψ i (a i ) (1)We denote the cost difference between the low and high action as ∆ψ i = ψ i (a H )−ψ i (a L ).To allow for loss aversion we assume that the agent’s preferences are characterized6

y the following property:Ũ i (R + t, R) −limŨi(R, R) Ũ i (R − t, R) −< limŨi(R, R)t→0 + tt→0 + −tthat is, the left-sided derivative of Ũi is greater than its right-sided derivative. Inother words, we assume that the utility gain of consumption over R is lower thanthe utility loss of consumption below R in an equally sized amount. We assumethat Ũi is continuous, concave and differentiable at all points other than R, thatlim c→0 Ũ i (c, R) = −∞ and lim c→∞ Ũ i (c, R) = ∞. 5For l 0 > 0, an exogenous reference level R 0 and a smooth, concave and strictlyincreasing function U(·), given R 0 , without loss of generality, the period 0 utility,Ũ 0 can be expressed as: 6Ũ 0 (c 0 , R 0 ) = U(c 0 ) − l 0 θ(c 0 , R 0 ) (U(R 0 ) − U(c 0 ))whereθ(c, R) ={1 if c < R0 otherwise.(2)A graphical representation of the first period utility function is given in Figure2. As the reference increases from R to R ′ , utility above R ′ remains unaffected,while it drops below R ′ .For periods i > 0, we assume that the reference level depends dynamically onthe consumption that took place in the previous period; more specifically, we assumethat R i+1 equals c i . This assumption is based on psychological evidenceindicating that individual choices depend not only on current consumption butalso on previous levels of consumption (see ?). ?, ?, Iantchev (2011) and Dittmanet al. (2010) have also assumed similar forms of update. 7We assume the utility in period i + 1 takes the following form:Ũ i+1 (c i+1 , R i ) = Ũi+1(c i+1 , c i ) = U(c i+1 ) − l i+1 θ(c i+1 , c i ) (U(c i ) − U(c i+1 ))with θ(c i+1 , c i ) defined as in (2) and l i+1 > 0. We assume the agent discounts thefuture exponentially at rate δ. We also assume that l i δ < 1 in order to ensurethat the total utility of two consecutive periods is increasing in the consumption5 Note that we do not assume diminishing sensitivity, that is, that the agent is risk averse in gainsbut risk loving in losses.6 A proof can be found in the appendix.7 Other papers assume different processes for reference formation. For instance, Koszegi and Rabin(2006 and 2007) use the rational expectation of consumption. Gul (1991) takes the certainty equivalent.Chetty and Szeidl (2010) derive a reference that partly depends on past consumption in a model ofadjustment costs in consumption. The issue of reference formation is still an understudied problem.7

of the first. 8The principal is risk neutral and therefore, for any given outcome x i , her utilityis x i − ω i (x 0 , x 1 , . . . , x i ). Finally, we assume that the principal also discountsthe future exponentially at rate δ.Figure 1: Utility function for different levels of reference.8 Recall that utility is decreasing in the reference for consumption levels below the reference. Inaddition, we let l i to vary across periods to allow for generality. For instance, Harrison and List (2004)shows that loss aversion can be mitigated by market experience, suggesting that l i might decrease overtime.8

3 Full commitment and no access to credit marketsWe start by analyzing the optimal contracting problem under the assumptions offull commitment and no access to credit markets. That is, we assume that boththe principal and the agent are able to commit to the contract during the wholeduration of the relationship, and that the principal can borrow and save at thefixed interest rate 1 δ− 1, but that the agent can neither borrow nor save.Whenever the agent has no access to credit markets, he must consume his currentincome. Thus, in this case, the principal faces the following program,subject to∑ Tmax δ i E (x i − ω i |a 0 , a 1 , . . . , a i )(ω i(·)) i,(a i(·)) ii=0T∑i=0)δ i E(Ũi (ω i , R i ) − ψ i (a i )|a 0 , a 1 , . . . , a i ≥ U ∗(PC)a(·) = (a 0 , a 1 (·), . . . a T (·)) ∈ argmax a(·)T∑i=0)δ i E(Ũi (ω i , R i ) − ψ(a i )|a 0 , a 1 , . . . , a i(IC)where E(·|a 0 , a 1 , . . . , a i ) denotes an expectation given actions (a 0 , a 1 , . . . , a i ). 9The objective function represents the expected payment to the principal. Thefirst constraint (PC) is the standard participation constraint. It prescribes thatthe agent must obtain an expected utility of a at least U ∗ from the relationship.Constraint (IC) states that the effort chosen maximizes the expected utility of theagent, and is henceforth referred to as the incentive compatibility constraint.3.1 Optimality conditionsIn order to find optimality conditions, it is convenient to define a function h i (v 0 , v 1 , . . . , v i )that represents the cost to the principal of providing a level of utility v i in periodi whenever the utility provisions in the previous periods were {v 0 , v 1 , . . . , v i−1 }.The utility provision cost v i depends on the realized outcomes up to time i,(x 0 , x 1 , . . . , x i ). Note that because of the reference dependent preferences andthe dynamic update assumption, the provision of any given level of utility affectsthe shape of the agent’s utility in future periods.9 This expectation evaluated on an arbitrary function g(x 0 , . . . , x i ) is defined as,ˆE (g|a 0 , a 1 , . . . , a i ) = g(x 0 , x 1 , . . . , x i )f 0 (x 0 |a 0 )f 1 (x 1 |a 1 (x 0 )) · · · f i (x i |a i (x 0 , . . . , x i−1 ))dx 0 dx 1 . . . dx i .9

We can now rewrite the program the principal faces as choosing utility provisionsv i (x 1 , x 2 , . . . , x i ) contingent on the outcomes up to period i as follows,subject to∑Tmax(v i(·)) i,(a i)δ i E (x i − h i (v 0 , v 1 , . . . , v i )|a 0 , a 1 , . . . , a i ) (3)i=0T∑δ i (E (v i |a 0 , a 1 , . . . , a i ) − ψ i (a i )) ≥ U ∗i=0(PC’)a = (a 0 , a 1 (x 0 ), . . . a T (x 0 , x 1 , . . . , x T −1 )) ∈ argmax aT∑i=0δ i (E (v i |a 0 , a 1 , . . . , a i ) − ψ(a i ))h i (v 0 , v 1 , . . . , v i ) is an increasing and continuous function given by,( )vi+l iU(h i−1(v 0,v 1,...,v i−1))1+l i(IC’){U−1if v i < U(h i−1 (v 0 , v 1 , . . . , v i−1 ))h i (v 0 , v 1 , . . . , v i ) =U −1 (v i ) if v i ≥ U(h i−1 (v 0 , v 1 , . . . , v i−1 ))(4)where U(h −1 ) = R 0 .The following three properties determine the optimality conditions for the paymentscheme. Property 1 proves that the problem is indeed convex which allowsfor the use of subdifferential calculus in finding the conditions for the optimal contract.That is to say, we can write a Lagrangian as in the everywhere differentiablecase, and find the optimal contract such that zero belongs to the sub-gradient setof the Lagrangian. 10 Property 2 characterizes the sub-gradient set of the cost functionwhich is a crucial step in computing the subgradient set of the Lagrangian.Property 3 describes the optimality conditions.Property 1. [Convexity]**Under** the assumptions of the model, the utility provision cost functions h i :R i → R for i ∈ 1, . . . , T are strictly convex and therefore the optimization problemgiven by (3)-(PC’)-(IC’) has a strictly concave objective function and a convexfeasible set.Proof. See appendix.Property 2. [Subgradient set]10 Note that in the differentiable case we would need to find a zero of the differential with respectto the optimization variables given the multipliers. Since in this case the objective function is notdifferentiable, optimization can be attained by finding the sub-gradient set. A reference for these ideasis ?.10

The subgradient set 11 of h i (v 0 , v 1 , . . . , v i ) is given by⎛ ⎛∂h i (v 0 , v 1 , . . . , v i ) = ⎝ 1 i∏⎝U ′ (ω i )whereProof. See appendix.t=j+1⎞k t (x 0 , x 1 , . . . , x t )l t⎠1 + k t (x 0 , x 1 , . . . , x t )l t⎧⎨ {1} if ω t (x 0 , x 1 , . . . , x t ) < R tk t (x 0 , x 1 , . . . , x t ) ∈ [0, 1] if ω t (x 0 , x 1 , . . . , x t ) = R t⎩{0} otherwise⎞i1⎠1 + k j (x 0 , x 1 , . . . , x j )l jj=0(5)(6)Property 3. [Optimality conditions]There is a unique optimal wage schedule that solves the program faced by theprincipal and it is characterized by the following optimality conditions,(1U ′ (ω i (x 0 , x 1 , . . . , x i )) = (1 + k f i )ai(x 0 , x 1 , . . . , x i )l i ) λ i + µ i(x i |a i )if i +(x i |a i )ˆfa i+1−δl i+1 k i+1 (x 0 , x 1 , . . . , x i+1 )(λ i+1 +µi+1(x i+1 |a i+1 )i+1ω i+1≤ω if i+1 (x i+1 |a i+1 ) )f i+1 (x i+1 |a i+1 )dx i+1 .and(1U ′ (ω T (x 0 , x 1 , . . . , x T )) = (1 + k f T )aT (x 0 , x 1 , . . . , x T )l T ) λ T + µ T(x T |a i )Tf i (x i |a i )wheref k a k(x k |a k )• λ i = λ + ∑ i−1k=0 µ k , with λ the multiplier associated to (PC’) andf k (x k |a k )µ i = µ i (x 0 , . . . , x i−1 ) the multipliers associated to the incentive compatibilityconstraints.• The function k i (x 0 , x 1 , . . . , x i ) is associated to the kink in the utility functionand is given by⎧⎨ {1} if ω i (x 0 , x 1 , . . . , x i ) < R ik i (x 0 , x 1 , . . . , x i ) ∈ [0, 1] if ω i (x 0 , x 1 , . . . , x i ) = R i⎩{0} otherwiseProof. See appendix.(8)∀i < T(7)Note that whenever l i = 0 ∀i the optimality condition describes the solutionto the canonical case. In addition, the optimality condition for a spot contract isgiven by equation (8) when T = 0, as in ?. In the following subsection we describethe main intra and intertemporal properties of this optimal scheme.11 By definition we know that for a generic convex function f : R n+1 → R the subgradient set atx ∈ R n+1 is given by the set of vectors d = (d 0, d 1 , . . . , d n+1 ) ∈ R n+1 such that for any vector α ∈ R n+1f(x + α) ≥ f(x) + d · α11

3.2 Properties of the optimal payment schemeInspection of equations (7) and (8) imply that there are two main propertiesthat distinguish the shape of optimal payment scheme from the classical case.First, it can have flat segments. This is explained by the multiplicative term(1 + k i (x 0 , x 1 , . . . , x i )l) in (7) and (8). At the reference level, k i (x 0 , x 1 , . . . , x i )is allowed to take any value in [0, 1]. Therefore, the right hand side of equations(7) and (8) can remain constant in an interval: as x i increases, k i (x 0 , x 1 , . . . , x i )decreases and ω i (x 0 , x 1 , . . . , x i ) remains at the reference. Intuitively, the cost ofinducing effort by increasing payments right above the reference may be high dueto the discontinuous fall in marginal utility. Similarly, although effective in providingincentives, a reduction in payment just below the reference increases thecost of inducing participation, again due to the discontinuity in marginal utility.The second difference relates to the fact that the principal takes into accountthat each period’s payment affects the reference level of the following period. Thelast term of the right hand side of equation (7), which is strictly positive, representsthis effect. In each period i, this term tends to lower the payment schemeand to reduce its growth rate as x i increases. Intuitively, the term represents thebenefit of lowering the payment in the current period, in order to reduce the followingperiod’s reference and to increase the utility of the agent in the loss area.Naturally, this effect does not occur in T (equation (8)).On the basis of these optimality results, in the following property we show thatthe optimal wage schedule can be insensitive to outcomes in an interval. We thenexamine memory and the relationship between payments across any pair of consecutiveperiods.Property 4. [Shape of the optimal payment scheme] If l i−1 ≥ l i ≥ l i+1 andδl i ≤ 1, then ω i (x 0 , x 1 , . . . , x i ) is continuous and non-decreasing in x i and,1. For i ∈ {0, . . . , T }, ω i (x 0 , x 1 , . . . , x i ) is non-decreasing in x j for j ∈ {0, . . . , i}.2. For i ∈ {1, . . . , T }, if ω i−1 (x 0 , x 1 , . . . , x i−1 ) > R i−1 then for any value of(x 0 , x 1 , . . . , x i−1 ) it must be the case that ω i (x 0 , x 1 , . . . , x i ) = R i for someoutcome x i ∈ [x i , ¯x i ] . Furthermore the payment scheme has a flat segmentat the reference and therefore, ω i is not strictly increasing.3. For i ∈ {1, . . . , T −1}, if (l i −l i+1 )δ ≥ l i−1 −l i and ω i−1 (x 0 , x 1 , . . . , x i−1 ) ≤R i−1 then, for any value of (x 0 , x 1 , . . . , x i−1 ), ω i (x 0 , x 1 , . . . , x i ) = R i forsome outcome x i ∈ [x i , ¯x i ] . Furthermore the payment scheme has a flatsegment at the reference and, therefore, ω i is not strictly increasing.4. If ω T −1 (x 0 , x 1 , . . . , x T −1 ) ≤ R T −1 then, for any value of (x 0 , x 1 , . . . , x T −1 ),ω T (x 0 , x 1 , . . . , x T ) = R T for some outcome x T ∈ [x T , ¯x T ]. The paymentscheme has a flat segment at the reference but it cannot be fully flat in x T .Proof. See appendix.12

Figure 2: Schematic representation of monotonicity of contractsThe previous property states that the payment scheme is, as in the canonicalcase, non-decreasing. However, starting in period 1 the reference wage must bepaid for an interval of outcomes. Figure 3.2 is an schematic representation of themonotonicity of possible payment schemes. According to Property 4, panels (a),(b) and (c) are possible in all periods. Panel (d) is possible in periods {0, . . . , T −1}. Flat schedules are not optimal in period T . Two elements distinguish periodT . One refers to the fact that consumption in T does not affect the reference offuture periods, so the effect of a high wage in T on the cost of providing utility islimited. The other is that there is a positive probability that the agent perceivesa fixed wage over T − 1 periods. In that case, all incentives must be deferredto the last period with schemes that are at least partly sensitive to outcomes.Macera (2009) finds a similar result in a two-period model. Finally, panel (e) ispossible in period 0 only. In period 0, the payment scheme may or may not offerthe reference level of consumption for some outcome.Next we analyze how the payment scheme depends on the history of outcomes.Just as in the canonical model, consumption smoothing requires that a higherpayment in one period results in a higher payment in all subsequent periods. Inthe classical model the implication is that wage schedules are strictly increasing inevery realized outcome. However, in our model, wage schedules may overlap foroutcomes that pay the reference level. This result is summarized in the followingproperty.Property 5. [Dependence across periods]Let x ′ i < x′′ i two possible outcomes in period i.13

Figure 3: Optimal contracts, outcomes between two consecutive periods1. If ω i (x 0 , x 1 , . . . x ′ i ) = ω i(x 0 , x 1 , . . . x ′′i ) = R i thenω j (x 0 , x 1 , . . . x ′ i, x i+1 , . . . , x j ) ≤ ω j (x 0 , x 1 , . . . x ′′i , x i+1 , . . . , x j ) ∀j > i ∀x j ∈ [x j , x j ]2. If ω i (x 0 , x 1 , . . . x ′ i ) < ω i(x 0 , x 1 , . . . x ′′i ) thenω j (x 0 , x 1 , . . . x ′ i, x i+1 , . . . , x j ) < ω j (x 0 , x 1 , . . . x ′′i , x i+1 , . . . , x j ) ∀j > i ∀x j ∈ [x j , x j ]Proof. It follows directly from (7) and (8) since µ i > 0∀i.Property 5 implies that panels (a) and (b) in Figure 3.2 are possible in ourmodel. Note also that only (c) is possible in the canonical model. This propertyimplies in turn that there is a positive probability that wages exhibit time persistence,even if realized outcomes differ over time.Next, we find an analogue to the relationship between the wage schedulesoffered in any two consecutive periods as was first derived by Rogerson (1985)under classical assumptions. In the classical case, the inverse of the marginalutility of income must equal the conditional expected value of the inverse of thenext period’s marginal utility of income. This condition is no longer valid inour model. However, an extended condition can be derived as is stated in thefollowing property. Let ω i (x 0 , x 1 , . . . , x i ) be denoted ω i (x i ) to simplify notation,and similarly let k i (x i ) denote k i (x 0 , x 1 , . . . , x i ).Property 6. [Relationship between two consecutive periods] The following relationshipbetween two consecutive periods is fulfilled,ˆ11U ′ =(ω i−1 (x i−1 ))(1 + k i−1 (x i−1 )l i−1 ) U ′ (ω i (x i ))(1 + k i (x i )l i ) f i (x i |a i )dx i + c(x i−1 )14

whereˆ (l i δc(x i−1 ) = −k i (x i )1 + k i−1 (x i−1 )l i−1ˆ ˆ (ki+1 (x i+1 )l i+1 δ1 + k i (x i )l iλ i + µ if i a i(x i |a i )f i (x i |a i )Proof. Follows directly from equations (7) and (8).)f i (x i |a i )dx i + (9)f i+1)a i+1(x i+1 |a i+1 )λ i+1 + µ i+1f i+1 f i+1 (x i+1 |a i+1 )f i (x i |a i )dx i+1 dx i(x i+1 |a i+1 )Property 6 implies that the inverse of the marginal utility of income might begreater or smaller than the conditional expectation of the marginal utility. 12This relationship between any two adjacent periods implies in the classical casethat the optimal contract front-loads consumption; i.e., if allowed to save or borrowafter the realization of the current period outcome, the agent will choose tosave. In our model, however, this will not necessarily be the case: the agent mayhave incentives to save, borrow or consume her full allocation ex-post. Furthermore,he may face utility losses by either saving or by borrowing and thus mightbe inclined to consume his full allocation.Intuitively, assume the loss averse agent faces the possibility of reallocating resourcesbetween any two consecutive periods, i and i + 1. If he borrows, the effecton current marginal utility depends on whether today’s income is over or underthe reference. Due to loss aversion, whenever the current payment is at or overthe reference, the gain in utility is small relative to whenever current income isbelow the reference. Similarly, the effect on tomorrow’s marginal utility might berelatively high if because of previous period’s borrowing, consumption falls underthe reference. These effects may make borrowing unattractive to the agent. Reinforcingthis effect, this intertemporal reallocation of resources increases periodi + 1 reference, reducing utility and increasing the marginal utility of consumptionin the loss area. A similar argument applies to the decision of saving. Thusthe possibility of facing marginal losses that may be larger than marginal gains,jointly with the intertemporal effects on the reference, imply that the agent willface situations in which he would face a loss in utility by either saving or borrowing.He will thus decide to consume the full allocation.This discussion is an application of the “status quo bias” as described by ?. Arelated phenomenon is that this situation may lead to a gap between willingnessto pay (WTP) and willingness to accept (WTA). Assume the agent is in a situationin which he would lose by either saving or borrowing at the market interestrate 1/δ − 1. However, if given the possibility of lending a part of his income atinterest rate r l or borrowing at interest rate r b , indifference between lending and12 There is some abuse of language here since we refer as marginal utility of Ũ i to the termU ′ (ω i (x i ))(1 + k i (x i )l i ) since both quantities are equal for all incomes except the reference. Atthe reference level the marginal utility is not computable and could take any value between[U ′ (ω i (x i )), U ′ (ω i (x i ))(1 + l i )].15

orrowing would require r l > 1 δ − 1 > r b. That is, the smallest price at which heis willing to lend is strictly greater than the largest price he is willing to pay forborrowing. This is not the case in the canonical model since indifference betweenlending and borrowing for any set of payment schemes in any two consecutive periodscan be attained at a single interest rate. This gap between WTA and WTPhas been described in other economical contexts (?). Moreover, if the marketoffers interest rates such that r b ≥ r l , then the agent will find himself inclined toconsume his allocation.An important consequence is that under our assumptions the optimal paymentscheme does not always require constrained savings, a problematic property of thecanonical model (Chiappori et al., 1994). In particular, renegotiation proofness inthe canonical case relies on the assumption of constrained savings, either becausecredit is not available or because the principal can monitor the agent’s actions inthe credit market. If the agent has access to the credit market but his savings cannotbe contracted upon, then the full commitment long-term optimum is ex-postinefficient. Moreover, the renegotiation-proof long-term contract cannot provideincentives to implement any effort above the minimum. Since it is unlikely that acourt of law would prevent renegotiation towards a Pareto-improving agreementand since constraining savings may be implausible in most contexts, the classicaltheory cannot explain the existence of long-term commitment contracts (Chiapporiet al., 1994). Thus loss aversion and our assumed dynamic update of thereference might give a rationale for the existence of commitment contracts.A formalization of this discussion is presented in the following properties.Property 7. [Status quo bias]• If the payment in period i is y and is at the reference and the payment inperiod i + 1 is constant such that y i+1 (x i+1 ) = y ∀x i+1 ∈ [x i+1 , ¯x i+1 ],then the agent will neither want to save nor borrow at the rate 1 δ− 1. Moreover,if r l is the rate that makes the net marginal utility of saving equal tozero and r b the rate that makes the net marginal utility of borrowing equalto zero, then r l > 1 δ − 1 > r b.• Let y i be the payment in period i and y i+1 (x i+1 ) the payment scheme inperiod i + 1. If y i+1 pays the reference with positive probability or y i is atthe reference, then the rate r l that makes marginal utility of saving equalto zero is strictly greater than the rate r b that makes the marginal utility ofborrowing equal to zero.Proof. See appendix.Property 8. [Intertemporal allocation of resources] If the interest rate is 1 δ − 1in period i then the agent may have incentives to save for period i + 1, to borrowand pay back in period i + 1 or to consume exactly her income depending on theparameters of the problem. Moreover,16

• If period’s i + 1 payment scheme is over the reference for all results, then theagent does not have incentives to save in period i.• If period’s i + 1 payment scheme is below the reference for all outcomes inperiod i + 1 then– if period’s i payment is strictly above the reference then the agent hasincentives to save in period i.– If period’s i payment is at the reference, the agent will not have incentivesto borrow in period i.– If period’s i payment is strictly below the reference the agent may haveincentives to save, to borrow or to consume her allocation.Proof. See appendix.The previous property states, among other things, that if the payment schemeis flat for low outcomes and then increasing for larger outcomes, then the agentwill not have incentives to save. In the following subsection we show that thepayment scheme will not take values under the reference as long as the cost tothe principal of the participation constraint λ is sufficiently high.3.3 The shape of the optimal contract and the shadowcost of participationIn a three period context we analyze how the optimal payment scheme changesif the cost for the principal of the participation constraint were to change. Weanalyze this case by studying the effects of a change in the multiplier λ.Property 9. [Cost of (PC) and shape of optimal contract]There is a value ¯λ such that if λ ≥ ¯λ the optimal payment scheme is strictlyabove the reference, for all realizations, in each period (letting the other multipliersbe fixed).Proof. See appendix.Recall that λ is the shadow cost of relaxing the participation constraint. Thisproperty says that if the participation constraint is sufficiently costly then thecontract must be backloaded. It suggests that the cost of the participation constraintto the principal is closely related to whether incentives are to be createdthrough rewards or punishments. This result is highly intuitive: providing incentivesthrough the threat of payments below the reference creates a great loss inutility for an agent whose participation is already very costly. The principal isbetter off thereby providing incentives through rewards only.17

3.4 Renegotiation-proofness and spot-implementabilityThe optimal contract scheme is renegotiation-proof just as in the classical case.As a matter of fact, ? has shown that the renegotiation-proofness property doesnot rely on the differentiability of the utility function. **Under**lying this result isthe assumption that the agent is able to predict how her utility updates in eachperiod. If this were not the case, the property might not hold.However, the optimal sequence of spot contracts exhibit memory, unlike the classicalcase, because of our assumption on the dynamic update of the referencelevel. The utility function of the agent changes from period to period with thereference, and so may the reservation utility. In order for the full-commitmentcontract with no access to credit markets to be implemented by a sequence of spotcontracts, the update of the reservation utility after each outcome must exactlycorrespond to the continuation payoff that the agent expects under this contract.Thus, the optimal commitment contract may not be implemented by a sequenceof spot contracts.Property 10. [Optimal spot contracting]The optimal sequence of spot contracts exhibits memory and it will not implementthe full-commitment optimum in general.This result follows by backwards induction as in ?.4 Monitorable access to creditWe now move to the case when the agent can reallocate resources intertemporallyby borrowing or saving through the credit market. We further assume that theagent’s trades in the credit market can be monitored, so savings can be contractedupon. The program that the principal faces is then the following,∑ Tmaxδ i E (x i − ω i (x 0 , x 1 , . . . , x i ) − S i (x 0 , x 1 , . . . , x i )|a 0 , a 1 , . . . , a i )(ω i (·)) i (a i ) i ,(s i ) i (S i ) ii=0subject toT∑i=0( ) )δ i E(Ũi (ω i (x 0 , x 1 , . . . , x i ) − s i (x 0 , x 1 , . . . , x i ), c i−1 )|a 0 , a 1 , . . . , a i − ψ i (a i ) ≥ U ∗(PC)(a 0 , a 1 (x 1 ), . . . a T (x 0 , x 1 , . . . , x T )) ∈ argmax−→ a T ∑i=0( ) )δ i E(Ũi (ω i (x 0 , x 1 , . . . , x i ) − s i (x 0 , x 1 , . . . , x i ), c i−1 )|a 0 , a 1 , . . . , a i − ψ(a i )where s i are the agent’s accumulated savings in period i; that is, the netsavings of the agent in period i once the endowment derived from previous savingsis taken into account. Similarly, S i are the accumulated savings of the principal.It is easy to see that the previous program is equivalent to one in which theoptimization variables are the consumptions of the agent in each period, given(IC)18

y c i (x 0 , x 1 , . . . , x i ) = ω i (x 0 , x 1 , . . . , x i ) − s i (x 0 , x 1 , . . . , x i ), and the aggregateaccumulated savings, given by s i + S i , and constraint s T = − s T −1δ. Since theprincipal is assumed risk neutral, the optimality conditions are similar to (10)and (8) with c i replacing reward ω i . Therefore, the optimality conditions forconsumptions with monitorable access to credit are as follows,(1U ′ (c i (x 0 , x 1 , . . . , x i )) = (1 + k f i )ai(x 0 , x 1 , . . . , x i )l i ) λ i + µ i(x i |a i )if i +(x i |a i )ˆfa i−δl i+1 k i+1 (x 0 , x 1 , . . . , x i+1 )(λ i+1 +µi+1(x i+1 |a i+1 )i+1f i (x i+1 |a i+1 ) )f i (x i+1 |a i+1 )dx i+1 .and(1U ′ (c T (x 0 , x 1 , . . . , x T )) = (1 + k f T )aT (x 0 , x 1 , . . . , x T )l T ) λ T + µ T(x T |e i )Tf i (x i |e i )∀i < T(11)(10)where k i (x 0 , x 1 , . . . , x i ) is given by,⎧⎨ {1} if ω i (x 0 , x 1 , . . . , x i ) < R ik i (x 0 , x 1 , . . . , x i ) ∈ [0, 1] if ω i (x 0 , x 1 , . . . , x i ) = R i⎩{0} otherwiseThis result is similar to what is obtained in the classical case: there is astrong relationship between the monitorable access to credit case and the fullcommitmentwith no credit access case. Furthermore, just like in the classicalcase, monitoring borrowing and savings introduces memory to the principal-agentrelationship and therefore the optimal long-term contract will be spot-contractible.These results are summarized in the following property.Property 11. [Spot contractibility under monitorable credit]Suppose the reservation utility U ∗ i (s i−1, R i ) in period i depends on the savingsthat the agent has in period i − 1 and on the reference level R i , and that it iscontinuous, increasing in s i−1 . Then, under monitorable savings, the long-termoptimal contract is spot contractible.Proof. See appendix.5 A two period exampleIn this section we numerically compute the optimal payment scheme in a twoperiod setting in order to illustrate the forms these optimal contracts can take.We assume the distribution function of outcomes x i ∈ [0, 1] in period i ∈ {1, 2},for effort level a j ∈ {a L , a H } is a triangular function given byf i (x i |a j ) ={ 2xia jx i ≤ a j2(1−x i )(12)1−a jx i > a j19

Note that in this case f i a i(x i |a i )f i (x i |a i )may not be strictly increasing with respect tooutcomes x i . We also assume a H = 1 and a L = 0.1 in which case f a i i(x i |a i )f i (x i |a i )constant in [0, 0.1].We assume that utility is described by the function U(Y ) = √ Y and thereforeŨ i (Y i , R i ) = √ Y i − θ(Y i , R i )l i ( √ R i − √ Y i )isThe optimality conditions are,(1U ′ (ω 0 (x 0 )) = 2√ f 0 )aω 0 (x 0 ) = (1 + k 0 (x 0 )l 0 ) λ 0 + µ 0(x 0 |a 0 )0f 0 +(x 0 |a 0 )ˆfa 1 − δl 1 k 1 (x 0 , x 1 )(λ 1 + µ 1(x 1 |a 1 )1ω 1≤ω 0f 1 (x 1 |a 1 ) )f 1 (x 1 |a 1 )dx 1 . (13)1U ′ (ω 1 (x 0 , x 1 ))= 2 √ (f 1 )aω 1 (x 0 , x 1 ) = (1 + k 1 (x 0 , x 1 )l 1 ) λ 1 + µ 1(x 1 |a 1 )1f 1 (x 1 |a 1 )(14)5.1 Case 1: First period payment independent of outcomesThe first example illustrates a case in which the first period payment does notdepend on the outcomes that take place in that same period (see Figure 5.1).That is, the first period payment is constant at the reference level. The secondperiod payment scheme is contingent on outcomes obtained on the first and secondperiods as depicted in Figure 5.1. 13 The values of parameters used in thissimulation are given in the following table, along with the multipliersl 0 l 1 a H a L 1/U ′ (R 0 ) λ µ 0 µ 11 1 1 0.1 37 46.1 0.5 2Note that the second period scheme falls below the reference if low outcomesare realized in the first period. For outcomes in the first period that are greaterthan a threshold, the agent faces a payment scheme in the following period thatis greater or equal than the payment received in the first period. Consequently,according to Property 8, he will not have incentives to save for outcomes above athreshold.13 Note that the flat segments for small values of first and second period outcomes are due to thefact that f i a i(x i|a i)f i (x i|a i)is constant in [0, 0.1].20

Figure 4: Case 1. First Period Payment SchemeFigure 5: Case 1. Second Period Payment Scheme21

Figure 6: Case 2. First Period Payment Scheme5.2 Case 2: First period payment greater or equal thanR 0The next example illustrates a case in which the first period payment reaches thereference for low values of the first period outcome (see Figure 5.2). The secondperiod payment scheme is shown in Figure 5.2. The values of parameters used inthis simulation are given in the following table, along with the multipliersl 0 l 1 a H a L 1/U ′ (R 0 ) λ µ 0 µ 11 1 1 0.1 15 30.1 1 15.3 Case 3: Second period payment over reference forall outcomesThis example illustrates a case in which payments are over the reference for alloutcomes in the first and second periods (see Figure 5.3). The parameters usedare the same as in Case 2, except that the cost for the principal of inducingparticipation is higher, which is reflected in a higher λ (Property 9). The secondperiod payment scheme is contingent on outcomes obtained on both the first andsecond periods as shown in Figure 5.3. According to Property 8, in this case theagent does not have incentives to save for any possible outcome realization in thefirst period. The payment schemes shown in Figures 5.3 and 5.3 are efficient,22

Figure 7: Case 2. Second Period Payment Schemerenegotiation-proof and they implement the high level of effort in both periodsif the agent is restricted to borrow. The values of the parameters used in thissimulation are given in the following table, along with the multipliers.l 0 l 1 a H a L 1/U ′ (R 0 ) λ µ 0 µ 11 1 1 0.1 15 40.1 1 16 Conclusions and Final RemarksIn this paper we extend the dynamic moral hazard principal-agent model firstderived by Rogerson (1985) to allow for an agent who is loss averse and whosereference updates according to previous consumption. We analyze the optimalcontracting problem under two scenarios. In the first one the agent has no accessto credit markets and thus needs to rely on the principal to transfer resourcesover time. In the second one the agent has access to credit but the principal canmonitor her savings.When the agent has no access to credit markets and is forced to consume herearnings, we find that the optimal payment scheme can have flat segments at23

Figure 8: Case 3. First Period Payment SchemeFigure 9: Case 3. Second Period Payment Scheme24

the reference; i.e., the wage may be insensitive to outcomes in an interval. Thisproperty implies, in turn, that there is a positive probability of observing constantwages over time, even though the contract scheme displays memory –it dependson the full history of outcomes–. Moreover, the model predicts a “status quo bias”whenever the agent is allowed to borrow or save after the outcome is realized,whereas in the canonical model, if anything, he would like to save for future periods.In other words, there is a gap between the interest rates at which the agentis willing to lend and borrow ex-post.We also show that although the optimal contract scheme is renegotiation-proof,it cannot be implemented by a sequence of spot contracts because of the dynamicupdate process we assume for the reference. However, when the agent has accessto the credit market and the principal can monitor savings, then the long-termoptimal contract is spot contractible.In sum, this paper shows that many of the properties of the classical modelhold under our assumptions –consumption smoothing, memory and renegotiationproofness. Moreover, our model predicts new features of the optimal schemethat may help explain many of the facts described by the empirical literature onlabor contracts. First it might explain why observed contracts are fairly simplerthan those predicted by the classical theory. In fact, many authors have calledattention to the simplicity of actual contracts compared to those derived by thetheoretical literature (Chiappori and Salanie, 2000; Salanie, 2003; Bolton and Dewatripont,2005). Our model might also explain why real wages are persistentover time (Dickens et al., 2007) and why incentives tend to be deferred to thefuture (Baker, Jensen and Murphy, 1988; Baker, Gibbs and Holmstrom, 1994).That is, our model predicts features of the optimal contract that are better inline with the empirical findings, while at the same time conserving many of theproperties predicted by the classical model.Future research should analyze the robustness of our results to a number ofassumptions. In particular, a related literature on loss averse preferences hasassumed different reference formation processes. In addition, an interesting generalizationof our model is to allow for a loss averse principal. In this case, weexpect the agent and the principal to protect each other against losses wheneverthe other party’s reference point is reached.7 Appendix7.1 Ũ 0 expressed in terms of U and l 0Let’s prove that Ũ0 can be expressed asŨ 0 (c 0 , R 0 ) = U(c 0 ) − l 0 θ(c 0 , R 0 ) (U(R 0 ) − U(c 0 ))25

Define{U(c 0 ) =Ũ 0 (c 0 , R 0 ) if c 0 ≥ R 0Ũ 0 (c 0 , R 0 )L + (1 − L)Ũ(R 0, R 0 ) if c 0 ≤ R 0where L =′ +(R0 Ũ0 )Ũ 0′ −(R0)derivative of Ũ 0 at R 0 .with ′ +Ũ0 (R0 ) and ′ Ũ0−(R0 ) denoting the right and leftU as defined is continuous and differentiable with U ′ (R 0 ) = Ũ0′ +(R0 ). It isconcave since we assume Ũ0(c 0 , R 0 ) concave in c 0 .Define l 0 = 1−LL, then if c 0 ≤ R 0 we haveU(c 0 ) − l 0 θ(c 0 , R 0 ) (U(R 0 ) − U(c 0 )) =Ũ 0 (c 0 , R 0 )L+(1−L)Ũ(R 0, R 0 )− 1 − LL(Ũ0 (R 0 , R 0 ) − Ũ0(c 0 , R 0 )L − (1 − L)Ũ(R 0, R 0 ))=Ũ 0 (c 0 , R 0 )L + (1 − L)Ũ(R 0, R 0 ) − 1 − L)(Ũ0L L (R 0 , R 0 ) − Ũ0(c 0 , R 0 ) = Ũ0(c 0 , R 0 )7.2 Proof of property 1Let’s see that h i (v 0 , v 1 , . . . , v i ) is strictly convex. Since U is strictly increasingwe can write, h i (v 0 , v 1 , . . . , v i ) = U −1 (U(h i (v 0 , v 1 , . . . , v i ))), we prove thatU(h i (v 0 , v 1 , . . . , v i )) is strictly convex and increasing and we conclude by the strictconvexity of U −1 (implied by the strict concavity of U). 14 Let v i = (v 0 , . . . , v i ) andv ′i = (v 0 ′ , . . . , v′ i ) be two utility provision vectors. Denote vi−1 = (v 0 , v 1 , . . . , v i−1 ).By the definition of convexity, we need to prove,U(h i (λ(v 0 , . . . , v i )+(1−λ)(v 0, ′ . . . , v i))) ′ < λU(h i (v 0 , v 1 , . . . , v i ))+(1−λ)U(h i (v 0, ′ v 1, ′ . . . , v i)) ′ ∀λ ∈ (0, 1)(15)Note that for i = 0, by (4) U(h(v 0 )) is linear by parts, increasing and convex1(derivative for v 0 < R 0 is1+l 0< 1 and for v 0 > R 0 it is 1.). Let’s prove (15)assuming true for i − 1.If λv i + (1 − λ)v i ′ < U(h i−1(λ(v 0 , . . . , v i−1 ) + (1 − λ)(v 0 ′ , . . . , v′ i−1 ))) the utilityprovision of period i of the convex combination is in the loss area and we have()U(h i (λv i + (1 − λ)v ′ i λv i + (1 − λ)v ′)) =i + l iU(h i−1 (λv i−1 + (1 − λ)v ′ i−1 ))1 + l i≤(vi + l i U(h i−1 (v i−1 ))λ(1 − λ)1 + l i)+()v i ′ + l iU(h i−1 (v ′ i−1 ))1 + l i≤ λU(h i (v i )) + (1 − λ)U(h i (v ′ i−1 ))14Note that the composition of a convex increasing function with a convex function is convex.26

The first inequality is implied by the induction hypothesis ( and the second is ) justifiednoting that if v i > U(h i−1 ) then U(h i (v i )) = v i >vi +l i U(h i−1 (v i−1 ))1+l iand(if v i ≤ U(h i−1 ) then U(h i (v i )) =vi +l i U(h i−1 (v i−1 ))1+l i).A similar argument proves (15) for the case in which λv i +(1−λ)v ′ i ≥ U(h i−1(λ(v 0 , . . . , v i−1 )+(1 − λ)(v ′ 0 , . . . , v′ i−1 ))).U(h i (λ(v 0 , . . . , v i ) + (1 − λ)(v ′ 0, . . . , v ′ i))) = λv i + (1 − λ)v ′ i ≤ λU(h i (v i )) + (1 − λ)U(h i (v ′ i−1 ))Where ( the last inequality ) is justified by noting v iv i ≤vi +l i U(h i−1 (v i−1 ))1+l i= U(h i (v i )).< U(h i−1 (v i−1 )) impliesFinally, the constraints are linear in v i (x 0 , x 1 , . . . , x i ) for each i and, therefore,convexity of the feasible set is straightforward7.3 Proof of property 2We haveh i (v 0 , v 1 , . . . , v i ) = U −1 (U(h i (v 0 , v 1 , . . . , v i )))where{v i if v i ≥ U(h i−1 (v 0 , v 1 , . . . , v i−1 )U(h i (v 0 , v 1 , . . . , v i )) = v i+l iU(h i−1(v 0,v 1,...,v i−1))1+l iif v i < U(h i−1 (v 0 , v 1 , . . . , v i−1 )(16)By Proposition 4.2.5 in ? we know that 15.∂h i (v 0 , v 1 , . . . , v i ) = ( U −1)′ ((U ◦ h i ) (v 0 , v 1 , . . . , v i ))) · ∂ ((U ◦ h i ) (v 0 , v 1 , . . . , v i ))Now, note that from { (16) we have U(h i (v 0 , v 1 , . . . , v i )) = F i ((U ◦ h i−1 ) (v 0 , v 1 , . . . , v i−1 ), v i )yif y ≥ xwhere F i (x, y) =y+l i x1+l iif y < xLet (d 0 , . . . , d i−1 ) ∈ ∂ (U ◦ h i−1 ) (v 0 , v 1 , . . . , v i−1 ) and ( ˜d 0 , ˜d 1 ) ∈ ∂F i ((U ◦ h i−1 ) (v 0 , v 1 , . . . , v i−1 ), v i )let’s see that that (d 0 · ˜d 0 , d 1 · ˜d 0 , . . . , d i−1 · ˜d 0 , ˜d 1 ) ∈ ∂ (U ◦ h i ) (v 0 , v 1 , . . . , v i ). Infact, we have(U ◦ h i ) (v 0 + α 0 , v 1 + α 1 , . . . , v i + α i ) =F i ((U ◦ h i−1 ) (v 0 + α 0 , . . . , v i−1 + α i−1 ), v i + α i ) ≥ F i ((U ◦ h i−1 ) (v 0 , v 1 , . . . , v i−1 ) + d 0 α 0 + · · · + d i−1 α i−1 , v i + α i )15 A vector d ∈ IR n is a subgradient of f at a point x ∈ IR n , denoted d ∈ ∂f(x), if≥F i ((U ◦ h i−1 ) (v 0 , v 1 , . . . , v i−1 ), v i ) + d 0 ˜d0 α 0 + · · · + d i−1 ˜d0 α i−1 + ˜d 1 α if(z) ≥ f(x) + (z − x) ′ d∀z ∈ IR n27

Where the first inequality is due to (d 0 , . . . , d i−1 ) ∈ ∂ (U ◦ h i−1 ) (v 0 , v 1 , . . . , v i−1 )and F i increasing in its first variable. The second inequality is implied by( ˜d 0 , ˜d 1 ) ∈ ∂F i ((U ◦ h i−1 ) (v 0 , v 1 , . . . , v i−1 ), v i )Let’s now see that the reverse is also true. That is, we show that an element of∂ (U ◦ h i ) (v 0 , v 1 , . . . , v i ) can be written as (d 0 · ˜d 0 , d 1 · ˜d 0 , . . . , d i−1 · ˜d 0 , ˜d 1 ) with(d 0 , . . . , d i−1 ) ∈ ∂ (U ◦ h i−1 ) (v 0 , v 1 , . . . , v i−1 ) and ( ˜d 0 , ˜d 1 ) ∈ ∂F i ((U ◦ h i−1 ) (v 0 , v 1 , . . . , v i−1 ), v i )Let’s compute ∂F i (x, y). If x ≠ y F i is differentiable and therefore its subgradientset coincides with the derivative.Otherwise, y = x and the elements of the subgradient set of ∂F i (x, y) will bethe pairs ( ˜d 0 , ˜d 1 ) such that,F i (x + α 0 , y + α 1 ) ≥ F (x, y) + α 0 ˜d0 + α 1 ˜d1 ∀α 0 , α 1 ∈ R (17)If α 0 ≤ α 1 then x + α 0 ≤ y + α 1 and (17) becomesy + α 1 ≥ y + α 0 ˜d0 + α i ˜d1⇐⇒ (1 − ˜d 1 )α 1 ≥ ˜d 0 α 0which is true for all α 1 ≥ α 0 if and only if (1 − ˜d 1 ) = ˜d 0 > 0. 16If α 0 > α 1 then x + α 0 > y + α 1 and (17) becomesy + α 1 + l i (x + α 0 )(1 + l i )⇐⇒≥ y + α 0 ˜d0 + α 1 ˜d1( ) (li−1 + l ˜d 0 α 0 ≥ ˜d 1 − 1 )i 1 + l iwhich is true for all α 0 < α 1 if and only if1+l i− ˜d) ( )0 = ˜d1 − 11+l i> 0.Therefore, summarizing we have established that[[ ]( ˜d 0 , ˜d 1 ) ∈ ∂F (x, x) =⇒ ˜d 0 ∈ 0,and ˜d 0 = 1 − ˜d 1]l i,1 + l ˜d 1 ∈i(li11 + l i, 1α 1And therefore, we can write,⎧⎨( ki l i∂F i (x, y) =,⎩ 1 + l i k i)1; where k i (x 0 , x 1 , . . . , x i ) ∈1 + l i k i⎧⎨⎩{1} if ω i (x 0 , x 1 , . . . , x i ) < R i[0, 1] if ω i (x 0 , x 1 , . . . , x i ) = R i{0} otherwiseSuppose that ( ¯d 0 , . . . , ¯d i ) ∈ ∂ (U ◦ h i ) (v 0 , . . . , v i ), let’s see that ( ¯d 0 , ¯d 1 , . . . ¯d i−1 , ¯d i ) =(d 0· ˜d 0 , d 1· ˜d 0 , . . . , d i−1· ˜d 0 , ˜d 1 ) for some vectors (d 0 , . . . , d i−1 ) ∈ ∂ (U ◦ h i−1 ) (v 0 , . . . , v i−1 )and16 Note that α 0 and α 1 can take negative values.⎫⎬⎭28

( ˜d 0 , ˜d 1 ) ∈ ∂F i ((U ◦ h i−1 ) (v 0 , . . . , v i−1 ), v i ).Note that by the F i ((U ◦ h i−1 ) (v 0 , . . . , v i−1 ), v i +α i ) ≥ F i ((U ◦ h i−1 ) (v 0 , . . . , v i−1 ), v i )+α i ¯di ∀α i implies we must have ¯d 1i =1+l i k iwith k i defined by (6) (subgradientset in one variable). We know that in points in which F i is differentiableits subgradient set coincides with the derivative which will be (0, 1) if(U ◦ h i−1 ) (v 0 , . . . , v i−1 ) < v i and ( l i 11+l i,1+l i) if (U ◦ h i−1 ) (v 0 , . . . , v i−1 ) > v i .Therefore, ( from Proposition 4.2.5 ? we must have that ∂U(h i (v 0 , . . . , v i )) =(1 − ¯di ) · ∂ (U ◦ h i−1 ) (v 0 , . . . , v i−1 ), ¯d)i .If F is not differentiable we have (U ◦ h i−1 ) (v 0 , . . . , v i−1 ) = v i . Let (α 0 , α 1 , . . . , α i−1 ) ∈R i , we define ˆα i = (U ◦ h i−1 ) (v 0 + α 0 , . . . , v i−1 + α i−1 ) − v i .Note that ∂U(h i (v 0 , . . . , v i )) = F ((U ◦ h i−1 ) (v 0 + α 0 , . . . , v i−1 + α i−1 ), v i + ˆα i ) =v i + ˆα i . In fact, this is straightforward if ˆα i ≥ 0. If ˆα i < 0 then F ((U ◦ h i−1 ) (v 0 +α 0 , . . . , v i−1 + α i−1 ), v i + ˆα i ) = v i+ˆα i +l i (ˆα i +v i )(1+l i )= v i + ˆα i .Since ( ¯d 0 , . . . , ¯d i ) in ∂U(h i (v 0 , . . . , v i )) we havev i + ˆα i ≥ v i + ¯d 0 α 0 + · · · + ¯d i−1 α i−1 + ¯d i ˆα i=⇒ ˆα i (1 − ¯d i ) ≥ ¯d 0 α 0 + · · · + ¯dα i−1=⇒ (U ◦ h i−1 ) (v 0 + α 0 , . . . , v i−1 + α i−1 ) − (U ◦ h i−1 ) (v 0 , . . . , v i−1 ) ≥ ( ¯d0 α 0 + · · · + ¯d) 1i−1 α i−1(1 − ¯d i )=⇒ ( ¯d 0 , . . . , ¯d 1i−1 )(1 − ¯d i ) ∈ ∂ (U ◦ h i−1) (v 0 , . . . , v i−1 )We conclude for (d 0 , . . . , d i−1 ) = ( ¯d 0 , . . . , ¯d i )1(1− ¯d i ) ∈ ∂ (U ◦ h i−1) (v 0 , . . . , v i−1 )and ( ˜d 0 , ˜d 1 ) = ( (1 − ¯d i ), ¯d i)∈ ∂F ((U ◦ hi−1 ) (v 0 , . . . , v i−1 ), v i ) that ( ¯d 0 , ¯d 1 , . . . ¯d i−1 , ¯d i ) =(d 0 · ˜d 0 , d 1 · ˜d 0 , . . . , d i−1 · ˜d 0 , ˜d 1 ).We may now deduce inductively ∂U(h i (v 0 , . . . , v i )). For the functions k i definedby (6) we have{ }1∂U(h 0 (v 0 )) =1 + k 0 l 0thereforeand∂U(h 2 (v 0 , v 1 , v 2 )) =∂U(h 1 (v 0 , v 1 )) =(k2 l 21 + l 2 k 2·and inductively, (5) is obtained.∂h i (v 0 , v 1 , . . . , v i ) =⎛⎝ 1U ′ (ω i )⎛⎝(k1 l 11 + l 1 k 1·k 1 l 11 + l 1 k 1·i∏t=j+111 + k 0 l 0,11 + k 0 l 0,)11 + l 1 k 1k 2 l 21 + l 2 k 2·11 + k 1 l 1,)11 + l 2 k 2⎞⎞ik t (x 0 , x 1 , . . . , x t )l t⎠1⎠1 + k t (x 0 , x 1 , . . . , x t )l t 1 + k j (x 0 , x 1 , . . . , x j )l jj=0(18)29

7.4 Proof of property 3We assume that the principal is looking for optimal utility provisions v = (v i (·)) T i=0in the function spaces ( L 1 ([x 0 , ¯x 0 ] × [x 1 , ¯x 1 ] × · · · × [x i , ¯x i ]) ) T. Consider the followingfunctionsi=0T∑f 0 (v 0 , v 1 , . . . , v T ) = δ i E (x i − h i (v 0 (x 0 ), v 1 (x 0 , x 1 ), . . . , v i (x 0 , x 1 , . . . , x i ))|a 0 , a 1 , . . . , a i )i=0g 0 (v 0 , v 1 , . . . , v T ) =T∑δ i (E (v i (x 0 , x 1 , . . . , x i )|a 0 , a 1 , . . . , a i ) − ψ i (a i )) − U ∗i=0h i (v 0 , v 1 , . . . , v T ) =T∑δ j (∆ ai E (v j (x 0 , x 1 , . . . , x j )|a i , . . . , a j )) − ∆ϕ aj=iFor ũ 0 , u 0 ∈ R, u i : R i → R ∈ L 1 (R i ) for i ∈ {1, . . . , T } endowed with themeasure µ i induced by the outcome probability. Let u = (ũ 0 , u 0 , u 1 , . . . u T ) ∈U = R 2 × L 1 (R) × L 1 (R 2 ) × · · · × L 1 (R T ) we define{f 0 (v 0 , v 1 , . . . , v T ) if g 0 (v 0 , v 1 , . . . , v T ) ≥ ũ 0 , h i (v 0 , v 1 , . . . , v T ) ≥ u i (x 0 , x 1 , . . . , x i−1 )F (v, u) =−∞otherwise.−F (v, u) is closed in u since the sets {u|F (v, u) ≥ α} are closed for all α ∈ Rby continuity of f 0 , g 0 and h i for i ∈ {0, . . . , T }. −F (v, u) is also convex in u. 17Following ?, equation 4.2, the Lagrangian function K is defined asK(v, y) = sup{F (v, u)+ < u, y > |u ∈ U}with y = (ỹ 0 , y 0 , y 1 , . . . , y T ) ∈ Y = R 2 × (L ∞ ) T +1 and< u, y >= ỹ 0 ũ 0 + y 0 u 0 + E µ 1(y 1 u 1 ) + E µ 2(y 2 u 2 ) + . . . + E µ T (y T u T )In our case K is equal to{f 0 + g 0 ỹ 0 + h 0 y 0 + EK(·, y) =µ 1(y 1 h 1 ) + E µ 2(y 2 h 2 ) + . . . + E µ T (y T h T ) if y ≥ 0+∞ if y < 0Where we say y ≥ 0 if all components are positive almost everywhere. Lety = (λ, µ 0 , . . . , µ T ), the Lagrangian becomes.T∑K(v, y) = δ i E (x i − h i (v 0 (x 0 ), v 1 (x 0 , x 1 ), . . . , v i (x 0 , x 1 , . . . , x i ))|a 0 , a 1 , . . . , a i ) +i=0( T)∑λ δ i (E (v i (x 0 , x 1 , . . . , x i )|a 0 , a 1 , . . . , a i ) − ψ i (a i )) − U ∗ +i=0⎛⎞T∑ T∑⎝ δ j (∆ ai E (µ i (x 0 , x 1 , . . . , x i−1 ) · v j (x 0 , x 1 , . . . , x j )|a 0 , a 1 . . . , a j )) ⎠i=0j=i17 We need to check F (v, u) is concave in u. We need to check that for u 1 , u 2 ∈ U, if F (x, u 1 ) ≠ −∞,F (x, u 2 ) ≠ −∞ then F (x, αu 1 + (1 − α)u 2 ) ≠ ∞. This follows since αu 1 j + (1 − α)u2 j ≥ min{u1 j , u2 j }for j ∈ {1, . . . , T + 2}.30

Where we denoteˆ∆ ai E (µ i (x 0 , x 1 , . . . , x i−1 ) · v j (x 0 , x 1 , . . . , x j )|a 0 , a 1 , · · · a i , · · · , a j ) =E (µ i (x 0 , x 1 , . . . , x i−1 ) · v j (x 0 , x 1 , . . . , x j )|a 0 , a 1 , · · · , a i = a H, · · · , a j ) +−E (µ i (x 0 , x 1 , . . . , x i−1 ) · v j (x 0 , x 1 , . . . , x j )|a 0 , a 1 , · · · , a i = a L , · · · , a j ) =µ i (x 0 , x 1 , . . . , x i−1 ) · v j (x 0 , x 1 , . . . , x j )f 0 (x 0 |a 0 ) · · · fa i i(x i |a i ) · · · f j (x j |a j )dx 0 · · · dx i · · · dx jWe say that y belongs to the subgradient set of a function ϕ : V → R at v ∈ V ,which we denote y ∈ ∂ϕ(v) ifϕ(v ′ ) ≥ ϕ(v)+ < v ′ − v, y >∀v ′ ∈ UWe say that (¯v, ȳ) satisfies the Kuhn-Tucker condition if 0 ∈ ∂ v (−K(¯v, ȳ)) and0 ∈ ∂ y (−K(¯v, ȳ)) .Since −F (v, u) is closed and convex in u from Theorem 15 in ? we knowthat if (¯v, ȳ) satisfies the Kuhn-Tucker condition then ¯v solves the principal problemgiven by equations (3) through (4). Note that 0 ∈ ∂ y K(¯v, ȳ) if and only ifK(¯v, y ′ ) ≥ K(¯v, ȳ) ∀y ′ ∈ Y which is equivalent to ask that the constraints (PC’)through (4) be satisfied. If one of the constraints is not satisfied at ¯v then K(¯v, ·)is unbounded. Thus we only need to verify that 0 ∈ ∂ y K(¯v, ȳ).K(v, y), although non-differentiable, is concave in v = (v i (·)) iand the set ofconstraints is convex (Property 1), therefore a necessary and sufficient conditionfor a wage schedule to be optimal is that the subgradient set of −K(v, y) (denoted∂ (−K(v, y))) contains 0. Since from Property 1 we have that E (h i (v 0 (x 0 ), v 1 (x 0 , x 1 ), . . . , v i (x 0 , x 1 , . . . , xare convex in (v i (·)) i, from proposition 4.2.4 in ?. 18∂ (−K(v, y)) =T∑δ i ∂E (h i (v 0 (x 0 ), v 1 (x 0 , x 1 ), . . . , v i (x 0 , x 1 , . . . , x i ))|a 0 , a 1 , . . . , a i ) +i=0( T)∑− λ δ i ∂E (v i (x 0 , x 1 , . . . , x i )|a 0 , a 1 , . . . , a i ) +−i=0⎛⎞T∑ T∑⎝ δ j ∂ (∆ ai E (µ i (x 0 , x 1 , . . . , x i−1 ) · v j (x 0 , x 1 , . . . , x j )|a i , . . . , a j )) ⎠i=0j=iFrom Theorem 22 of ?, we know that a subgradient set of −K is the expectationof the subgradient set of the integrand.18Note that the subgradient set of a constant function is equals to zero31

T∑∂ (−K(v, y)) = δ i E (∂h i (v 0 (x 0 ), v 1 (x 0 , x 1 ), . . . , v i (x 0 , x 1 , . . . , x i ))|a 0 , a 1 , . . . , a i ) +i=0( T)∑− λ δ i E (∂v i (x 0 , x 1 , . . . , x i )|a 0 , a 1 , . . . , a i ) +−i=0⎛⎞T∑ T∑⎝ δ j (∆ ai E (µ i (x 0 , x 1 , . . . , x i−1 ) · ∂v j (x 0 , x 1 , . . . , x j )|a i , . . . , a j )) ⎠i=0j=iTherefore from Property 2 making ∂ (−K(v, y)) equal 0 by components correspondsto,⎛⎛⎞⎞T∑0 = δ i−j E ⎝1i∏⎝k t(x 0 , x 1 , . . . , x t)l t ⎠1∣Ui=j′ · g(x 0 , x 1 , . . . , x j ) ∣a 0 , a 1 , . . . , a i ⎠ +(ω i (x 0 , x 1 , . . . , x i )) 1 + kt=j+1 t(x 0 , x 1 , . . . , x t)l t 1 + k j (x 0 , x 1 , . . . , x j )l j− λE ( g(x 0 , x 1 , . . . , x j ) ∣ ∣ a0 , a 1 , . . . , a j)+−j∑ (∆ai E ( µ i (x 0 , x 1 , . . . , x i−1 )g(x 0 , x 1 , . . . , x j ) ∣ ))∣a 0 , a 1 , . . . , a ji=0for every g ∈ L 1 ( [x 0 , ¯x 0 ] × [x 1 , ¯x 1 ] × · · · × [x j , ¯x j ] ) , which implies⎛ ⎛1U ′ (ω j ) · 1T∑+ δ i−j E ⎝ 1 i∏⎝1 + k j l j Ui=j+1′ (ω i )t=j+1⎞k tl t ⎠1 + k tl t⎞1∣ ∣∣aj+1, . . . , a i ⎠ = λ +1 + k j l jj∑i=0fa i µ i(x i |a i )if i (19)(x i |a i )= λ j + µ jf j a j(x j |a j )f j (x j |a j )For every j ∈ {1, . . . , T }. Multiplying the equation for j + 1 by k j+1 l j+1 takingexpectation with respect to f j+1 (x i+1 |a j+1 ) we obtain(1EU ′ (ω j+1 )k j+1 l∣ )j+1 ∣∣aj+11 + k j+1 l j+1= −ET∑i=j+2((δ i−j−1 E⎛⎝ 1U ′ (ω i )⎛⎝i∏t=j+2fa j+1λ j+1 + µj+1(x j+1 |a j+1 )j+1f j+1 (x j+1 |a j+1 )⎞⎞k t l t⎠ k ∣j+1l j+1 ∣∣aj+1, . . . , a i⎠1 + k t l t 1 + k j+1 l j+1) )∣∣∣aj+1k j+1 l j+1Replacing this last expression in the j + 1 term of the sum in (19), 7 and 8 areobtained.7.5 Proof of property 4The payment scheme must be continuous and non-decreasing. In fact, the multipliersµ i are strictly positive. If not, then the payment scheme would not depend.It can ( be seen that the right ) side of (24) has slope with respect to x i of at leastdfa dx i(1 − δl i+1 )µ i i(x i |a i )i f i (x i |a iand therefore, ω)i must be non-decreasing. A flat segmentwill arise whenever the payment scheme reaches the first period referencelevel. (10) and (8) imply that, as x i increases and reaches the reference if the32

scheme were to continue increasing it would enter the gain area and the rightside of (10) and (8) would jump downwards, therefore contradicting that it increasedafter reaching the reference income. Something analogous happens whenthe reference level is reached from above (as x i decreases), if the optimal schemewere to go below the reference, the optimality characterization would require itto jump upwards. This contradicts that it decreased after reaching the referencefrom above.Now, let’s see whether the reference will be reached. The following equalitymust be fulfilled,(1U ′ (ω i (x 0 , x 1 , . . . , x i )) = (1 + k f i )ai(x 0 , x 1 , . . . , x i )l i ) λ i + µ i(x i |a i )if i +(x i |a i )ˆfa i+1− δl i+1 k i+1 (x 0 , x 1 , . . . , x i+1 )(λ i+1 + µi+1(x i+1 |a i+1 )i+1f i+1 (x i+1 |a i+1 ) )f i (x i+1 |a i+1 )dx i+1 . (20)The i − 1 period’s payments fulfills the following equation,λ i (1 + k i−1 (x 0 , x 1 , . . . , x i−1 )l i−1 ) =1U ′ (ω i−1 (x 0 , x 1 , . . . , x i−1 )) +ˆδl i k i (x 0 , . . . , x i )(λ i + µ if i a i(x i |a i )f i (x i |a i ))f i (x i |a i )dx i .Suppose ω i (x 0 , x 1 , . . . , x i ) < ω i−1 (x 0 , x 1 , . . . , x i−1 ) ∀x i (except in one point).1Then k i (x 0 , x 1 , . . . , x i ) = 1 ∀x i and (??) becomes λ i =U ′ (ω i−1 )(1+k i−1 l i−1 −δl i ) .´ fa iTherefore, since δl i+1 ki+1 (x 0 , x 1 , . . . , x i+1 )(λ i+1 +µ i+1(x i+1 |a i+1 )i+1 f i (x i+1 |a i+1)f i (x) i+1 |a i+1 )dx i+1 ≤δl i+1 λ i+1 we obtain(1U ′ (ω i ) ≥ (1 + l i − δl i+1 )1U ′ (ω i−1 )(1 + k i−1 l i−1 − δl i ) + µ ifa i )i(x i |a i )f i (x i |a i )Therefore, if (l i − l i+1 )δ ≥ k i−1 l i−1 − l i and f a i i(x i |a i )f i (x i |a i> 0 we conclude)1U ′ (ω i ) ≥ 1U ′ (ω i−1 ) which contradicts ω i < ω i−1 . If l i−1 = l i , (l i − l i+1 )δ ≥ l i−1 − l ior k i−1 = 0 then (l i − l i+1 )δ ≥ k i−1 l i−1 − l i will be fulfilled.Suppose ω i (x 0 , x 1 , . . . , x i ) > ω i−1 (x 0 , x 1 , . . . , x i−1 ) ∀x i (except in one point).1Then k i (x 0 , x 1 , . . . , x i ) = 0 ∀x i and (??) becomes λ i =U ′ (ω i−1 )(1+k i−1 l i−1 ) .Therefore, we obtain(1U ′ (ω i ) ≤ 1U ′ (ω i−1 )(1 + k i−1 l i−1 ) + µ f i )a i(x i |a i )if i (x i |a i )Therefore, if f a i i(x i |a i )f i (x i |a i ) < 0 we conclude 1U ′ (ω i ) ≤ 1U ′ (ω i−1 )for which contradictsω i > ω i−1 . We conclude that the reference must be reached on an interval for allthe cases stated above.33

7.6 Proof of property 7The marginal utility of saving in period i at rate r l and consuming the savings inperiod i + 1 is given by,ˆ−(1 + 1 {ωi≤R i}l i )U ′ (ω i ) + δ(1 + r l ) (1 + l i+1 1 {ωi+1

We must have ´ U ′ (ω i+1 (x i+1 ))f i+1 (x i+1 |a i+1 )dx i+1 > U ′ (ω i ), therefore, ifω i > R i or l i is sufficiently small then the marginal utility would be positive andtherefore the agent will have incentives to save. The marginal utility of borrowingis,ˆ(1 + 1 {ωi

1(U ′ (ω 1 (x 0 , x 1 )) (1 + δP(ω 2 = ω 1 (x 0 , x 1 )|a 2 )) = 1+k 1 (x 1 )l 1 +δP(ω 1 (x 0 , x 1 ) = ω 2 |a 2 )+ˆˆ− δl 1 P(ω 1 (x 0 , x 1 ) > ω 2 ))λ 2 − δµ 2 l 2 f a2 (x 2 |a 2 )dx 2 + µ 1 δ f a1 (x 1 |a 1 )dx 1ω 2 ω 1 |a 1 ) − δ 2 fa 0 l 2 P(ω 0 (x 0 ) > ω 1 |a 1 )P(ω 0 (x 0 ) > ω 2 |a 2 )) λ + µ 0(x 0 |a 0 )0f 0 +(x 0 |a 0 )ˆ()+ −δ 2 fa 1 l 2 µ 1(x 1 |a 1 )1ω 1 =ω 0 ,ω 2