MA6000D: Stoch. Opt. Control — Motivating Examples — Amg Cox

0 Outline Content

Motivating examples: simple control problem (Cramér?), quadratic regulator, utility maximisation, option pricing. (1 hour)

Technical Background: Stochastic integral, basic properties, Itô's Formula, martingale representation theorem, SDEs, weak & strong solutions, Diffusions: strong Markov property, Generators, Dynkin's formula, Feynman–Kac formula. Girsanov's theorem. Dirichlet–Poisson problem. (5 hours)

Stochastic Optimal Control 'Theory': Problem statement, Markov controls, value function, dynamic programming principle(?), characterisation of an optimal control — HJB equation, verification theorem, 'typical' approach: use intuition and HJB to guess a solution, then verify (e.g. Øksendal, Ex. 11.2.5). (2 hours)

Applications: Option pricing: complete markets, trading and arbitrage, fundamental theorem, incomplete markets — upper and lower hedging price, dual representation. Utility indifference pricing. (4 hours) American options?

Utility maximisation: investment and consumption, Merton problem, dual representation of solutions, maximisation of growth rate(?), transaction costs (Davis & Norman). (4 hours)

Utility maximisation — inverse problem? (2 hours)

1 Motivating Examples

Example 1.1. Consider a company whose assets are valued according to a Brownian motion with diffusion coefficient σ and unit drift, and who can pay their shareholders dividends, so long as the company is not bankrupt (i.e. debts larger than assets).
If, at time 0, the net value of the company is x, and they pay dividends at rate u_t at time t, the company value at time t is:

\[ X^u_t = x + \sigma B_t - \int_0^t u_s \, ds + t = x + \sigma B_t + \int_0^t (1 - u_s) \, ds. \]

Adjusting for interest, the shareholders wish to maximise their average payout:

\[ \mathbb{E}\left[ \int_0^{\tau^u} e^{-rt} u_t \, dt \right] \]

where r > 0 is the interest rate, and τ^u = inf{t ≥ 0 : X^u_t = 0} is the time of bankruptcy. What should the company do — i.e. how should they choose the dividend payments — given that they are restricted to choosing u_s ∈ [0, 1] := U?

Notes: Please e-mail any comments/corrections/questions to a.m.g.cox@bath.ac.uk or post on the discussion forum.
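Before any theory, the problem can be explored by brute force. The sketch below (a minimal Monte Carlo simulation; the parameter values, the restriction to constant dividend rates, and the helper name `expected_payout` are illustrative assumptions, not from the notes) discretises the dynamics with an Euler scheme and estimates the discounted payout:

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_payout(u, x=1.0, sigma=1.0, r=0.1, dt=0.01, T=50.0, n_paths=2000):
    """Monte Carlo estimate of E[ int_0^tau e^{-rt} u dt ] for a constant
    dividend rate u in [0, 1], using an Euler scheme for
    dX = (1 - u) dt + sigma dB, stopped at bankruptcy (X <= 0).
    T is chosen large enough that the residual discount e^{-rT} is negligible."""
    x_t = np.full(n_paths, x)
    alive = np.ones(n_paths, dtype=bool)
    payout = np.zeros(n_paths)
    for t in np.arange(0.0, T, dt):
        payout[alive] += np.exp(-r * t) * u * dt
        x_t[alive] += (1.0 - u) * dt + sigma * np.sqrt(dt) * rng.standard_normal(alive.sum())
        alive &= x_t > 0.0
    return payout.mean()

# compare "pay nothing" with "pay at the maximal rate"
print(expected_payout(0.0), expected_payout(1.0))
```

Even this crude comparison shows that paying at the maximal rate beats paying nothing (which earns zero by definition); whether a state-dependent rule does better still is exactly the control question.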


Assuming some sort of Markovianity, this suggests that

\[ V^*(X^u_{t\wedge\tau^u},\, t\wedge\tau^u) + \int_0^{t\wedge\tau^u} e^{-rs} u_s \, ds \quad \text{is a supermartingale for all } u \in U. \tag{1} \]

Moreover, we should have equality throughout if we can find a strategy u^* ∈ U which is optimal, which suggests that

\[ V^*(X^{u^*}_{t\wedge\tau^{u^*}},\, t\wedge\tau^{u^*}) + \int_0^{t\wedge\tau^{u^*}} e^{-rs} u^*_s \, ds \quad \text{is a martingale for some } u^* \in U. \tag{2} \]

This principle of representing the value function for the problem in terms of itself is known as the Bellman principle, or the principle of dynamic programming.

Let us now try to solve the problem based on these ideas, and assuming everything is 'well-behaved'.

First, let's suppose that the value function exists, and is differentiable. Then (and this is essentially an argument that we will formalise as Itô's Lemma) by Taylor's Theorem:

\[
\begin{aligned}
V^*(X^u_{t+\delta t},\, t+\delta t) - V^*(X^u_t, t) &\approx \frac{\partial V^*}{\partial x}\,(X^u_{t+\delta t} - X^u_t) + \frac{\partial V^*}{\partial t}\,\delta t + \frac12 \frac{\partial^2 V^*}{\partial x^2}\,(X^u_{t+\delta t} - X^u_t)^2 \\
&\approx \frac{\partial V^*}{\partial x}\big(\sigma(B_{t+\delta t} - B_t) + \delta t\,(1 - u_t)\big) + \frac{\partial V^*}{\partial t}\,\delta t + \frac12 \frac{\partial^2 V^*}{\partial x^2}\big(\sigma(B_{t+\delta t} - B_t) + \delta t\,(1 - u_t)\big)^2 \\
&\approx \sigma \frac{\partial V^*}{\partial x}\,\delta B_t + \left( (1 - u_t)\frac{\partial V^*}{\partial x} + \frac{\partial V^*}{\partial t} \right)\delta t + \frac12 \frac{\partial^2 V^*}{\partial x^2}\,\sigma^2 (\delta B_t)^2 + \text{h.o.t.}
\end{aligned}
\]

But E(δB_t)² = δt, and in fact, for small δt, we will be able to justify the substitution (δB_t)² = δt to get

\[ V^*(X^u_{t+\delta t},\, t+\delta t) - V^*(X^u_t, t) \approx \sigma \frac{\partial V^*}{\partial x}\,\delta B_t + \left( (1 - u_t)\frac{\partial V^*}{\partial x} + \frac{\partial V^*}{\partial t} + \frac12 \sigma^2 \frac{\partial^2 V^*}{\partial x^2} \right)\delta t. \]

Since B_t is zero on average, the (super-)martingale conditions (1) and (2) suggest we need:

\[ (1 - u_t)\frac{\partial V^*}{\partial x} + \frac{\partial V^*}{\partial t} + \frac12 \sigma^2 \frac{\partial^2 V^*}{\partial x^2} \le -e^{-rt} u_t \tag{3} \]

for all u_t ∈ U. Recalling that our behaviour should (intuitively) be independent of time, we conjecture that in fact V^*(x, t) = e^{−rt} V(x), for some function V. Then (3) implies:

\[ (1 - u_t) V_x(x) - rV + \tfrac12 \sigma^2 V_{xx} \le -u_t, \]

for all u_t ∈ U, with equality for some u_t ∈ U, or equivalently:

\[ V_x + \tfrac12 \sigma^2 V_{xx} - rV = \inf_{w \in [0,1]} \big[ w\,(V_x(x) - 1) \big] = -(1 - V_x(x))^+. \tag{4} \]

Note that the fact that we only need consider u = 0 or u = 1 suggests that the optimal behaviour will only be not to pay dividends, or to pay as much as possible. There is no


(substantial) middle ground. We can also try to consider some properties of the solution. Clearly, we expect V(x) ↓ 0 as x ↓ 0, and also V(x) → r^{−1} as x → ∞, since u_t ≤ 1 always, and as x → ∞, we can keep paying out at rate 1 for arbitrarily long on average (this could be proved using properties of Brownian motion). Also, we are better off if we start with bigger x, so V_x(x) ≥ 0.

Suppose that we can ignore the (·)^+, so that we want to solve

\[ V_x + \tfrac12 \sigma^2 V_{xx} - rV = V_x(x) - 1, \]

then this has solutions of the form

\[ V(x) = r^{-1} + \alpha e^{-\sqrt{2r/\sigma^2}\,x} + \beta e^{\sqrt{2r/\sigma^2}\,x}. \]

Assuming that, for large x, we want V(x) → r^{−1}, we must have β = 0. In fact, if we now choose

\[ V(x) = r^{-1}\Big(1 - e^{-\sqrt{2r/\sigma^2}\,x}\Big), \]

we see also that V(0) = 0, and V_x(x) ≤ r^{−1}√(2r/σ²). In particular, if the parameters are such that 2/σ² ≤ r, then V_x(x) ≤ 1 always, and (4) holds.

Now, going back through this argument, with a little bit of extra machinery, we can check that (1) and (2) hold for V^*(x, t) = e^{−rt} V(x), with u^*_t ≡ 1 for all t ≥ 0, and hence that:

\[ V^*(x, 0) = \mathbb{E}^{(x,0)}\left[ V^*(X^{u^*}_{t\wedge\tau^{u^*}},\, t\wedge\tau^{u^*}) + \int_0^{t\wedge\tau^{u^*}} e^{-rs} u^*_s \, ds \right] \to \mathbb{E}^{(x,0)}\left[ \int_0^{\tau^{u^*}} e^{-rs} u^*_s \, ds \right] \]

as t → ∞, by bounded convergence, the fact that τ^{u^*} < ∞ a.s. and monotone convergence. Using the supermartingale property, we can show a similar inequality when the RHS involves an arbitrary u rather than u^*. Hence we have found the optimal u^* and the corresponding value, V^*.

In the case where r < 2/σ², things are a bit different. Essentially, what we want to do is combine solutions to the two equations. We guess that there is some x_0 such that V_x(x) ≥ 1 on [0, x_0] and V_x(x) ≤ 1 on [x_0, ∞).
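Before treating this harder case, the explicit solution for 2/σ² ≤ r can be checked symbolically; a minimal sketch (using sympy purely as a verification aid, not part of the notes):

```python
import sympy as sp

x = sp.symbols('x', nonnegative=True)
r, sigma = sp.symbols('r sigma', positive=True)

# candidate value function for the case 2/sigma^2 <= r
V = (1 - sp.exp(-sp.sqrt(2 * r / sigma**2) * x)) / r

# with u = 1, equation (4) reads  V_x + (1/2) sigma^2 V_xx - r V = V_x - 1,
# i.e.  (1/2) sigma^2 V_xx - r V + 1 = 0
residual = sp.Rational(1, 2) * sigma**2 * sp.diff(V, x, 2) - r * V + 1
print(sp.simplify(residual))   # 0: V solves the equation
print(V.subs(x, 0))            # 0: boundary condition V(0) = 0
print(sp.limit(V, x, sp.oo))   # 1/r: the value of paying out at rate 1 forever
```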
We then solve for V(x) on [0, ∞) with the boundary conditions V(0) = 0, V(∞) = r^{−1}, and (let's say) with the additional conditions that V'(x_0+) = 1 and V(x_0−) = V(x_0+), which is certainly necessary for the martingale/supermartingale conditions to hold. Together, for fixed x_0, these are enough to specify a unique solution to the pair of equations, but in general, we will not have V'(x_0−) = 1, and this will cause difficulty in the martingale/supermartingale conditions. To rectify this, we can try to vary x_0 (the remaining degree of freedom) to get V'(x_0−) = 1, and (hopefully, if this is the right general solution) we will see that the resulting solution is of this form: in particular, the resulting e^{−rt} V(X^u_t) + ∫_0^t e^{−rs} u_s ds is a supermartingale up to τ^u, and a martingale under the optimal control. Once we have a candidate solution, then this is normally not too hard to check.

Notes:

• The procedure here is a sort of informed 'guess and verify.' We have used a mix of martingale-type properties, specific intuition about the problem, and informed


guessing to find a candidate value function. Once we have this, then we are able to show that it is optimal via a martingale argument, but we need to have a specific guess first.

• Note that our final proofs of optimality do not require any Markovianity of a suboptimal strategy; they will hold for essentially any choice of the function u.

• In the more complicated case, even in this relatively easy problem, finding the actual solution is not straightforward, and would in practice probably need a computer or similar. This is a fairly common situation — our answers are very likely to be only partially computable, but we can at least derive some qualitative features of the solution without calculation (namely that u(X_t) = 1_{X_t ≥ α} for some constant α).

• In this example, the final value function is nice, but candidate value functions may not be — in particular, they may not be nice and differentiable, or bounded, etc., and in fact, even for nice problems, the value function is often not suitably differentiable. This can cause substantial technical difficulties.

• In this case, our choice of control only affected the drift of our process. In many circumstances, we may want to affect both the drift and the diffusion coefficient.

Example 1.2 (Stochastic Linear-Quadratic Regulator). Consider a stochastic process whose change over a small time is given by:

\[ dX^u_t = (\alpha X^u_t + u_t)\, dt + \sigma\, dB_t \]

so if α = 0, we just have X^u_t = X_0 + ∫_0^t u_s ds + σB_t, but we also allow a more complicated local change which can depend on the process itself.

Consider the problem of minimising

\[ \mathbb{E}\left[ \int_0^T \beta (X^u_t)^2 + \gamma u_t^2 \, dt + \xi (X^u_T)^2 \right]. \]

Clearly, we should push towards the origin, but the question is how fast? In fact, the optimal control in this problem is independent of σ, so this problem can be used to investigate the impact of noise in controlled systems.

Example 1.3 (Utility Maximisation).
Consider the problem of an investor who wishes to maximise their wealth at retirement (say) but is risk-averse. They might choose to maximise:

\[ \mathbb{E}\, U(X_T) \]

where U(x) is a function known as the utility function, and is generally supposed to be concave and increasing, and (X_t)_{t∈[0,T]} is the investor's wealth process, with X_t the wealth at time t. Note that concavity implies that U(EX) ≥ EU(X) for all random variables X, and this has an economic interpretation of risk-aversion.

The investor can affect X_t by choosing how to invest: suppose that she can invest some proportion of her wealth in the stock market, and some proportion in a risk-free bank account, where she receives a fixed interest rate r. If the stock market follows a Geometric Brownian motion (Black–Scholes model) then S_t = S_0 exp((μ − ½σ²)t + σB_t) and we have the representation:

\[ dS_t = \mu S_t \, dt + \sigma S_t \, dB_t. \]
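One point worth internalising here: the exact solution of dS_t = μS_t dt + σS_t dB_t is S_t = S_0 exp((μ − σ²/2)t + σB_t), with the Itô correction −σ²t/2 in the exponent, and yet E[S_T] = S_0 e^{μT}. A quick simulation (parameter values are illustrative, not from the notes) confirms this:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, S0, T = 0.05, 0.2, 100.0, 1.0

# exact solution of dS = mu S dt + sigma S dB:
#   S_T = S0 * exp((mu - sigma^2/2) T + sigma B_T)
B_T = np.sqrt(T) * rng.standard_normal(200_000)
S_T = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * B_T)

# sanity check: E[S_T] = S0 * e^{mu T}; dropping the -sigma^2/2 term
# would instead give mean S0 * exp((mu + sigma^2/2) T)
print(S_T.mean(), S0 * np.exp(mu * T))
```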


MA6000D: Stoch. Opt. Control — <strong>Motivating</strong> <strong>Examples</strong> Amg Cox 6If we invest an amount u t in the stock market at time t (so we have u t /S t units of thestock, and X t − u t in the bank), over a small time period, our wealth changes by:dX t = u tS tdS t + r(X t − u t ) dt= rX t dt + u t (µ − r) dt + u t σ dB t . (5)A natural problem (the utility maximisation problem) is then:supu tEU(X u T )where the control now affects the reward solely through the final value of wealth.Depending on the situation, we may allow u t < 0 (short selling), or require u t ≥ 0, orinclude other constraints, such as requiring our wealth X t to remain positive, or abovesome fixed level −α. In addition, we could include the option for the investor to consumetheir wealth before time t, and we then need to optimise over both the investment andthe consumption strategy.Example 1.4 (Option Pricing). Finally, consider the problem of a bank who has sold aderivative contract: that is, if S t is the price of some financial asset, they have contractedto pay another party an amount f(S t ) at some future date. To reduce their risk, theywish to super-hedge their exposure: that is, they want to find a trading strategy u t andan initial wealth x 0 such that if X u 0 = x 0 , and (5) holds, then X u T ≥ f(S t) almost surely.Solving this problem does not really need the full stochastic control machinery (as we willsee, it can be handled in a slightly simpler manner), but it has much of the ‘flavour’ ofa stochastic control problem, and can be generalised to some natural stochastic controlproblems: consider the problem{}UIP (f) = inf p ∈ R : sup EU (XT u + (p − f(S T ))) ≥ sup EU (XT u )u tu tthat is, p is the smallest price at which I would prefer (greater utility) to sell the contractf(S T ) for price p, and hedge using the asset, than to simply invest without selling thecontract. 
UIP(f) is the Utility Indifference Price, and clearly the UIP will be below the superhedging price, but taking x_0 = 0 and U(x) = 1_{[0,∞)}(x), we can recover the superhedging price as the UIP.
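In a complete market the superhedging price is simply the cost of an exact replicating strategy; a one-period binomial toy model (the numbers are illustrative, not from the notes) shows the idea behind Example 1.4 in miniature:

```python
# one-period binomial model: S0 moves to Su or Sd, zero interest rate
S0, Su, Sd = 100.0, 120.0, 80.0

def f(s):                       # call-style payoff f(S_T) = (S_T - 100)^+
    return max(s - 100.0, 0.0)

# replicating strategy: delta units of stock plus b in the bank must
# match the payoff in both states
delta = (f(Su) - f(Sd)) / (Su - Sd)
b = f(Sd) - delta * Sd
price = delta * S0 + b          # smallest x0 from which X_T >= f(S_T) surely

print(delta, b, price)   # 0.5 -40.0 10.0
```

Here the terminal wealth δS_T + b matches f(S_T) in both states, so an initial wealth of 10 superhedges (indeed replicates) the payoff, and no smaller initial wealth can.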
