# A cookbook, not a panacea, for claims reserving A cookbook, not a panacea, for claims reserving

A cookbook, nota panacea, for claimsreservingRezervovacia kuchárka (GLM ako aperitív, hlavné menu po bayesovsky, GEE ako digestív)Michal PeštaCharles University in PragueFaculty of Mathematics and PhysicsActuarial Seminar, Prague12 October 2012

Overview- Chain ladder models- GLM in stochastic reserving- Nonparametric smoothing models in reserving- Bayesian reserving models- GEE in stochastic reserving• Support:Czech Science Foundation project “DYME DynamicModels in Economics” No. P402/12/G097

Terminology- X i,j . . . claim amounts in development year j withaccident year i- X i,j stands for the incremental claims in accidentyear i made in accounting year i + j- n . . . current year – corresponds to the mostrecent accident year and development period- Our data history consists ofright-angled isosceles triangles X i,j , wherei = 1, . . . , n and j = 1, . . . , n + 1 − i

Run-off (incremental)triangleAccidentDevelopment year jyear i 1 2 · · · n − 1 n1 X 1,1 X 1,2 · · · X 1,n−1 X 1,n2 X 2,1 X 2,2 · · · X 2,n−1. ..... X i,n+1−in − 1 X n−1,1 X n−1,2n X n,1

Notation- C i,j . . . cumulative payments in origin year i after jdevelopment periodsC i,j =j∑k=1X i,k- C i,j . . . a random variable of which we have anobservation if i + j ≤ n + 1- Aim is to estimate the ultimate claims amount C i,nand the outstanding claims reserveR i = C i,n − C i,n+1−i ,i = 2, . . . , nby completing the triangle into a square

Run-off (cumulative)triangleAccidentDevelopment year jyear i 1 2 · · · n − 1 n1 C 1,1 C 1,2 · · · C 1,n−1 C 1,n2 C 2,1 C 2,2 · · · C 2,n−1. ..... C i,n+1−in − 1 C n−1,1 C n−1,2n C n,1

Chain ladder E[C i,j+1 |C i,1 , . . . , C i,j ] = f j C i,j Var[C i,j+1 |C i,1 , . . . , C i,j ] = σ 2 j Cα i,j , α ∈ R Accident years [C i,1 , . . . , C i,n ] are independentvectors

Development factors(link ratios) f j∑ n−ĵf (n)j=i=1 C1−α i,jC i,j+1∑ n−ji=1 C2−α i,ĵf (n)n ≡ 1 (assuming no tail), 1 ≤ j ≤ n − 1

Mack or linearregression- α = 0 . . . linear regression (no intercept,homoscedastic) for [C •,j , C •,j+1 ] satisfies CL- α = 1 . . . Mack (1993), but also the Aitken (nointercept, heteroscedastic) regression model withweights C −1i,j- smoothing (and extrapolation) of developmentfactors possible

Ultimates and reserves- Ultimate claims amounts C i,n are estimated byĈ i,n = C i,n+1−i ×(n)(n)̂fn+1−i× · · · × ̂f n−1- Reserves R i are, thus, estimated by( )̂R i = Ĉi,n(n)(n)− C i,n+1−i = C i,n+1−i ̂fn+1−i× · · · × ̂f n−1 − 1

Generalized LinearModels- a flexible generalization of ordinarylinear regression- formulated by John Nelder and RobertWedderburn as a way of unifying various otherstatistical models, including linear regression,logistic regression and Poisson regression

GLM: 3 elements1. random component: outcome of thedependent variables Y from the exponential family,i.e.,f Y (y; θ, φ) = exp {[yθ − b(θ)]/a(φ) + c(y, φ)}where θ is canonical parameter, φ isdispersion parameter and EY i = µ i2. systematic component: linear predictor (meanstructure)η = Xβ3. link: function gη i = g(µ i )

Exponential family- include many of the most common distributions,including the normal, exponential, gamma,chi-squared, beta, Dirichlet, Bernoulli, categorical,Poisson, Wishart, Inverse Wishart and manyothers- a number of common distributions are exponentialfamilies only when certain parameters areconsidered fixed and known, e.g., binomial (withfixed number of trials), multinomial (with fixednumber of trials), and negative binomial (with fixednumber of failures)- common distributions that arenot exponential families are Student’s t, mostmixture distributions, and even the family ofuniform distributions with unknown bounds

Canonical link –sufficient statistic- exponential familyEY = µ = b ′ (θ),VarY = b ′′ (θ)a(φ) ≡ V (µ)ã(φ)- distribution ←→ link function (sufficient statistic←→ canonical link)◮ normal . . . identity: µ i = X i,• β◮ gamma (exponential) . . . inverse (reciprocal):µ −1i = X i,• β◮ Poisson . . . logarithm: log(µi ) = X i,• β( )◮ binomial (multinomial) . . . logit: logµi1−µ i= X i,• β◮ inverse Gaussian . . . reciprocal squared: µ−2i = X i,• β

Link functions- logit η = log{µ/(1 − µ)}- probit η = Φ −1 (µ)- complementary log-log η = log{− log(1 − µ)}- power family of linksη ={ (µ λ − 1)/λ, λ ≠ 0,log µ, λ = 0;or η ={ µ λ , λ ≠ 0,log µ, λ = 0.

Estimation- estimation of the parameters viamaximum likelihood, quasi-likelihood orBayesian techniques

N(µ, σ 2 )- support (−∞, +∞)- dispersion parameter φ = σ 2- cumulant function b(θ) = θ 2 /2( )- c(y, φ) = − 1 y 22 φ + log(2πφ)- µ(θ) = E θ Y = θ- canonical link θ(µ): identity- variance function V (µ) = 1

P o(µ)- support {0, 1, 2, . . .}- dispersion parameter φ = 1- cumulant function b(θ) = exp{θ}- c(y, φ) = − log y!- µ(θ) = E θ Y = exp{θ}- canonical link θ(µ): log- variance function V (µ) = µ

Γ(µ, ν)- support (0, +∞)- VarY = µ 2 ν- dispersion parameter φ = ν −1- cumulant function b(θ) = − log{−θ}- c(y, φ) = ν log(νy) − log y − log Γ(ν)- µ(θ) = E θ Y = −1/θ- canonical link θ(µ): reciprocal- variance function V (µ) = µ 2

Mack’s model as GLM- reformulate Mack’s model as a model of ratios[ ][ ∣ ]Ci,j+1Ci,j+1 ∣∣∣E = f j and Var C i,1 , . . . , C i,j = σ2 jC i,j C i,j C i,j- conditional weighted normal GLM( )C i,j+1∼ N f j , σ2 jC i,jC i,j- Mack’s model was not derived/designed as a GLM,but a conditional weighted normal GLM gives thesame estimates

GLM for triangles- independent incremental claims X ij , i + j ≤ n + 1◮ overdispersed Poisson distributed XijE[X ij ] = µ ij and Var[X ij ] = φµ ij◮Gamma distributed X ij- logarithmic link functionE[X ij ] = µ ij and Var[X ij ] = φµ 2 ijlog(µ ij ) = γ + α i + β j , α 1 = β 1 = 0

GLM for triangles II- overdispersed Poisson with log link providesasymptotically same parameter estimates, predictedvalues and prediction errors- possible extensions:◮ Hoerl curvelog(µ ij ) = γ + α i + β j log(j) + δ j j◮smoother (semiparametric)log(µ ij ) = γ + α i + s 1 (log(j)) + s 2 (j)

Estimation in triangles- ML (maximum likelihood) . . . likelihoodL(θ, φ; X) =n∏i=1n+1−i∏j=1f(X ij ; θ ij , φ)- maximize log-likelihood w.r.t. parameters of µ,which is an argument of θ, i.e., θ(µ(α, β))! there is no overdispersed Poisson distribution(only if thinking of negative binomial)- QML (quasi-maximum likelihood). . . quasi-likelihoodfor ODPlog Q(µ; X) =n∑i=1n+1−i∑j=1φ(X ij log µ ij − µ ij ) + const

Generalized additivemodels- GAM . . . extension of GLM, with the linearpredictor being replaced by a non-parametricsmootherp∑η ij = s k (X ij )k=1- s(x) represents a non-parametric smoother on x,which may be chosen from several different typesof smoother, such as locally weighted regressionsmoothers (loess), cubic smoothing splines andkernel smoothers

GAM- Ex: smoothing (trade-off between smoothnessand fit) for univariate (p = 1) cubic spline withnormal distribution⎧⎫⎨∑∫ ⎬min [X ij − s(X ij )] 2 + λ [s ′′ (t)] 2 dt⎩⎭i,j

Bayesian approach- problem of instability in the proportion ofultimate claims paid in the early development years,causing a method such as the CL to produceunsatisfactory results when applied mechanically- to stabilize the results using anexternal initial estimate of ultimate claims

Bornhuetter-Ferguson- reminder from CL: outstanding claimsR i = C i,n+1−i (f n+1−i × . . . × f n−1 − 1)Assumptions(1) E[C i,j+k |C i,1 , . . . , C i,j ] = C i,j + (β j+k − β j )µ i andE[C i,1 ] = β 1 µ iβ j > 0, µ i > 0, β n = 11 ≤ i ≤ n, 1 ≤ j ≤ n, 1 ≤ k ≤ n − j(2) accident years [C i,1 , . . . , C i,n ], 1 ≤ i ≤ n areindependent- estimate ĈBF i,n for ultimates- Bayesian approach to CL

Bornhuetter-FergusonMethod- a very robust method since it does not consideroutliers in the observationsImplied Assumptions(1) E[C i,j ] = β j µ iβ j > 0, µ i > 0, β n = 11 ≤ i ≤ n, 1 ≤ j ≤ n(2) accident years [C i,1 , . . . , C i,n ], 1 ≤ i ≤ n areindependent- "implied"assumptions are weaker than the originalBF assumptions (and, hence, not equivalent)

BF Estimator- BF estimator of ultimate from the latestĈ BFi,n= C i,n−i+1 + (1 − ̂β n−i+1 )̂µ i- comparing CL and BF model ∏ n−1role of β j and, therefore,k=j f −1kplays thêβ j =n−1∏k=j1̂f k- need a prior estimate for µ i- ̂µ i is often a plan value from a strategic businessplan or the value used for premium calculations- ̂µ i should be estimated before one has anyobservations (i.e., should be a pure prior estimatebased on expert opinion) !

Comparison of BF andCL Estimators- if the prior estimate of ultimates µ i is equal tothe CL estimate of ultimates, then the BF and CLestimators underline- BF:- CL:Ĉ BFi,nĈ CLi,n= C i,n−i+1 + (1 − ̂β n−i+1 )̂µ i= C i,n−i+1 + (1 − ̂β n−i+1 )ĈCL i,n- BF differs from the CL in that the CL estimateof ultimate claims is replaced by an alternativeestimate based on external information andexpert judgement

Bayesian Models- a prior distribution for row (ultimate) parameter- Ex: ODP model, where an obvious candidate isµ i ∼ independent Γ(γ i , δ i )such thatEµ i = γ iδ i

Predictive distribution- posterior predictive distribution of incrementalclaims X i,j is an over-dispersednegative binomial distribution with mean[Z i,j C i,j−1 + (1 − Z i,j ) γ ]i 1(f j−1 − 1)δ i f j−1 × . . . × f n−1whereZ i,j =1f j−1 ×...×f n−1δ i φ +1f j−1 ×...×f n−1

Credibility formula- a natural trade-off between two competingestimates for X i,jC i,j−1andγ i 11= E[µ i ]δ i f j−1 × . . . × f n−1 f j−1 × . . . × f n−1- Bayesian model has the CL as one extreme(no prior information about the row parameters)and the BF as the other (perfect priorinformation about the row parameters)

Bayesian trade-off- BF assumes that there is perfect priorinformation about the row parameters (does notuse the data at all for one part of estimation). . . heroic assumption- prefer to use something between BF and CL- credibility factor Z i,j governs the trade-offbetween the prior mean and the data- the further through the development we are, the1largerf j−1 ×...×f n−1becomes, and the more weight isgiven to the CL ladder estimate- choice of δ i is governed by the prior precision ofthe initial estimate for ultimates . . . with regardgiven to the over-dispersion parameter (e.g.,an initial estimate from the ODP)

Cape Cod- another Bayesian-like approachAssumptions(1) E[C i,j ] = κπ i β jκ > 0, π i > 0, β j > 0, β n = 11 ≤ i ≤ n, 1 ≤ j ≤ n, 1 ≤ k ≤ n − j(2) accident years [C i,1 , . . . , C i,n ], 1 ≤ i ≤ n areindependent- estimate ĈCC i,nfor ultimates- equivalent to implied assumptions of BF withµ i = κπ i

Cape Cod Estimator- main deficiency of DFM (CL) . . . ultimate claimscompletely depend on the latest diagonal claims not robust (sensitive to outliers)- moreover, in long-tailed LoBs (e.g., liability) thefirst observation is not representative- one possibility is to smoothen outliers from thelatest (diagonal) combine BF and CL intoBenktander-Hovinen method- another way is to make the diagonal claims morerobust Cape Cod method- CC estimator of ultimate from the latestĈ CCi,n= C i,n−i+1 − ĈCC i,n−i+1 +n−1∏j=n−i+1f j Ĉ CCi,n−i+1

Generalised Cape Cod- π i can be interpreted as the premium received foraccident year i- κ reflects the average loss ratio- loss ratio for an accident year using the CLestimate for the ultimate claim̂κ i =Ĉ CLi,nπ i=C i,n−i+1∏ n−1j=n−i+1 f jπ i= C i,n−i+1β n−i+1 π i- initial expected ratio may be set to the same valuederived from an overall weighted ("robusted")average ratio (simple CC method)̂κ CC =n∑i=1β n−i+1 π i∑ nk=1 β n−k+1π k̂κ i =∑ ni=1 C i,n−i+1∑ ni=1 β n−i+1π i

Generalised Cape Cod II- robusted value for latest (diagonal)Ĉ i,n−i+1 = ̂κ CC π i β n−i+1- in the CC method, the CL iteration is applied tothe robusted diagonalĈ CCi,n= C i,n−i+1 + (1 − β n−i+1 )̂κ CC π i- modification of a BF type with modified a prior̂κ CC π i- generalised CC with decay factor 0 ≤ d ≤ 1 constant ̂κ CC is replaced with [̂κ CC1 , . . . , ̂κ CCn ] fordifferent accident yearŝκ CCi= d̂κ CC + (1 − d)̂κ i- factor of 0 . . . no smoothing (CL); factor of 1. . . constant initial ratio (Simple CC)

Generalized estimatingequations- classical approaches to claims reserving problem arebased on the limiting assumption that the claims indifferent years are independent variables- dependencies in the development years classicaltechniques provide incorrect prediction- no distributional assumptions (avoiding distributionmisspecification)

GEE- run-off triangles as one of the most typical typeof actuarial data comprisecorrelated longitudinal data (or generallyclustered data), where an accident yearcorresponds to a subject- claims within “subject” should be considered ascorrelated by nature- incremental claims for accident year i ∈ {1, . . . , n}create a (n − i + 1) × 1 vector X i = [X i,1 , . . . , X i,n−i+1 ] ⊤and define their expectationsEX i = µ i = [µ i,1 , . . . , µ i,n+1−i ] ⊤

Link function and linearpredictor- accident year i and development year j influencethe expectation of claim amount via so-calledlink function g in the following manner:µ i,j = g −1 (z ⊤ i,jθ),where g −1 is referred to as the inverse of scalarlink function g and z i,j is a p × 1 vector of(fictional) covariates that arranges theimpact of accident and development year on theclaim amount through model parameters θ ∈ R p×1

Example of link- e.g., the Hoerl curve with the logarithmic linkfunction can be “coded” by design matrixz i,j = [1, δ 1,i , . . . , δ n,i , 1 × δ 1,j , . . . , n × δ n,j ,and parameters of interestδ 1,j × log 1, . . . , δ n,j × log n] ⊤θ = [γ, α 1 , . . . , α n , β 1 , . . . , β n , λ 1 , . . . , λ n ] ⊤ ,where δ i,j corresponds to the Kronecker’s delta- afterwardslog(µ i,j ) = γ + α i + jβ j + λ j log j

Variance- besides link function g and linear predictor z ⊤ i,j θ (i.e.,mean structure µ i ), one needs to specify thevariance of claim amounts- suppose that the variance of incremental claimscan be expressed asa known function of of their expectationsVarX i,j = φh(µ i,j ),where φ > 0 is a scale (dispersion) parameter

Working correlationmatrix- in the GEE framework, it is not necessary toknow the whole distribution of the response (e.g.,a distribution of the incremental claims) like in theGLM setup- sufficient to specify the variance of X i,j and theworking correlation matrixR i (ϑ) ∈ R (n−i+1)×(n−i+1)for incremental claims in each accident year

Working correlationmatrix II- correlation matrix differs from accident year toaccident year- however, each correlation matrix depends only onthe s × 1 vector of unknown parameters ϑ, whichis the same for all the accident years- consequently, the working covariance matrix ofthe incremental claims isCovX i = φA 1/2iR i (ϑ)A 1/2i,where A i is an (n − i + 1) × (n − i + 1) diagonal matrixwith h(µ i,j ) as the jth diagonal element- the name “working” comes from the fact that it isnot expected to be correctly specified

Choice of workingcorrelation matrix- the simplest case is to assume uncorrelatedincremental claims, i.e.,R i (ϑ) = I n−i+1 = {δ j,k } n−i+1,n−i+1j,k=1- opposite extreme case is an unstructuredcorrelation matrixR i (ϑ) = {ϑ j,k } n−i+1,n−i+1j,k=1such that ϑ j,j = 1 for j = 1, . . . , n − 1 + 1 and R i (ϑ) ispositive definite

Choice of workingcorrelation matrix II- somewhere in between, there lies an exchangeablecorrelation structureR i (ϑ) = {δ j,k + (1 − δ j,k )ϑ} n−i+1,n−i+1j,k=1, ϑ = [ϑ, . . . , ϑ] ⊤- an m-dependentR i (ϑ) = {r j,k } n−i+1,n−i+1j,k=1, r j,k =⎧⎨⎩1, j = k,ϑ |j−k| , 0 < |j − k| ≤ m, ϑ0, |j − k| > m- an autoregresive AR(1) correlation structureR i (ϑ) = {ϑ |j−k| } n−i+1,n−i+1j,k=1, ϑ = [ϑ, . . . , ϑ] ⊤ .

Quasi-likelihood- parameter estimation in the GEE framework isperformed in a way that the theoreticalquasi-likelihood∫ x − µQ(x; µ) =h(µ) dµis used instead of the true log-likelihood function- quasi-likelihood estimate in GEE setup is thesolution of the score-like equation systemn∑i=1[ ∂µi∂θ] ⊤φ −1 A −1/2iR −1i(ϑ)A −1/2i(X i − µ i ) = 0 ∈ R p ,where [∂µ i /∂θ] is a (n − i + 1) × p matrix of partialderivatives of µ i with respect to the unknownparameters θ

Further work- claims generating process: incremental paid claimsX i,j to be the sum of N i,j (independent) claims ofamount Y ki,j , k = 1, . . . , N i,j- Wright’s model- Tweedie compound distribution

Conclusions- CL- GLM- GAM- BF- Bayesian framework- CC- GEE

ReferencesEngland, P. D. and Verrall, R. J. (2002)Stochastic claims reserving in general insurance(with discussion).British Actuarial Journal, 8 (3).Hudecova, S. and Pesta, M. (2012)Generalized estimating equations in claimsreserving.In Komarek, A. and Nagy, S., editors, Proceedings ofthe 27th International Workshop on StatisticalModelling, Vol. 2, p. 555–560, Prague.Wüthrich, M., Merz, M. (2008)Stochastic claims reserving methods in insurance.Wiley finance series. John Wiley & Sons.

Thank you !pesta@karlin.mff.cuni.cz

More magazines by this user
Similar magazines