4.1 Chainladder Underlying Assumptions

Definition 4.1 Notation used in Mack

  • \(c_{i,k} =\) cumulative losses for AY \(i\) at age \(k\)

  • \(f_k =\) LDF from age \(k\) to \(k + 1\), for \(k \in \{1, \dots, I-1\}\)

  • \(I =\) size of the triangle

Proposition 4.1 (Chain Ladder Assumption 1) \[\mathrm{E}\left [c_{i,k+1} \mid c_{i,1} \cdots c_{i,k}\right ] = c_{i,k} \times f_k\]

Remark.

  • Expected incremental losses are \(\propto\) losses to date

  • The proportion depends on the age \(k\) of the AY

  • Our best estimate of ultimate depends only on the losses to date

    It ignores losses from all earlier evaluation points; only \(c_{i,k}\) matters
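A minimal numeric illustration with hypothetical values: if an AY has \(c_{i,k} = 1{,}000\) reported at age \(k\) and \(f_k = 1.25\), then

\[\mathrm{E}\left[c_{i,k+1} \mid c_{i,1} \cdots c_{i,k}\right] = 1{,}000 \times 1.25 = 1{,}250\]

i.e. expected incremental emergence of \(1{,}000 \times (f_k - 1) = 250\), proportional to the losses to date.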

Corollary 4.1 Restating Chain Ladder Assumption 1:

\[\mathrm{E}\left[ \dfrac{c_{i,k+1}}{c_{i,k}} \mid c_{i,1},...,c_{i,k} \right] = f_k\]

Remark.

  • The expected LDF equals \(f_k\), i.e. each observed LDF is an unbiased estimate of \(f_k\)

  • Implies that the development from \(c_{i,k}\) to \(c_{i,k+1}\) does not depend on the size of the losses \(c_{i,k}\)

  • Implies that successive LDFs are uncorrelated

Proposition 4.2 (Chain Ladder Assumption 2) \[\left \{c_{i,1} \cdots c_{i,I} \right \} \: {\perp\!\!\!\!\perp} \: \left \{c_{j,1} \cdots c_{j,I} \right \} \:\: : \:\: i \neq j\]

Remark.

  • Losses in each AY are \({\perp\!\!\!\!\perp}\) of the losses in the other AYs

  • This assumption makes our estimates unbiased

Proposition 4.3 (Chain Ladder Assumption 3) \[\mathrm{Var}\left (c_{i,k+1} \mid c_{i,1} \cdots c_{i,k}\right ) = \alpha_k^2 \: c_{i,k}\]

Remark.

  • Variance of the incremental losses is \(\propto\) losses reported to date

  • The proportion depends on the age \(k\) of the AY (i.e. the same for all AYs within a column, but varies from column to column)

  • \(\hat{f}_k\) is selected to minimize the variance

4.1.1 Implicit Assumptions

Taking a step back, let's look at the implicit assumptions we make when using CL

We are making assumptions about how we select the factor \(\hat{f}_k\) and how we apply it:

\[\hat{c}_{i,k+1} = c_{i,k} \times \hat{f}_k\]

Which requires the following assumptions:

  1. Unbiased estimate of each \(f_k\)

    • \(\mathrm{E}\left[ \hat{f}_k \right] = f_k\)

      \(\hat{f}_k\) are representative of the true \(f_k\)

    • Based on assumption 1 & 2 (prop. 4.1 & prop. 4.2)

    • See proof in Mack appendix A

  2. Unbiased estimate of Ultimate

    • \(\mathrm{E}\left[ \hat{c}_{i,I} \right] = \mathrm{E}\left[ c_{i,k} \times \hat{f}_k \times \cdots \times \hat{f}_{I-1} \right] = \mathrm{E}\left[ c_{i,I} \right]\)

      Multiplying the paid to date by the \(\hat{f}_k\)’s gives an unbiased estimate of the future losses

    • Based on assumption 1 & 2 (prop. 4.1 & prop. 4.2)

    • See proof in Mack appendix C

  3. To use the volume-weighted average LDF

    • Based on assumption 3 (prop. 4.3)

  4. To calculate the confidence interval

    • Based on assumption 3 (prop. 4.3)
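To make the mechanics concrete, here is a minimal Python sketch of the chain ladder on a small triangle: volume-weighted \(\hat{f}_k\)'s are computed from the observed columns and the latest diagonal is projected to ultimate. All figures are hypothetical.

```python
import numpy as np

# Hypothetical 4x4 cumulative loss triangle (rows = AY i, columns = age k);
# np.nan marks cells below the latest diagonal (not yet observed)
tri = np.array([
    [1000., 1800., 2100., 2200.],
    [1100., 1900., 2300., np.nan],
    [1200., 2100., np.nan, np.nan],
    [1300., np.nan, np.nan, np.nan],
])
I = tri.shape[0]

# Volume-weighted LDFs: f_hat[k] = sum_i c[i,k+1] / sum_i c[i,k],
# using only the AYs observed at both ages
f_hat = []
for k in range(I - 1):
    obs = ~np.isnan(tri[:, k + 1])
    f_hat.append(tri[obs, k + 1].sum() / tri[obs, k].sum())

# Project each AY from its latest observed value to ultimate:
# c_hat[i, k+1] = c[i, k] * f_hat[k]
proj = tri.copy()
for i in range(I):
    for k in range(I - 1):
        if np.isnan(proj[i, k + 1]):
            proj[i, k + 1] = proj[i, k] * f_hat[k]

print("LDFs:     ", np.round(f_hat, 4))
print("Ultimates:", np.round(proj[:, -1]))
```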

4.1.2 Proof for Assumption 3

To estimate \(f_k\) we can weight the historical LDFs in many different ways; putting it in general terms:

\[\begin{equation} \hat{f}_k = \sum_i \left( \dfrac{c_{i,k+1}}{c_{i,k}} \right) \times w_{i,k} \:\:\:\: : \:\:\:\: \sum_i w_{i,k} = 1 \tag{4.1} \end{equation}\]

Remark. We assume each of the \(\left( \dfrac{c_{i,k+1}}{c_{i,k}} \right)\) is an unbiased estimate of \(f_k\) (see proof below)

\(\therefore\) Any weighting of them gives an unbiased estimate \(\hat{f}_k\) of \(f_k\)

Proof. \[\begin{align} \mathrm{E}\left[ \dfrac{c_{i, k+1}}{c_{i,k}} \right] &= \mathrm{E} \left[ \mathrm{E} \left[ \dfrac{c_{i, k+1}}{c_{i,k}} \mid c_{i,1}, ...,c_{i,k}\right] \right] &\cdots \:\:\: (1)\\ &= \mathrm{E} \left[ \mathrm{E} \left[ c_{i, k+1} \mid c_{i,1}, ...,c_{i,k}\right] / c_{i,k} \right] &\cdots \:\:\: (2)\\ &= \mathrm{E} \left[ c_{i,k} \: f_k \:/ c_{i,k} \right] &\cdots \:\:\: (3)\\ &= \mathrm{E} \left[f_k\right] \\ &= f_k &\cdots \:\:\: (4) \\ \end{align}\]

Remark.

  1. Because of the iterative rule (tower property): \(\mathrm{E}[X] = \mathrm{E}[\mathrm{E}(X \mid Y)]\)

  2. Because \(c_{i,k}\) is scalar since it’s given

  3. From assumption 1 (prop. 4.1)

  4. Because \(f_k\) is a constant
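A quick Monte Carlo check of this unbiasedness result. The normal distribution and all parameter values below are hypothetical choices that merely match the first two moments assumed by Mack:

```python
import numpy as np

rng = np.random.default_rng(0)
f_k, alpha_sq, c_ik = 1.25, 50.0, 1000.0  # hypothetical parameters

# Simulate c_{i,k+1} | c_{i,k} with mean c_{i,k} * f_k and variance
# alpha_k^2 * c_{i,k}; a normal is one convenient distribution with
# these two moments
c_next = rng.normal(c_ik * f_k, np.sqrt(alpha_sq * c_ik), size=1_000_000)

print(np.mean(c_next / c_ik))  # ~1.25: the observed LDF is unbiased for f_k
```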

Remark. When we use the volume-weighted average, \(w_{i,k} = \dfrac{c_{i,k}}{\sum \limits_{j} c_{j,k}}\)
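Substituting these weights into equation (4.1) shows that the volume-weighted average is just the ratio of column sums:

\[\hat{f}_k = \sum_i \left( \dfrac{c_{i,k+1}}{c_{i,k}} \right) \times \dfrac{c_{i,k}}{\sum \limits_{j} c_{j,k}} = \dfrac{\sum \limits_{i} c_{i,k+1}}{\sum \limits_{j} c_{j,k}}\]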


Based on the theory of point estimation, among several unbiased estimators, preference should be given to the one with the smallest variance

The weights that minimize the variance are inversely proportional to the variance of the items we are weighting, i.e.:

  • We want weights \(w_{i,k}\) for each \(i\) on \(\left( \dfrac{c_{i,k+1}}{c_{i,k}} \right)\)

  • The weight we apply to each \(\left( \dfrac{c_{i,k+1}}{c_{i,k}} \right)\) varies based on the variance of \(\left( \dfrac{c_{i,k+1}}{c_{i,k}} \right)\)

  • And if \(\left( \dfrac{c_{i,k+1}}{c_{i,k}} \right)\) has high variance its weight will be lower to minimize the total variance of our estimate of \(\hat{f}_k = \left( \dfrac{c_{1,k+1}}{c_{1,k}} \right) \times w_{1,k} + \cdots + \left( \dfrac{c_{I,k+1}}{c_{I,k}} \right) \times w_{I,k}\)

  • Mack appendix B has a proof of this
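This is the standard inverse-variance weighting result from point estimation: writing \(v_{i,k} = \mathrm{Var}\left( \dfrac{c_{i,k+1}}{c_{i,k}} \mid c_{i,1}, \dots, c_{i,k} \right)\), the variance of the weighted estimate is minimized by

\[w_{i,k} = \dfrac{1 / v_{i,k}}{\sum \limits_{j} 1 / v_{j,k}}\]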

High-variance estimates get a lower weight:

\[\dfrac{1}{w_{i,k}} \propto \mathrm{Var}\left( \dfrac{c_{i,k+1}}{c_{i,k}} \right)\]

Since \(c_{i,k}\) is known, we can pull it out of the variance term

\[\dfrac{1}{w_{i,k}} \propto \dfrac{\mathrm{Var}(c_{i,k+1})}{c_{i,k}^2}\]

and we get

\[\begin{equation} w_{i,k} \times \mathrm{Var}(c_{i,k+1}) \propto c_{i,k}^2 \tag{4.2} \end{equation}\]

Recall the weight for the volume weighted average is:

\[\begin{align} w_{i,k} &= \dfrac{c_{i,k}}{\sum \limits_{j} c_{j,k}}\\ w_{i,k} &\propto c_{i,k}\\ \end{align}\]

Applying the above to equation (4.2) we get:

\[\begin{align} c_{i,k} \times \mathrm{Var}(c_{i,k+1}) &\propto c_{i,k}^2 \\ \mathrm{Var}(c_{i,k+1}) &\propto c_{i,k} \\ \mathrm{Var}(c_{i,k+1}) &= \alpha^2_k \times c_{i,k} \\ \end{align}\]

And we have chainladder assumption 3 (Prop. 4.3)

4.1.3 LDF Selection Assumptions

Recall equations (4.1) and (4.2)

Table 4.1: Relationships between weight, variance and residual (Mack)

| Weight \(w_{i,k}\) | Description | Variance | Residual (4.10) |
|---|---|---|---|
| \(1\) | Simple Average | \(\alpha_k^2 \times \mathbf{c_{i,k}^2}\) | \(\varepsilon_{i,k} = \dfrac{c_{i,k+1} - c_{i,k} \: \hat{f}_k}{\sqrt{\mathbf{c_{i,k}^2}}}\) |
| \(c_{i,k}\) | Weighted Average | \(\alpha_k^2 \times \mathbf{c_{i,k}}\) | \(\varepsilon_{i,k} = \dfrac{c_{i,k+1} - c_{i,k} \: \hat{f}_k}{\sqrt{\mathbf{c_{i,k}}}}\) |
| \(c_{i,k}^2\) | Least Squares\(^1\) | \(\alpha_k^2 \times \mathbf{1}\) | \(\varepsilon_{i,k} = \dfrac{c_{i,k+1} - c_{i,k} \: \hat{f}_k}{\sqrt{\mathbf{1}}}\) |

  1. Note that for the least squares here we are forcing the intercept through the origin

    The assumption for LS is that the variance is the same for each exposure year (See Brosius)

  • Use a different method for LDF selection based on the variance assumption

  • The weight and the variance always multiply to \(\alpha_k^2 \: c_{i,k}^2\)
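A short sketch verifying that each choice of weight in Table 4.1 reproduces the corresponding estimator. The column data are hypothetical:

```python
import numpy as np

# Hypothetical losses at ages k and k+1 for the AYs observed in one column
c_k  = np.array([1000., 1100., 1200.])
c_k1 = np.array([1800., 1900., 2100.])
F = c_k1 / c_k  # observed LDFs

def weighted_ldf(w):
    """General weighted estimator from equation (4.1), normalizing w."""
    return np.sum(F * w) / np.sum(w)

simple = weighted_ldf(np.ones_like(c_k))  # w = 1   -> simple average of LDFs
volume = weighted_ldf(c_k)                # w = c   -> volume-weighted average
ls     = weighted_ldf(c_k ** 2)           # w = c^2 -> least squares through origin

# Least squares through the origin: minimizing sum_i (c_{i,k+1} - f c_{i,k})^2
# gives f = sum(c_k * c_k1) / sum(c_k^2), matching the w = c^2 weighting
assert np.isclose(ls, np.sum(c_k * c_k1) / np.sum(c_k ** 2))
print(simple, volume, ls)
```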

4.1.4 Violation of Assumptions

Correlation between one AY and another

  • Assumption 2 (Prop. 4.2) is violated

  • Not necessarily assumption 1 (Prop. 4.1)

  • e.g. strong calendar year effects (e.g. a speed-up in payments, changing inflation) will lead to correlation along a diagonal

  • Check using calendar year test
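Mack's calendar year test formalizes this by counting large vs. small factors on each diagonal and comparing against their expected distribution. The sketch below shows only the flagging-by-diagonal idea, on a hypothetical set of observed LDFs:

```python
import numpy as np

# Observed LDFs F[i, k] from a hypothetical triangle (np.nan = future cell)
F = np.array([
    [1.80, 1.17, 1.05],
    [1.73, 1.21, np.nan],
    [1.75, np.nan, np.nan],
])

# Flag each factor as large (+1) / small (-1) relative to its column
# median (ties flagged 0)
flags = np.full(F.shape, np.nan)
for k in range(F.shape[1]):
    med = np.nanmedian(F[:, k])
    obs = ~np.isnan(F[:, k])
    flags[obs, k] = np.sign(F[obs, k] - med)

# Group flags by calendar-year diagonal (i + k = const); under AY
# independence, large and small factors should mix randomly across diagonals
for d in range(F.shape[0] + F.shape[1] - 1):
    diag = [flags[i, d - i] for i in range(F.shape[0])
            if 0 <= d - i < F.shape[1] and not np.isnan(flags[i, d - i])]
    print(f"diagonal {d}: {diag}")
```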

A single \(\hat{f}_k\) is not appropriate for all years \(i\)

  • Assumption 1 (Prop. 4.1) is violated

Dependence among columns

  • Assumption 1 (Prop. 4.1) is violated

  • Not necessarily assumption 2 (Prop. 4.2)

    If losses in the following period are negatively correlated with the losses in the current period, then we’ll have correlation between adjacent LDFs but still maintain independence of AYs

If residuals are not random around zero

  • Assumption 3 (Prop. 4.3) is violated

  • If we see any trends or changes in magnitude

  • Check with residual test
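A minimal sketch of this residual check for one development column, using the weighted-average residuals from Table 4.1. The column data are hypothetical:

```python
import numpy as np

# Hypothetical column data: losses at ages k and k+1 for the observed AYs
c_k  = np.array([1000., 1100., 1200., 900., 1500.])
c_k1 = np.array([1790., 1950., 2120., 1610., 2630.])

f_hat = c_k1.sum() / c_k.sum()  # volume-weighted LDF (w = c_{i,k})

# Weighted-average residuals from Table 4.1: if Assumption 3 holds they
# should scatter randomly around zero, with no trend against c_{i,k}
resid = (c_k1 - c_k * f_hat) / np.sqrt(c_k)
for c, r in zip(c_k, resid):
    print(f"c_ik = {c:7.0f}   residual = {r:+.3f}")
```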

4.1.5 Strengths/Weaknesses of Chainladder

Here we are talking about the weaknesses, and not the limitations due to its implicit assumptions

Weaknesses

  • Tail LDFs depend on very few observations

  • High variability in the reported claims in the most recent years leads to uncertainty

    • \(c_{I,1} = 0\) \(\Rightarrow\) \(\hat{R}_{I} = 0\), which is not reasonable

  • Results need to be judged by someone who knows the business under consideration

  • Unexpected future changes can make the observations obsolete

Strengths

  • The user knows exactly how the method works and its weaknesses

  • Can be easily explained to non-actuaries