4.1 Chainladder Underlying Assumptions
Definition 4.1 Notation used for Mack
\(c_{i,k} =\) cumulative losses for AY \(i\) @ age \(k\)
\(f_k =\) LDF from \(k\) to \(k + 1\), \(k \in [1:I-1]\)
- \(I =\) size of the triangle
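As a concrete anchor for the notation, here is a minimal numpy sketch with a hypothetical \(4 \times 4\) cumulative triangle (all figures invented for illustration; the later sketches in this section reuse `c` and `I`):

```python
import numpy as np

# Hypothetical cumulative losses c[i, k]: rows are AYs i = 1..I,
# columns are ages k = 1..I; NaN marks the unobserved lower-right.
c = np.array([
    [1000., 1800., 2100., 2200.],    # AY 1: fully developed
    [1100., 1950., 2300., np.nan],   # AY 2: observed through age 3
    [1200., 2200., np.nan, np.nan],  # AY 3: observed through age 2
    [1300., np.nan, np.nan, np.nan], # AY 4: only age 1 observed
])
I = c.shape[0]  # size of the triangle
```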
Remark.
Expected incremental losses are \(\propto\) losses to date
Proportion depends on the age \(k\) of AY
Our best estimate of ultimate depends only on the losses to date
Ignores losses reported in prior development periods (only the current cumulative amount matters)
Corollary 4.1 Restating Chain Ladder Assumption 1:
\[\mathrm{E}\left[ \dfrac{c_{i,k+1}}{c_{i,k}} \mid c_{i,1},...,c_{i,k} \right] = f_k\]
Remark.
Expected LDF is unbiased
Implies that development to \(c_{i,k+1}\) is independent of the size of losses at \(c_{i,k}\)
- Implies that adjacent LDFs are uncorrelated
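Continuing the hypothetical triangle above, the individual LDFs that assumption 1 treats as unbiased estimates of \(f_k\) can be listed directly:

```python
# Individual age-to-age factors c[i, k+1] / c[i, k]; under assumption 1
# each observed ratio is an unbiased estimate of f_k.
for k in range(I - 1):
    obs = ~np.isnan(c[:, k + 1])  # AYs observed at both ages
    print(f"k={k + 1}: {np.round(c[obs, k + 1] / c[obs, k], 3)}")
```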
Remark.
Losses in each AY are \({\perp\!\!\!\!\perp}\) of the losses in other AYs
- This assumption makes our estimate unbiased
Remark.
Variance of the incremental losses is \(\propto\) losses reported to date
Proportion depends on the age (\(k\)) of the AY (i.e. the same for every AY within a column, but varies from column to column)
- \(\hat{f}_k\) is selected to minimize the variance
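Written out formally, the variance assumption (derived in section 4.1.2 below) is:
\[\mathrm{Var}\left( c_{i,k+1} \mid c_{i,1},...,c_{i,k} \right) = \alpha^2_k \times c_{i,k}\]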
4.1.1 Implicit Assumptions
Take a step back and look at the implicit assumptions we make when using CL
- For these implicit assumptions to hold, the 3 underlying assumptions have to be true
We are making assumptions about how we select the factor \(\hat{f}_k\) and how we apply it:
\[\hat{c}_{i,k+1} = c_{i,k} \times \hat{f}_k\]
Which requires the following assumptions:
Unbiased estimate of each \(f_k\)
Unbiased estimate of Ultimate
\(\mathrm{E}\left[ \hat{c}_{i,I} \right] = \mathrm{E}\left[ c_{i,k} \times \hat{f}_k \times \cdots \times \hat{f}_{I-1} \right] = \mathrm{E}\left[ c_{i,I} \right]\)
Multiplying the losses reported to date by the \(\hat{f}_k\)'s gives us an unbiased estimate of the future losses (see the sketch after this list)
See proof in Mack appendix C
To use volume weighted average LDF
- Based on assumption 3 (prop. 4.3)
To calculate the confidence interval
- Based on assumption 3 (prop. 4.3)
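A minimal sketch of the projection \(\hat{c}_{i,k+1} = c_{i,k} \times \hat{f}_k\), continuing the hypothetical triangle above and using volume weighted LDFs:

```python
# Volume weighted LDFs, then apply f_hat recursively to square the triangle.
f_hat = np.empty(I - 1)
for k in range(I - 1):
    obs = ~np.isnan(c[:, k + 1])
    f_hat[k] = c[obs, k + 1].sum() / c[obs, k].sum()

proj = c.copy()
for k in range(I - 1):
    future = np.isnan(proj[:, k + 1])
    proj[future, k + 1] = proj[future, k] * f_hat[k]  # c_hat = c * f_hat

latest = np.array([row[~np.isnan(row)][-1] for row in c])  # latest diagonal
print("ultimates:", np.round(proj[:, -1], 0))
print("reserves: ", np.round(proj[:, -1] - latest, 0))
```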
4.1.2 Proof for Assumption 3
To estimate \(f_k\) we can weight the historical LDFs in many different ways; in general terms:
\[\begin{equation} \hat{f}_k = \sum_i \left( \dfrac{c_{i,k+1}}{c_{i,k}} \right) \times w_{i,k} \:\:\:\: : \:\:\:\: \sum_i w_{i,k} = 1 \tag{4.1} \end{equation}\]
Remark. We assume each \(\left( \dfrac{c_{i,k+1}}{c_{i,k}} \right)\) is an unbiased estimate of \(f_k\) (see proof below)
\(\therefore\) Any weighting of them is an unbiased estimate of \(f_k\)
Remark.
Proof that each individual LDF is unbiased:
\[\mathrm{E}\left[ \dfrac{c_{i,k+1}}{c_{i,k}} \right] = \mathrm{E}\left[ \mathrm{E}\left( \dfrac{c_{i,k+1}}{c_{i,k}} \mid c_{i,1},...,c_{i,k} \right) \right] = \mathrm{E}\left[ \dfrac{\mathrm{E}\left( c_{i,k+1} \mid c_{i,1},...,c_{i,k} \right)}{c_{i,k}} \right] = \mathrm{E}\left[ \dfrac{c_{i,k} \times f_k}{c_{i,k}} \right] = f_k\]
1st equality: because of the iterative rule \(\mathrm{E}[X] = \mathrm{E}[\mathrm{E}(X \mid Y)]\)
2nd equality: because \(c_{i,k}\) is scalar since it's given (conditioned on)
3rd equality: from assumption 1 (prop. 4.1)
- 4th equality: because \(c_{i,k}\) is scalar and cancels, leaving \(f_k\)
Based on the theory of point estimation, among several unbiased estimators, preference should be given to the one with the smallest variance
The weights that minimize the variance are inversely proportional to the variance of the item we are weighting, i.e.:
We want weights \(w_{i,k}\) for each \(i\) on \(\left( \dfrac{c_{i,k+1}}{c_{i,k}} \right)\)
The weight we apply to each \(\left( \dfrac{c_{i,k+1}}{c_{i,k}} \right)\) varies based on the variance of \(\left( \dfrac{c_{i,k+1}}{c_{i,k}} \right)\)
And if \(\left( \dfrac{c_{i,k+1}}{c_{i,k}} \right)\) has high variance its weight will be lower to minimize the total variance of our estimate of \(\hat{f}_k = \left( \dfrac{c_{1,k+1}}{c_{1,k}} \right) \times w_{1,k} + \cdots + \left( \dfrac{c_{I,k+1}}{c_{I,k}} \right) \times w_{I,k}\)
Mack appendix B has a proof of this
High variance estimate get lower weight:
\[\dfrac{1}{w_{i,k}} \propto \mathrm{Var}\left( \dfrac{c_{i,k+1}}{c_{i,k}} \right)\]
Since \(c_{i,k}\) is known, we can pull it out of the variance term
\[\dfrac{1}{w_{i,k}} \propto \dfrac{\mathrm{Var}(c_{i,k+1})}{c_{i,k}^2}\]
and we get
\[\begin{equation} w_{i,k} \times \mathrm{Var}(c_{i,k+1}) \propto c_{i,k}^2 \tag{4.2} \end{equation}\]
Recall the weight for the volume weighted average is:
\[\begin{align} w_{i,k} &= \dfrac{c_{i,k}}{\sum \limits_{j} c_{j,k}}\\ w_{i,k} &\propto c_{i,k}\\ \end{align}\]
Applying the above to equation (4.2) we get:
\[\begin{align} c_{i,k} \times \mathrm{Var}(c_{i,k+1}) &\propto c_{i,k}^2 \\ \mathrm{Var}(c_{i,k+1}) &\propto c_{i,k} \\ \mathrm{Var}(c_{i,k+1}) &= \alpha^2_k \times c_{i,k} \\ \end{align}\]
And we have chainladder assumption 3 (Prop. 4.3)
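As an illustrative (not rigorous) check of this result, a small simulation sketch under assumption 3, with all parameter values made up: when \(\mathrm{Var}(c_{i,k+1}) = \alpha^2_k \times c_{i,k}\), the volume weighted average of the individual LDFs shows lower variance than the simple average:

```python
import numpy as np

rng = np.random.default_rng(0)
f_true, alpha, n_sims = 1.8, 5.0, 100_000
c_k = np.array([1000., 2000., 4000.])  # three AYs at age k (hypothetical)

# Simulate c_{i,k+1} with mean f * c and variance alpha^2 * c (assumption 3)
c_next = f_true * c_k + rng.normal(0.0, alpha * np.sqrt(c_k),
                                   size=(n_sims, c_k.size))
ldfs = c_next / c_k                       # individual LDFs per simulation
simple = ldfs.mean(axis=1)                # equal weights
volume = c_next.sum(axis=1) / c_k.sum()   # weights prop. to c_{i,k}

print("Var simple:", simple.var())        # larger
print("Var volume:", volume.var())        # smaller (minimum variance)
```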
4.1.3 LDF Selections Assumptions
Recall equations (4.1) and (4.2)
| Weight \(w_{i,k}\) | Description | Variance | Residual (4.10) |
|---|---|---|---|
| 1 | Simple Average | \(\alpha_k^2 \times \mathbf{c_{i,k}^2}\) | \(\varepsilon_{i,k} = \dfrac{c_{i,k+1} - c_{i,k} \: \hat{f_k}}{\sqrt{\mathbf{c_{i,k}^2}}}\) |
| \(c_{i,k}\) | Weighted Average | \(\alpha_k^2 \times \mathbf{c_{i,k}}\) | \(\varepsilon_{i,k} = \dfrac{c_{i,k+1} - c_{i,k} \: \hat{f_k}}{\sqrt{\mathbf{c_{i,k}}}}\) |
| \(c_{i,k}^2\) | Least Squares | \(\alpha_k^2 \times \mathbf{1}\) | \(\varepsilon_{i,k} = \dfrac{c_{i,k+1} - c_{i,k} \: \hat{f_k}}{\sqrt{\mathbf{1}}}\) |
Note that for the least squares fit here we are forcing the intercept through the origin
The assumption for LS is that the variance is the same for each exposure year (see Brosius)
Use a different method for LDF selection depending on the variance assumption
The weight and the variance always multiply to \(c_{i,k}^2\), consistent with equation (4.2)
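The table maps directly to code; a sketch computing all three estimators for one age from the hypothetical triangle above, each being equation (4.1) with the weights in column 1:

```python
def ldf_estimates(c, k):
    obs = ~np.isnan(c[:, k + 1])
    prev, nxt = c[obs, k], c[obs, k + 1]
    simple = (nxt / prev).mean()                       # w = 1 (normalized)
    volume = nxt.sum() / prev.sum()                    # w prop. to c_{i,k}
    least_sq = (prev * nxt).sum() / (prev ** 2).sum()  # w prop. to c_{i,k}^2
    return simple, volume, least_sq

print(ldf_estimates(c, 0))  # three unbiased estimates of f_1
```

The least squares line through the origin minimizes \(\sum_i (c_{i,k+1} - f \, c_{i,k})^2\), giving \(\hat{f}_k = \sum_i c_{i,k} \, c_{i,k+1} / \sum_i c_{i,k}^2\), which is exactly the \(w_{i,k} \propto c_{i,k}^2\) weighting in the table.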
4.1.4 Violation of Assumptions
Correlation between one AY and another
Assumption 2 (Prop. 4.2) is violated
Not necessarily assumption 1 (Prop. 4.1)
e.g. Strong calendar year effects (i.e. faster payment, changing inflation) will lead to correlation along a diagonal
Check using calendar year test
A single \(\hat{f}_k\) is not appropriate for all years \(i\)
- Assumption 1 (Prop. 4.1) is violated
Dependence among columns
Assumption 1 (Prop. 4.1) is violated
Not necessarily assumption 2 (Prop. 4.2)
If losses in the following period are inversely correlated with the losses in the current period, then we'll have correlation between adjacent LDFs but still maintain independence of AYs
If residuals are not random around zero
Assumption 3 (Prop. 4.3) is violated
i.e. if we see any trends or changes in magnitude
Check with residual test
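A sketch of the residual check named above, continuing the hypothetical triangle and using the weighted-average residual scaling \(\sqrt{c_{i,k}}\) from the table in 4.1.3:

```python
# Weighted residuals eps = (c[i,k+1] - c[i,k] * f_hat[k]) / sqrt(c[i,k]);
# a trend or changing spread vs. c[i,k] suggests assumption 3 is violated.
pairs = []
for k in range(I - 1):
    obs = ~np.isnan(c[:, k + 1])
    f_k = c[obs, k + 1].sum() / c[obs, k].sum()
    eps = (c[obs, k + 1] - c[obs, k] * f_k) / np.sqrt(c[obs, k])
    pairs += list(zip(c[obs, k], np.round(eps, 2)))
print(pairs)  # inspect (or plot) for randomness around zero
```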
4.1.5 Strengths/Weaknesses of Chainladder
Here we are talking about the weaknesses, not the limitations due to its implicit assumptions
Weakness
Tail LDFs depend on very few observations
High variability in the reported claims in the most recent years leads to uncertainty
- \(c_{I,1} = 0\) \(\Rightarrow\) \(\hat{R}_{I,1} = 0\), which is not reasonable
Results need to be judged by someone who knows the business under consideration
Unexpected future changes can make the observations obsolete
Strength
User knows exactly how the method works and its weaknesses
Can be easily explained to non-actuaries