9.3 Bootstrap Model
Benefits of the bootstrap model:
Allows us to estimate the distribution with very little data
We don’t have to make any assumptions about the underlying distribution (non-parametric)
- The ODP part is the error distribution
ODP bootstrap models:
Incremental claims directly as the response
With the same linear predictor as Kremer (1982)
Using a GLM with log-link function and an ODP Poisson error
Where a specific form of this model is identical to the volume weighted chain ladder
Using bootstrap (sampling residuals with replacement) to estimate the distribution of point estimates
(Instead of simulating from a multivariate normal for a GLM)
9.3.1 GLM Parameters
Mean and variance for each \(q(w,d)\) in the triangle (per table 9.1)
9.3.1.1 Mean and log-mean for \(q(w,d)\)
\[\begin{equation} \mathrm{E}[q(w,d)] = m_{w,d} = \exp \left [\alpha_w + \sum_{i=2}^d \beta_i \right] \:\: : \: \: d \in [2, n] \tag{9.1} \end{equation}\] \[\begin{equation} \ln \left( \mathrm{E}[q(w,d)] \right) = \ln(m_{w,d}) = \eta_{w,d} = \alpha_w + \sum_{i=2}^d \beta_i \:\: : \: \: d \in [2, n] \tag{9.2} \end{equation}\]
Remark.
\(\alpha_w\)’s are the accident year (row) level parameters
\(\beta_d\)’s adjust for the development trends after the first development period
- We don’t use \(\beta_1\), which effectively means \(\beta_1 = 0\)
- \(\alpha_w\) and \(\beta_d\) are selected to minimize the squared difference between \(\ln(\text{actual})\) and \(\ln(\text{fitted})\)
Equivalence to Venter’s notation:
\(h(w) = e^{\alpha_w}\)
\(f(d) = e^{\sum_{i=2}^{d} \beta_i}\)
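Combining the two, the multiplicative form follows directly from (9.1):
\[ m_{w,d} = e^{\alpha_w + \sum_{i=2}^{d} \beta_i} = e^{\alpha_w} \times e^{\sum_{i=2}^{d} \beta_i} = h(w) \times f(d) \]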
9.3.1.2 Variance for \(q(w,d)\)
\[\begin{equation} \mathrm{Var}[q(w,d)] = \phi m_{wd}^z \tag{9.3} \end{equation}\]\(\phi\): Dispersion factor
Scale factor estimated as part of the fitting procedure while setting the variance proportional to the mean
Estimated from the residuals
\(z\): Error distribution
The paper focuses on \(z = 1\), the Over Dispersed Poisson (ODP)
The choice of \(z\) specifies the whole error distribution (not only the first 2 moments)
\(z\) | Distribution |
---|---|
0 | Normal |
1 | Poisson |
2 | Gamma |
3 | Inverse Gaussian |
9.3.2 Fitted Triangle
We can fit the \(\alpha\)’s and \(\beta\)’s defined above using the GLM framework, or the simplified GLM method
9.3.2.1 Parameterize with GLM Framework
Start with a \(3 \times 3\) incremental triangle
w/d | 1 | 2 | 3 |
---|---|---|---|
1 | \(q(1,1)\) | \(q(1,2)\) | \(q(1,3)\) |
2 | \(q(2,1)\) | \(q(2,2)\) | |
3 | \(q(3,1)\) |
Log transform of the triangle
w/d | 1 | 2 | 3 |
---|---|---|---|
1 | \(\ln[q(1,1)]\) | \(\ln[q(1,2)]\) | \(\ln[q(1,3)]\) |
2 | \(\ln[q(2,1)]\) | \(\ln[q(2,2)]\) | |
3 | \(\ln[q(3,1)]\) |
Create a system of equations based on equation (9.2)
\[\begin{equation} \begin{split} \ln[q(1,1)] &= 1\alpha_1 + 0\alpha_2 + 0\alpha_3 + 0\beta_2 + 0\beta_3 \\ \ln[q(2,1)] &= 0\alpha_1 + 1\alpha_2 + 0\alpha_3 + 0\beta_2 + 0\beta_3 \\ \ln[q(3,1)] &= 0\alpha_1 + 0\alpha_2 + 1\alpha_3 + 0\beta_2 + 0\beta_3 \\ \ln[q(1,2)] &= 1\alpha_1 + 0\alpha_2 + 0\alpha_3 + 1\beta_2 + 0\beta_3 \\ \ln[q(2,2)] &= 0\alpha_1 + 1\alpha_2 + 0\alpha_3 + 1\beta_2 + 0\beta_3 \\ \ln[q(1,3)] &= 1\alpha_1 + 0\alpha_2 + 0\alpha_3 + 1\beta_2 + 1\beta_3 \\ \end{split} \tag{9.4} \end{equation}\]Express the above in matrix form
\[\begin{equation} \begin{array}{ccccc} \mathbf{Y} & = & \mathbf{X} &\times & \mathbf{A} \\ & & \alpha_1 \:\:\: \alpha_2 \:\:\: \alpha_3 \:\:\: \beta_2 \:\:\: \beta_3 & &\\ \begin{bmatrix} ln[q(1,1)] \\ ln[q(2,1)] \\ ln[q(3,1)] \\ ln[q(1,2)] \\ ln[q(2,2)] \\ ln[q(1,3)] \\ \end{bmatrix} & = & \begin{bmatrix} 1 & - & - & - & - \\ - & 1 & - & - & - \\ - & - & 1 & - & - \\ 1 & - & - & 1 & - \\ - & 1 & - & 1 & - \\ 1 & - & - & 1 & 1 \\ \end{bmatrix} & \times & \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \beta_2 \\ \beta_3 \\ \end{bmatrix} \end{array} \tag{9.5} \end{equation}\]Use iteratively reweighted least squares or MLE1 to solve for the parameters in \(\mathbf{A}\) that minimize the squared difference between \(\mathbf{Y}\) and \(\mathbf{S}\), the solution matrix
\[\begin{equation} \mathbf{S} = \begin{bmatrix} ln[m_{1,1}] \\ ln[m_{2,1}] \\ ln[m_{3,1}] \\ ln[m_{1,2}] \\ ln[m_{2,2}] \\ ln[m_{1,3}] \\ \end{bmatrix} \tag{9.6} \end{equation}\]After solving the system of equations we will have:
\[\begin{equation} \begin{split} \ln[m_{1,1}] &= \eta_{1,1} &= \alpha_1 \\ \ln[m_{2,1}] &= \eta_{2,1} &= \alpha_2 \\ \ln[m_{3,1}] &= \eta_{3,1} &= \alpha_3 \\ \ln[m_{1,2}] &= \eta_{1,2} &= \alpha_1 + \beta_2\\ \ln[m_{2,2}] &= \eta_{2,2} &= \alpha_2 + \beta_2\\ \ln[m_{1,3}] &= \eta_{1,3} &= \alpha_1 + \beta_2 + \beta_3\\ \end{split} \tag{9.7} \end{equation}\]The above solution shown as a triangle below
w/d | 1 | 2 | 3 |
---|---|---|---|
1 | \(\ln[m_{1,1}]\) | \(\ln[m_{1,2}]\) | \(\ln[m_{1,3}]\) |
2 | \(\ln[m_{2,1}]\) | \(\ln[m_{2,2}]\) | |
3 | \(\ln[m_{3,1}]\) |
Exponentiate the triangle above to get our fitted (or expected) incremental results of the GLM model
w/d | 1 | 2 | 3 |
---|---|---|---|
1 | \(m_{1,1}\) | \(m_{1,2}\) | \(m_{1,3}\) |
2 | \(m_{2,1}\) | \(m_{2,2}\) | |
3 | \(m_{3,1}\) |
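As a sketch of the fitting step above, the following fits the \(3 \times 3\) system with iteratively reweighted least squares in numpy. The incremental values here are hypothetical, chosen only for illustration:

```python
import numpy as np

# Hypothetical 3x3 incremental triangle, stacked in the order of equation (9.5):
# q(1,1), q(2,1), q(3,1), q(1,2), q(2,2), q(1,3)
y = np.array([100., 110., 120., 50., 60., 25.])

# Design matrix X from (9.5); columns are alpha_1, alpha_2, alpha_3, beta_2, beta_3
X = np.array([
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 1, 1],
], dtype=float)

# Iteratively reweighted least squares for the ODP Poisson GLM with log link
eta = np.log(y)                          # starting values (all y > 0 here)
for _ in range(100):
    mu = np.exp(eta)
    w = mu                               # Poisson variance function: V(mu) = mu
    z = eta + (y - mu) / mu              # working response
    sw = np.sqrt(w)
    A, *_ = np.linalg.lstsq(X * sw[:, None], z * sw, rcond=None)
    eta_new = X @ A
    if np.max(np.abs(eta_new - eta)) < 1e-12:
        eta = eta_new
        break
    eta = eta_new

m = np.exp(eta)                          # fitted incrementals m_{w,d}
```

The fitted \(m_{w,d}\) reproduce the volume-weighted chain ladder fit described in the simplified GLM section below.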
9.3.2.2 Simplified GLM
GLM model = Chainladder w/ volume-weighted averages when:
Variance \(\propto\) Mean
\(\varepsilon(w,d) \sim\) Poisson
A parameter for each row and column (except 1st column)
Benefits:
Replace GLM fitting with much simpler calculation
LDFs are easier to explain
Still works even when there are negative incremental values
Procedure for fitting incremental triangle:
Select LDFs using volume-weighted averages
Start from the last cumulative diagonal and divide backwards by each age-to-age LDF to get the fitted cumulative triangle
Subtract adjacent columns of the fitted cumulative triangle to get the fitted incremental triangle
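The three-step procedure can be sketched as follows (the cumulative values are hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical cumulative 3x3 triangle; np.nan marks the empty cells
cum = np.array([
    [100., 150., 175.],
    [110., 170., np.nan],
    [120., np.nan, np.nan],
])
n = 3

# 1. Volume-weighted LDFs
ldf = []
for d in range(n - 1):
    rows = n - d - 1                      # rows observed at both ages d and d+1
    ldf.append(cum[:rows, d + 1].sum() / cum[:rows, d].sum())

# 2. Start from the latest diagonal and divide backwards by each LDF
fitted_cum = np.full((n, n), np.nan)
for w in range(n):
    last = n - w - 1                      # latest observed age for row w
    fitted_cum[w, last] = cum[w, last]
    for d in range(last - 1, -1, -1):
        fitted_cum[w, d] = fitted_cum[w, d + 1] / ldf[d]

# 3. Subtract adjacent columns to get the fitted incremental triangle
fitted_inc = fitted_cum.copy()
fitted_inc[:, 1:] = fitted_cum[:, 1:] - fitted_cum[:, :-1]
```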
9.3.3 Residuals
Unscaled Pearson residuals
\[\begin{equation} \begin{split} r_{w,d} & = & \dfrac{A - E}{\sqrt{\mathrm{Var}(E)}} &\\ & = & \dfrac{q(w,d) - m_{wd}}{\sqrt{m^z_{wd}}} &\\ & = & \dfrac{q(w,d) - m_{wd}}{\sqrt{m_{wd}}} & \:\:\:\: \text{Recall }z = 1\text{ for ODP Poisson}\\ \end{split} \tag{9.8} \end{equation}\]Mean and variance as defined above
Residuals for the top-right and bottom-left corner cells of the triangle are going to be 0
Because a unique parameter (\(\beta_n\) or \(\alpha_n\)) is used for each of those 2 cells
Alternatively we can use Anscombe residual
We prefer Pearson because its calculation is consistent with the scale parameter \(\phi\)
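A minimal sketch of equation (9.8), using the hypothetical actual and fitted triangles from the worked example stacked as vectors:

```python
import numpy as np

# Actual and fitted incrementals (hypothetical 3x3 example), stacked as
# q(1,1), q(2,1), q(3,1), q(1,2), q(2,2), q(1,3)
q = np.array([100., 110., 120., 50., 60., 25.])
m = np.array([98.4375, 111.5625, 120., 51.5625, 58.4375, 25.])

# Unscaled Pearson residuals with z = 1 for ODP Poisson, equation (9.8)
r = (q - m) / np.sqrt(m)
```

Note that the corner cells \(q(3,1)\) and \(q(1,3)\) come out exactly 0, since each is fit by its own parameter.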
Scaled Pearson residuals (England & Verrall)
\[\begin{equation} r^S_{w,d} = r_{w,d} \times \underbrace{\sqrt{\dfrac{N}{N-p}}}_{f^{DoF}} \tag{9.9} \end{equation}\]Degrees of freedom adjustment: effectively allows for over-dispersion of the residuals in the sampling process and adds process variance, so that we approximate a distribution of possible outcomes
Increase the variability of the pseudo triangle
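As a sketch of how the sampled residuals feed a pseudo triangle, invert (9.8) to get \(q^* = m + r^* \sqrt{m}\). The figures are hypothetical (and the residuals rounded):

```python
import numpy as np

rng = np.random.default_rng(42)

# Fitted means and their nonzero unscaled Pearson residuals from the
# hypothetical 3x3 example; the two zero-residual corner cells are excluded
m = np.array([98.4375, 111.5625, 51.5625, 58.4375])
r = np.array([0.1575, -0.1479, -0.2176, 0.2044])

# Sample residuals with replacement and invert (9.8): q* = m + r* sqrt(m)
r_star = rng.choice(r, size=m.size, replace=True)
q_star = m + r_star * np.sqrt(m)
```

Refitting the model to each pseudo triangle and redoing the projection gives the simulated distribution of point estimates.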
Standardized residuals (Pinheiro et al.)
\[\begin{equation} r^H_{w,d} = r_{w,d} \times \underbrace{\sqrt{\dfrac{1}{1-H_{i,i}}}}_{f^H_{w,d}} \tag{9.10} \end{equation}\] \[\begin{equation} \mathbf{H} = \mathbf{X}(\mathbf{X}^T\mathbf{WX})^{-1}\mathbf{X}^T\mathbf{W} \tag{9.11} \end{equation}\] \[\begin{equation} \mathbf{W} = \begin{bmatrix} m_{1,1} & 0 & \cdots & 0 \\ 0 & m_{2,1} & 0 & 0 \\ \vdots & 0 & \ddots & \vdots\\ 0 & 0 & \cdots & m_{1,n}\\ \end{bmatrix} \tag{9.12} \end{equation}\]Hat matrix adjustment factor \(f^H_{w,d}\) is based on the diagonal on the hat matrix \(\mathbf{H}\)
(The diagonal elements \(H_{i,i}\) index the cells going down each column of the triangle, from left to right)
\(\mathbf{W}\) is an \(N \times N\) diagonal matrix of the fitted means (\(6 \times 6\) for the \(3 \times 3\) triangle)
\(\mathbf{X}\) is the design matrix from (9.5)
Benefits:
\(f^H_{w,d}\) accounts for the exclusion of the zero-value residuals
- In other words, the zero-residual cells do have some variance; we just can’t estimate it, so we sample only from the remaining residuals (scaled up by \(f^H_{w,d}\)) and not from the zeros
\(f^H_{w,d}\) is an improvement on \(f^{DoF}\)
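A sketch of (9.10)–(9.12) in numpy for the \(3 \times 3\) example (fitted means are hypothetical). Note that the corner cells have \(H_{i,i} = 1\), which is exactly why their zero residuals are excluded rather than scaled:

```python
import numpy as np

# Design matrix X from (9.5) and fitted means m (hypothetical 3x3 example)
X = np.array([
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 1, 1],
], dtype=float)
m = np.array([98.4375, 111.5625, 120., 51.5625, 58.4375, 25.])

W = np.diag(m)                                    # weight matrix, equation (9.12)
H = X @ np.linalg.inv(X.T @ W @ X) @ X.T @ W      # hat matrix, equation (9.11)
h = np.diag(H)

# f^H blows up where h = 1 (the zero-residual corner cells), so mask those out
fH = np.full_like(h, np.nan)
ok = h < 1 - 1e-10
fH[ok] = np.sqrt(1.0 / (1.0 - h[ok]))
```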
9.3.4 Dispersion Factor
Dispersion factor
\[\begin{equation} \phi = \dfrac{\sum r_{wd}^2}{N-p} \tag{9.13} \end{equation}\]\[N = \dfrac{n (n+1)}{2}\]
\[p = 2n-1\]
\(N =\) # of data points (including first column, unlike Venter)
- \(N\) can be less than indicated above if the tail incremental developments are all 0’s
\(p =\) # of parameters
One for each row, one for each column minus first column
\(p\) can be less than \(2n-1\) if the later incremental values are all 0’s and therefore not needed for fitting
This calculation is similar to Clark’s \(\sigma^2\) (6.4)
Alternate method for \(\phi\)
\[\phi \approx \phi^H = \dfrac{\sum (r^H_{w,d})^2}{N}\]
- We can still use the same dispersion factor even with the scaled and standardized residuals; this just gives us another method to estimate \(\phi\)
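A quick sketch of (9.13) on the hypothetical \(3 \times 3\) example:

```python
import numpy as np

# Actual and fitted incrementals (hypothetical 3x3 example)
q = np.array([100., 110., 120., 50., 60., 25.])
m = np.array([98.4375, 111.5625, 120., 51.5625, 58.4375, 25.])
r = (q - m) / np.sqrt(m)         # unscaled Pearson residuals, (9.8)

n = 3
N = n * (n + 1) // 2             # 6 data points (full first column included)
p = 2 * n - 1                    # 5 parameters: 3 alphas + 2 betas
phi = (r ** 2).sum() / (N - p)   # dispersion factor, equation (9.13)
```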
You can also use other methods such as orthogonal decomposition or Newton-Raphson to solve for the parameters↩