9.8 Practical Issues

Practical issues we might run into with the ODP bootstrap:

  1. Negative Incremental Values

  2. Non‐Zero Sum of Residuals

  3. Using an L-year Weighted Average

  4. Missing Values

  5. Outliers

  6. Heteroscedasticity

  7. Heteroecthesious Data

  8. Exposure Adjustment

  9. Tail Factors

  10. Fitting a Distribution to ODP Bootstrap Residuals

9.8.1 Negative Incremental Values

The GLM doesn’t work with negative incremental values because \(\ln[q(w,d)]\) is undefined for \(q(w,d) < 0\)

Need to work around this in:

  1. Model fitting (e.g. Step 0 and 1 of the Bootstrap process)

  2. Simulating for process variance with negative means (e.g. Step 4 of the Bootstrap process)

An additional workaround is needed for extreme outcomes caused by negative values

9.8.1.1 Model Fitting

Method 1: Use \(-\ln[abs\{q(w,d)\}]\) for negative values

\[\begin{equation} Cell_{w,d} = \begin{cases} \ln[q(w,d)] & \text{if } q(w,d) > 0 \\ 0 & \text{if } q(w,d) = 0 \\ -\ln[abs \{ q(w,d) \}] & \text{if } q(w,d) < 0 \\ \end{cases} \tag{9.16} \end{equation}\]

Remark. Doesn’t work when the column sums to a negative value

  • This is done when setting the design matrix (9.5)
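A minimal Python sketch of the transformation in (9.16). The function name `signed_log` and the sample column are illustrative assumptions, not part of the source.

```python
import math

def signed_log(q):
    """Transform one incremental value q(w,d) following (9.16)."""
    if q > 0:
        return math.log(q)
    if q < 0:
        # Negative cell: log of the absolute value, with the sign flipped
        return -math.log(abs(q))
    return 0.0

# Hypothetical incremental values for one development column
column = [120.0, 0.0, -15.0]
transformed = [signed_log(q) for q in column]
```

The transformed values would feed the response side of the GLM fit; the fit itself is not shown here.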

Method 2: Subtract a negative constant \(\Psi\)

\[\begin{equation} q^+(w,d) = q(w,d) - \Psi \\ \ln[q^+(w,d)] \text{ for all } Cell_{w,d} \tag{9.17} \end{equation}\]
  • Pick \(\Psi =\) largest negative value in the column

  • Apply (9.17) before solving the GLM system of equations (e.g. (9.2) and (9.4))

  • Then adjust the fitted values by adding back \(\Psi\), which reduces each fitted incremental value (since \(\Psi\) is negative)

\[\begin{equation} m_{w,d} = m^+_{w,d} + \Psi \tag{9.18} \end{equation}\]
  • Can use this method combined with method 1 to take care of the extra large negative ones

  • Need to use the absolute value in the residual and resampling formulas; modify (9.8) and (9.14) as below:

\[\begin{equation} r_{wd} = \dfrac{q(w,d)-m_{wd}}{\sqrt{abs\{m^z_{wd}\}}} \tag{9.19} \end{equation}\] \[\begin{equation} q^*(w,d) = m_{wd} + r^*_p \sqrt{abs\{m^z_{wd}\}} \tag{9.20} \end{equation}\]
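The shift in (9.17) and (9.18) and the absolute-value formulas (9.19) and (9.20) can be sketched as follows. All function names are hypothetical, the `fitted_plus` values stand in for the output of a GLM fit that is not shown, and \(abs\{m^z\}\) is computed as `abs(m) ** z` so that it stays real for negative `m`.

```python
import math

def shift_by_psi(values):
    """(9.17): subtract psi, the most negative value (0 if none are negative)."""
    psi = min(min(values), 0.0)
    return psi, [q - psi for q in values]

def add_back_psi(fitted_plus, psi):
    """(9.18): move the fitted values back to the original scale."""
    return [m + psi for m in fitted_plus]

def residual(q, m, z=1.0):
    """(9.19): Pearson residual using the absolute value of the fitted mean."""
    return (q - m) / math.sqrt(abs(m) ** z)

def resample(m, r_star, z=1.0):
    """(9.20): rebuild an incremental value from a sampled residual."""
    return m + r_star * math.sqrt(abs(m) ** z)

column = [50.0, -20.0, 10.0]          # hypothetical incremental values
psi, shifted = shift_by_psi(column)   # psi = -20.0, shifted = [70.0, 0.0, 30.0]
fitted_plus = [60.0, 5.0, 35.0]       # placeholder for the GLM fit on shifted data
fitted = add_back_psi(fitted_plus, psi)
r = residual(column[1], fitted[1])    # residual for the cell with a negative mean
```

Feeding a residual back through `resample` with the same fitted mean recovers the original incremental value, which is a quick consistency check on the two formulas.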

Method 3: Use simplified GLM

  • Use ODP bootstrap (i.e. Chainladder with volume weighted average LDFs)

  • This will yield a different estimate than the GLM framework with adjustment 1 or 2

9.8.1.2 Simulating Negative Values

From above, we might have the fitted \(m_{wd}\) that are negative, which will be an issue when used in Step 4 of the bootstrap simulation, when we need to model the process variance with \(Gamma(m_{wd},\phi m_{wd})\)

  • Since \(Gamma\) only takes positive parameters

Adjustment to the Gamma Distribution with negative \(m_{wd}\)

\[\begin{equation} Gamma(abs\{m_{wd}\}, \phi abs\{m_{wd}\}) + 2m_{wd} \tag{9.21} \end{equation}\]
  • This will maintain the right skew of Gamma while having the mean of \(m_{wd}\)

  • Alternatively if we use \(-Gamma(abs\{m_{wd}\}, \phi abs\{m_{wd}\})\) it’ll flip the curve to skew left
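A sketch of the process variance draw with the adjustment in (9.21), using only the Python standard library. It assumes \(Gamma(a, b)\) above means a Gamma with mean \(a\) and variance \(b\), so shape \(= abs\{m\}/\phi\) and scale \(= \phi\). The function name and the treatment of \(m = 0\) are assumptions.

```python
import random

def simulate_process(m, phi, rng):
    """Draw one incremental value with mean m and variance phi * |m|."""
    if m == 0:
        return 0.0              # assumption: no process variance around a zero mean
    shape = abs(m) / phi        # mean = shape * scale = |m|
    scale = phi                 # variance = shape * scale**2 = phi * |m|
    draw = rng.gammavariate(shape, scale)
    # (9.21): shift by 2m so the mean is m while the right skew is kept
    return draw + 2 * m if m < 0 else draw

rng = random.Random(12345)
draws = [simulate_process(-50.0, 2.0, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
```

With \(m = -50\) the draws are bounded below by \(2m = -100\) and their average should sit close to \(-50\).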

9.8.1.3 Extreme Outcomes from Negative Values

Columns with a negative mean in the early ages can result in very large LDFs (and lead to simulated outcomes that are 1,000 times greater than our mean)

  • A negative mean causes one column of cumulative values to sum close to 0 and the next to sum to a much larger number, resulting in an extremely large LDF and therefore extremely large projections

  • Need to address this as it’ll throw off the mean even if you don’t care about the high percentiles
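A small numeric illustration of the effect, with made-up cumulative values: when one column sums close to zero, the volume-weighted LDF to the next age explodes.

```python
# Hypothetical cumulative values for the rows shared by two adjacent ages
cum_age1 = [100.0, -60.0, -39.0]   # sums to 1.0 because of negative incrementals
cum_age2 = [400.0, 300.0, 300.0]   # sums to 1000.0

ldf = sum(cum_age2) / sum(cum_age1)   # 1000.0

# Applying that LDF to a latest-diagonal value of 50 projects 50,000
projection = 50.0 * ldf
```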

3 options to address this:

  1. Remove the extreme iterations

    Beware of understating the likelihood of extreme outcomes

  2. Recalibrate the Model

    • First need to identify the source of the negative losses

    • Review data used and parameter selection

      • e.g. remove the AYs that might not represent current behavior

      • e.g. if due to S&S then you can just model them separately and then correlate them during simulation

  3. Limit Incremental Losses to 0

    Either with the simulated mean (Step 2) or the process var step (Step 4)

    • Replace negatives with 0s

    • Can just do it in certain columns
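Option 3 can be sketched as a simple floor on a triangle of simulated incrementals. The function name, the list-of-rows layout and the `columns` argument are illustrative assumptions.

```python
def floor_at_zero(triangle, columns=None):
    """Replace negative incrementals with 0, optionally only in selected columns."""
    limited = []
    for row in triangle:
        new_row = []
        for d, value in enumerate(row):
            in_scope = columns is None or d in columns
            new_row.append(0.0 if in_scope and value < 0 else value)
        limited.append(new_row)
    return limited

simulated = [[100.0, -5.0, 3.0], [90.0, 10.0, -2.0]]
limited = floor_at_zero(simulated, columns={1})   # only the second column is floored
```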

9.8.2 Non-Zero Sum of Residuals

Residuals are supposed to be iid with mean zero and constant variance

\(\therefore\) Sum of our residuals from the triangle should be 0

  • Not necessarily the case since this is just a sample

Consequence: Simulated outcomes will be higher than the mean if the sum of residuals is positive (and vice versa)

2 options to address this:

  1. Keep it if we believe this to be characteristics of the data set

  2. Add a constant to each non-zero residual so that they sum to 0

    Then sample from the adjusted residuals
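A sketch of option 2: the same constant is added to every non-zero residual so the adjusted residuals sum to zero. The function name and sample residuals are made up for illustration.

```python
def recenter(residuals):
    """Shift the non-zero residuals by a constant so that they sum to zero."""
    nonzero = [r for r in residuals if r != 0]
    if not nonzero:
        return list(residuals)
    shift = -sum(nonzero) / len(nonzero)
    # Zero residuals are left untouched
    return [r + shift if r != 0 else 0.0 for r in residuals]

residuals = [1.2, -0.4, 0.0, 0.7, -0.3, 0.0]
adjusted = recenter(residuals)
```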

If the average of the residuals is significantly different from zero then the fit of the model should be questioned

9.8.3 Using L-year Weighted Average

Select LDFs based on the latest \(L\) years

GLM Bootstrap

  • Only use \(L+1\) diagonals of data to get \(L\) diagonals of LDFs

  • Excluded diagonals are given zero weight and we’ll have fewer CY trend parameters (if we’re using them)

  • In the simulation we’ll only sample residuals for the trapezoid used to parameterize the model

    (since that’s all we’ll need to estimate parameters)

Simplified GLM

  1. Get L-year weighted average LDFs

  2. Will only have residuals (to sample from) for the most recent L + 1 diagonals

  3. In the simulation we’ll create the entire resampled triangle

    (Since we need the cumulative losses for each row)

  4. For projection using the resampled triangle we’ll still only use the L-year average LDFs
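A sketch of step 1 for the simplified GLM: volume-weighted LDFs using only the latest \(L\) accident years that have both ages. The triangle layout (a list of rows of decreasing length) and the function name are assumptions.

```python
def l_year_ldfs(cum, L):
    """Volume-weighted age-to-age factors from the latest L accident years."""
    n = len(cum)
    ldfs = []
    for d in range(n - 1):
        # Rows that have both age d and age d + 1
        rows = [w for w in range(n) if len(cum[w]) > d + 1]
        rows = rows[-L:]                      # keep only the latest L of them
        numerator = sum(cum[w][d + 1] for w in rows)
        denominator = sum(cum[w][d] for w in rows)
        ldfs.append(numerator / denominator)
    return ldfs

# Hypothetical cumulative triangle
cum = [
    [100.0, 150.0, 165.0, 170.0],
    [110.0, 170.0, 185.0],
    [120.0, 175.0],
    [130.0],
]
ldfs = l_year_ldfs(cum, L=2)
```

With \(L = 2\) the first factor uses only the second and third accident years, i.e. \((170 + 175) / (110 + 120) = 1.5\); the oldest row is ignored for that age.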

The 2 methods will produce different results

  • GLM Bootstrap: Models the incremental losses in the trapezoid

  • Simplified GLM: Models the same losses but in relation to the cumulative losses, which include the non-modeled losses in the diagonals excluded

9.8.4 Missing Values

ODP Bootstrap:

Missing data impact:

  • LDFs

  • Fitted triangle (if missing value lies on the most recent diagonal)

  • Residuals

  • Degree of freedom

Solutions:

  • Impute from surrounding values

  • Modify LDFs to exclude missing value

Similar to the L-year weighted average:

  • Missing value will be resampled so the cumulative losses can be calculated

  • Projection from the resampled triangle will exclude the missing cell for resampled LDF selection
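A sketch of the "modify LDFs to exclude missing value" option. Missing cells are represented by `None`, which is an assumption of this example, as is the function name.

```python
def ldf_excluding_missing(cum, d):
    """Volume-weighted LDF from age d to d + 1, skipping rows with a missing cell."""
    numerator = denominator = 0.0
    for row in cum:
        if len(row) > d + 1 and row[d] is not None and row[d + 1] is not None:
            numerator += row[d + 1]
            denominator += row[d]
    return numerator / denominator

# Hypothetical cumulative triangle with one missing cell
cum = [
    [100.0, 150.0, 165.0],
    [110.0, None, 185.0],
    [120.0, 180.0],
]
ldf_0 = ldf_excluding_missing(cum, 0)   # uses the first and third rows only
ldf_1 = ldf_excluding_missing(cum, 1)   # uses the first row only
```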

GLM Bootstrap

The impact is limited; we’ll just have fewer observations

9.8.5 Outliers

Remove outliers if they are not representative of the variability of the losses, below are the options:

  • Remove the entire row (easy if it’s the 1st row of the triangle)

  • Remove the values and treat them as missing values

  • Exclude the residual from sampling but still create a sampled value in that cell

Significant number of outliers might indicate bad model fit

GLM Bootstrap

  • Pick new parameters (grouping parameters)

  • Change the error term distribution from \(z=1\)

ODP Bootstrap

  • Use L-year weighted average

  • Heteroscedasticity may exist

  • Since we don’t make a distribution assumption, a large number of outliers could mean the data is quite skewed, and it’s appropriate that this shows up in the simulation

9.8.6 Heteroscedasticity

Issue of non-constant variance

  • ODP bootstrap assumes residuals are \(iid\) with constant variance

  • With heteroscedasticity, it is no longer valid to sample the residuals from the whole triangle

GLM Bootstrap has the additional flexibility of choosing parameters to alleviate heteroscedasticity

For ODP Bootstrap: 3 ways to deal with heteroscedasticity below

  • They also work for GLM Bootstrap

9.8.6.1 Stratified Sampling

Stratified Sampling

  1. Split the triangle into groups with similar variance

  2. Only sample residuals from the same group

Cons

  • Each group may not be that large, which limits the amount of variability in the possible outcomes
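Stratified sampling can be sketched as follows: each cell draws, with replacement, only from the residuals of its own group. The group labels and residual values are made up for illustration.

```python
import random

def stratified_resample(residuals, groups, rng):
    """For each cell, draw a residual (with replacement) from that cell's group."""
    pools = {}
    for r, g in zip(residuals, groups):
        pools.setdefault(g, []).append(r)
    return [rng.choice(pools[g]) for g in groups]

residuals = [2.1, -1.8, 1.5, 0.2, -0.1, 0.3]
groups = ["early", "early", "early", "late", "late", "late"]
rng = random.Random(7)
sampled = stratified_resample(residuals, groups, rng)
```

With only three residuals per group, each cell has just three possible sampled values, which is the limitation noted above.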

9.8.6.2 Hetero-Adjustment to the Residuals

Calculate a hetero-adjustment factor to scale the residuals to the same level:

  1. Group the residuals with similar variance, then calculate the \(\sigma\) of the residuals in each group \(i\)

  2. Hetero-adjustment factor: \(h^i\)

    i.e. The largest \(\sigma\) \(\div\) each group’s \(\sigma\)

\[\begin{equation} h^i = \dfrac{\sigma \left( \bigcup_1^j r^H_{wd} \right)}{\sigma \left( \bigcup_i r^H_{wd} \right)} \:\: : \:\: \text{for each } 1 \leq i \leq j \tag{9.22} \end{equation}\]
  3. Scale up the residuals:

    Residual (9.8) \(\times\) Hat Matrix Factor (9.10) \(\times\) Hetero Factor (9.22)

    \[\begin{equation} r_{wd}^{iH} = r_{wd} \times f_{wd}^H \times h^i \tag{9.23} \end{equation}\]
    • \(h^i\) here is based on the group we draw from
  4. Need to divide the sampled residual by \(h^i\) to reflect the variability of group \(i\)

    \[\begin{equation} q^{i*}(w,d) = m_{wd} + \dfrac{r^{i*}}{h^i}\sqrt{m_{wd}} \tag{9.24} \end{equation}\]
    • \(h^i\) here is based on the group we’re simulating for
  5. Adjust the variance for the process variance step in the simulation

\[\begin{equation} Gamma(m_{wd}, \dfrac{\phi m_{wd}}{h^i}) \tag{9.25} \end{equation}\]
Remark. The hetero-adjustment factors are new parameters, so they reduce the degrees of freedom and therefore affect the scale parameter (9.13) and the degrees of freedom adjustment factor (9.9)
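A sketch of the scaling in (9.23) and the unscaling in (9.24), taking the hetero-adjustment factors as given inputs. The residuals are assumed to already include the hat matrix factor, and the function names, group labels and factor values are hypothetical.

```python
import math
import random

def scale_residuals(residuals, groups, h):
    """(9.23): multiply each residual by the factor of the group it comes from."""
    return [r * h[g] for r, g in zip(residuals, groups)]

def resample_cell(m, target_group, pool, h, rng):
    """(9.24): draw from the common pool, then divide by the target group's factor."""
    r_star = rng.choice(pool)
    return m + (r_star / h[target_group]) * math.sqrt(m)

h = {"A": 1.0, "B": 2.0}                 # hypothetical hetero-adjustment factors
residuals = [1.0, -2.0, 0.5, -0.25]      # assumed to be hat-matrix adjusted already
groups = ["A", "A", "B", "B"]
pool = scale_residuals(residuals, groups, h)   # all residuals on a common scale

rng = random.Random(3)
q_star = resample_cell(100.0, "B", pool, h, rng)
```

After scaling, the whole pool can be sampled for any cell; dividing by the target group's factor restores that group's own level of variability.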

9.8.6.3 Non-constant Scale Parameters

Adjust the dispersion factor \(\phi\) as well as the residuals (similar to above)

  1. Calculate the hetero-adjustment factor \(h^i\) from the group scale parameters \(\phi_i\) (formula (9.27) below):
\[\begin{equation} h^i = \sqrt{\dfrac{\phi}{\phi_i}} \tag{9.26} \end{equation}\]
  2. Perform steps 3 and 4 from the hetero-adjustment method above

    (i.e. Equation (9.23) and (9.24))

  3. Calculate \(\phi_i\) for each homogeneous residual group \(i\) (\(n_i\) = number of residuals in group):

\[\begin{equation} \phi_i = \dfrac{N}{N-p}\dfrac{\sum_{w,d \in \{i\}}r^2_{w,d}}{n_i} \tag{9.27} \end{equation}\]
  4. Use \(\phi_i\) for the process variance step

Remark.

  • The \(\phi_i\) here also amount to new parameters that will impact the degrees of freedom adjustment factor (9.9)

  • The hetero adjustment factor (9.26) is more theoretically sound but in practice very similar to (9.22)
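A sketch of (9.26) and (9.27). It assumes the overall scale parameter is \(\phi = \sum r^2_{w,d} / (N - p)\) and that the inputs are unscaled Pearson residuals; the function name and sample data are illustrative.

```python
import math

def group_scale_parameters(residuals, groups, p):
    """Return (phi, phi_i by group, hetero factor by group)."""
    n_total = len(residuals)
    dof_factor = n_total / (n_total - p)          # N / (N - p)
    sums, counts = {}, {}
    for r, g in zip(residuals, groups):
        sums[g] = sums.get(g, 0.0) + r * r
        counts[g] = counts.get(g, 0) + 1
    # (9.27): group scale parameter
    phi_i = {g: dof_factor * sums[g] / counts[g] for g in sums}
    # Assumed overall scale parameter: sum of squared residuals / (N - p)
    phi = dof_factor * sum(sums.values()) / n_total
    # (9.26): hetero-adjustment factor
    h = {g: math.sqrt(phi / phi_i[g]) for g in phi_i}
    return phi, phi_i, h

residuals = [2.0, -2.0, 1.0, -1.0]
groups = ["A", "A", "B", "B"]
phi, phi_i, h = group_scale_parameters(residuals, groups, p=2)
```

In this toy example group A is more volatile than the triangle as a whole, so its factor is below 1, while group B's factor is above 1.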

9.8.7 Heteroecthesious Data

ODP bootstrap requirements:

  1. Symmetrical shape (annual by annual, quarterly by quarterly, etc. triangles)

  2. Homoecthesious data (similar exposure)

Heteroecthesious = Accident years have different levels of exposure

Here we are focusing on heteroecthesious due to interim evaluation dates:

  1. Partial first development period

  2. Partial latest calendar period

9.8.7.1 Partial First Development Period

This means the entire first development period is shorter than the rest

  • e.g. Annual data evaluated as of 6/30 with 1/1-12/31 AYs

    We’ll have a triangle with development periods @6, 18, 30, 42, etc

Pearson residuals use the square root of the fitted value to make them all exposure independent (debatable…)

  • \(\therefore\) No impact to residuals

Adjustment: Scale down the most recent AY projection to the appropriate exposure period (e.g. half the exposure based on example above), we have 2 options:

  • Prorate the mean of the incremental cells for the latest AY between step 3 and 4 of the bootstrap process and then proceed to Step 4 for the process variance as usual

  • Prorate the simulated incremental cells for the latest AY after the process variance step (Step 4)
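The two options give the same mean but not the same variance. A short sketch, assuming the process variance step draws from a Gamma with mean \(m\) and variance \(\phi m\) and that \(c\) is the exposure fraction (e.g. 0.5); the function name is hypothetical.

```python
def prorated_moments(m, phi, c):
    """Return (mean, variance) of a prorated cell under each option."""
    # Option 1: prorate the mean, then simulate around the prorated mean
    before_step4 = (c * m, phi * c * m)
    # Option 2: simulate around the full mean, then prorate the simulated value
    after_step4 = (c * m, c * c * phi * m)
    return before_step4, after_step4

before_step4, after_step4 = prorated_moments(400.0, 2.0, 0.5)
```

For \(c < 1\), prorating after the process variance step produces the smaller variance, since the variance is scaled by \(c^2\) rather than \(c\).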

9.8.7.2 Partial Latest Calendar Period

This is where the latest diagonal is a partial diagonal

  • e.g. Evaluating in between the typical data evaluation dates

    Evaluation @6/30 for a 1/1-12/31 AYs and 12-24-36 triangle

  • Similar problem as partial first development period + partial data in most recent diagonal

ODP Bootstrap

Select LDF by excluding latest diagonal or prorating the latest diagonal to full year

Adjusted simulation process

  1. Calculate sampled triangle as usual (diagonal will be of full year)

  2. Calculate full year LDFs and Ultimate as usual

  3. Additional steps:

    • De-annualize the diagonal

    • Interpolate the full year LDFs to match the diagonal

    • Forecast loss

    • Scale down the latest AY similar to the partial AY adjustment

  4. No change

GLM Bootstrap

  • Should be something similar

9.8.8 Exposure Adjustment

Adjustment for when exposure changed dramatically over the years (e.g. rapid growth or run off)

ODP Bootstrap

  • Divide losses by exposure (model loss cost)

  • Need to multiply the simulated results by the exposure (after the process variance step)
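A sketch of the exposure adjustment for the ODP bootstrap: convert incremental losses to loss costs before modeling, then convert simulated results back to losses. The function names and the one-exposure-per-accident-year layout are assumptions.

```python
def to_loss_cost(incrementals, exposures):
    """Divide each accident year's incremental losses by that year's exposure."""
    return [[x / e for x in row] for row, e in zip(incrementals, exposures)]

def to_losses(loss_costs, exposures):
    """Multiply simulated loss costs back up by each year's exposure."""
    return [[x * e for x in row] for row, e in zip(loss_costs, exposures)]

incrementals = [[100.0, 50.0], [240.0]]   # hypothetical incremental triangle
exposures = [10.0, 20.0]                  # hypothetical exposure by accident year

loss_costs = to_loss_cost(incrementals, exposures)
restored = to_losses(loss_costs, exposures)
```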

GLM Bootstrap

  • Adjust losses by exposure similar to above

  • Fit to the exposure adjusted losses should be exposure weighted

    (i.e. exposure adjusted losses with higher exposure are assumed to have lower variance, see Anderson et al. (2007))

  • This will need fewer AY parameters since the exposure adjustment should capture a lot of the difference between AYs

9.8.9 Tail Factors

ODP Bootstrap

  • See CAS Tail Factor Working Party Report (2013)

  • Add tail factor to the algorithm by assuming the factor follows a distribution (other considerations such as process variance, hetero-adj can all be extended to include the tail factors)

  • Should be an extrapolation of the incremental tail factors (instead of a single tail factor to ultimate)

  • Tail factors typically have a \(\sigma\) of 50% or less of (tail factor \(-\) 1)

    (But should compare to the \(\sigma\) of the AtA factors leading up to the tail in both the actual and simulated data)
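A sketch of simulating a tail factor from an assumed distribution. The lognormal on (tail factor \(-\) 1) is purely an illustrative choice, not something the source prescribes; the selected tail of 1.05 and the 50% ratio of \(\sigma\) to (tail factor \(-\) 1) are example inputs.

```python
import math
import random

def sample_tail_factor(tail, cv, rng):
    """Draw a tail factor whose excess over 1 is lognormal (an assumed choice)."""
    excess_mean = tail - 1.0
    sigma2 = math.log(1.0 + cv * cv)              # lognormal sigma^2 from the CV
    mu = math.log(excess_mean) - sigma2 / 2.0     # so the mean excess is tail - 1
    return 1.0 + rng.lognormvariate(mu, math.sqrt(sigma2))

rng = random.Random(42)
tails = [sample_tail_factor(1.05, 0.5, rng) for _ in range(20000)]
mean_tail = sum(tails) / len(tails)
```

Modeling the excess over 1 keeps every simulated tail factor above 1; whether that is appropriate depends on the line of business.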

GLM Bootstrap

  • Continue to use the last \(\beta_d\) to estimate the tail by continuing to apply it (similarly for CY parameter)

9.8.10 Fitting a Distribution to ODP Bootstrap Residuals

Data points from triangle may not be representative of the underlying distribution

  • e.g. we can’t tell whether the most extreme observation is a 1-in-100 or a 1-in-1,000 event

Alternative is to fit a distribution to the residuals and sample from the distribution instead i.e. parametric bootstrapping
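A minimal sketch of the idea, fitting a normal distribution to the residuals by matching the sample mean and standard deviation. The normal is only an illustrative assumption (it will not reproduce skewness in the residuals), and the residual values are made up.

```python
import random
import statistics

residuals = [-1.5, -0.5, 0.0, 0.5, 1.5]   # hypothetical residuals

# Fit the assumed distribution by matching the first two moments
mu = statistics.mean(residuals)
sigma = statistics.stdev(residuals)

# Sample from the fitted distribution instead of from the observed residuals
rng = random.Random(2024)
sampled = [rng.gauss(mu, sigma) for _ in range(10)]
```

Unlike resampling the observed residuals, the fitted distribution can produce values more extreme than anything seen in the triangle.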