CAS Exam 7 Study Notes

1.3 Bayesian Credibility

We can’t apply Bayes’ theorem as in the previous section if we don’t know the underlying distribution

Proposition 1.4 Estimate the ultimate losses using the best linear estimator of Y|X, \(L(x)\)

\[L(x) = (x - \mathrm{E[X]})\dfrac{Cov(X,Y)}{Var(X)} + \mathrm{E[Y]}\]

Y = Ultimate Losses; X = Reported Losses
Use this when we don’t know the distribution of the random variable
\(L(x) = Q(x)\) when \(Q(x)\) is linear

Remark. This is like the Bühlmann method, where \(L\) is a linear function that minimizes \(\mathrm{E}_X\left[\left(Q(X) - L(X)\right)^2\right]\)

If \(L(x) = a + b x\) then we minimize \(\mathrm{E}_X\left[\left(Q(X) - a - bX \right)^2\right]\)
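
As a sketch of why this minimization leads to the formula in proposition 1.4, assume \(Q(X) = \mathrm{E}[Y \mid X]\) (the Bayesian estimate), so that \(\mathrm{E}[Q(X)] = \mathrm{E}[Y]\) and \(\mathrm{E}[XQ(X)] = \mathrm{E}[XY]\). Setting the partial derivatives with respect to \(a\) and \(b\) to zero gives:

\(\begin{align} \mathrm{E}[Y] - a - b\,\mathrm{E}[X] = 0 &\Rightarrow a = \mathrm{E}[Y] - b\,\mathrm{E}[X] \\ \mathrm{E}[XY] - a\,\mathrm{E}[X] - b\,\mathrm{E}[X^2] = 0 &\Rightarrow b = \dfrac{\mathrm{E}[XY] - \mathrm{E}[X]\mathrm{E}[Y]}{\mathrm{E}[X^2] - \mathrm{E}[X]^2} = \dfrac{\mathrm{Cov}(X,Y)}{\mathrm{Var}(X)} \\ \end{align}\)

Substituting back, \(a + bx = (x - \mathrm{E}[X])\dfrac{\mathrm{Cov}(X,Y)}{\mathrm{Var}(X)} + \mathrm{E}[Y]\)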

Table 1.2: Intuitive interpretation of \(L(x)\)
Scenarios	Implications	Interpretation
\(x = \mathrm{E[X]}\)	\(L(x) = \mathrm{E[Y]}\)	Losses are coming in as expected, estimate of ultimate losses is unchanged
\(\mathrm{Cov(X,Y)} \approx 0\)	\(L(x) \cong \mathrm{E[Y]}\)	\(X\) and \(Y\) are only loosely related \(\Rightarrow\) Use ELR Method
\(\mathrm{Cov(X,Y)} \ll \mathrm{Var(X)}\)	\(L(x) \cong \mathrm{E[Y]}\)	\(X\) and \(Y\) don’t vary together
\(\mathrm{Cov(X,Y)} \approx \mathrm{Var(X)}\)¹	\(L(x) \approx x + \left[\mathrm{E}[Y] - \mathrm{E}[X] \right]\)	Same as BF Method
\(\mathrm{Cov(X,Y)} \gg \mathrm{Var(X)}\)²	\(X\) and \(Y\) move together, \(Y\) is significantly influenced by \(X\)	Use Dev Method

¹ Also means a change in the reported amount should not affect the IBNR
² Also means a greater than expected reported amount should lead to an increase in IBNR
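
To make the scenarios in Table 1.2 concrete, here is a minimal Python sketch of proposition 1.4. The moment values (`MEAN_X`, `MEAN_Y`, `VAR_X`) and the helper `ibnr` are hypothetical and chosen only for illustration; IBNR is taken as \(L(x) - x\).

```python
def best_linear_estimate(x, mean_x, mean_y, cov_xy, var_x):
    """L(x) = (x - E[X]) * Cov(X,Y) / Var(X) + E[Y] (proposition 1.4)."""
    return (x - mean_x) * cov_xy / var_x + mean_y


# Hypothetical moments: expected reported 600, expected ultimate 1000.
MEAN_X, MEAN_Y, VAR_X = 600.0, 1000.0, 10_000.0


def ibnr(x, cov_xy):
    """Implied IBNR: estimated ultimate minus reported to date."""
    return best_linear_estimate(x, MEAN_X, MEAN_Y, cov_xy, VAR_X) - x


# Compare IBNR when reported losses are as expected (600) vs. higher (700).
for label, cov_xy in [("Cov = 0", 0.0),
                      ("Cov = Var(X)", VAR_X),
                      ("Cov = 2 Var(X)", 2 * VAR_X)]:
    print(label, ibnr(600.0, cov_xy), ibnr(700.0, cov_xy))
```

With zero covariance the ultimate estimate stays at \(\mathrm{E}[Y]\), so IBNR falls as reported losses rise; with \(\mathrm{Cov} = \mathrm{Var}\) the IBNR is unchanged (BF behaviour); with \(\mathrm{Cov} > \mathrm{Var}\) the IBNR increases (development behaviour).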

1.3.1 Practical Application (LS Development)

Proposition 1.5 (Development Formula 1) Estimate \(\mathrm{E}[X]\), \(\mathrm{Var}(X)\), and \(\mathrm{Cov}(X,Y)\) from data (i.e. a series of past years), assuming all years share a common distribution of \(Y\) and \(X\)

This \(L(x)\) is the same as the least-squares estimate in proposition 1.1

Proof. Start with \(y = a + bx\) and plug in \(a\) and \(b\) from proposition 1.1

We get:

\(\begin{align} y &= (\bar{y} - b\bar{x}) + bx \\ &= \bar{y} + b \left(x - \bar{x}\right) \\ &= \left(x - \bar{x}\right) \dfrac{\mathrm{Cov}(X,Y)}{\mathrm{Var}(X)} + \bar{y} \\ \end{align}\)

Which equals \(L(x)\) \(\therefore\) the least-square estimate is the best linear estimate of \(Q(x)\)

Remark. If not for sampling error, the least-squares method gives us the best linear approximation to the Bayesian estimate, regardless of the distributions of \(X\) or \(Y\)
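
The estimation step in proposition 1.5 can be sketched in plain Python as follows. The reported and ultimate figures are made-up numbers for six hypothetical past years, and the function names are illustrative only. The sketch uses sample moments with an \(n - 1\) divisor; the divisor cancels in the slope.

```python
def sample_moments(xs, ys):
    """Sample mean of X and Y, sample Var(X), and sample Cov(X, Y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return mean_x, mean_y, var_x, cov_xy


def ls_development(xs, ys):
    """Return (a, b) of the fitted line y = a + b x, with b = Cov / Var."""
    mean_x, mean_y, var_x, cov_xy = sample_moments(xs, ys)
    b = cov_xy / var_x
    a = mean_y - b * mean_x
    return a, b


# Hypothetical history: reported (x) and ultimate (y) losses by year.
reported = [520.0, 610.0, 580.0, 700.0, 640.0, 550.0]
ultimate = [900.0, 1010.0, 980.0, 1150.0, 1060.0, 930.0]

a, b = ls_development(reported, ultimate)
current_reported = 600.0
print("a =", a, "b =", b, "estimated ultimate =", a + b * current_reported)
```

The fitted line satisfies the usual least-squares normal equations (residuals sum to zero and are uncorrelated with \(x\)), which is what makes it the sample version of \(L(x)\).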

1.3.2 Credibility Form of the Dev’ Formula

An alternative way to express \(L(x)\): following Bühlmann credibility, we write \(L(x)\) in terms of:

Expected Value of the Process Variance (\(EVPV\))

\(\mathrm{E}_Y\left[\mathrm{Var}(X \mid Y )\right]\)
Variance of the Hypothetical Mean (\(VHM\))

\(\mathrm{Var}_Y(\mathrm{E}\left[X \mid Y \right])\)
Sidebar: We can read \(VHM\) as distrust in underwriters and \(EVPV\) distrust in the claims department
Use the method below when the least-squares assumption fails

(i.e. when it is not the case that year-to-year changes in loss and loss distributions are small, or can be corrected for)

Formula below requires additional hypothesis (in paper appendix?)

Proposition 1.6 (Development Formula 2) Suppose that there is a real number \(d \neq 0\) such that \(\mathrm{E}\left[X \mid Y = y \right] = dy\) for all \(y\)

\[L(x) = Z \underbrace{\dfrac{x}{d}}_{\begin{array}{c} \text{Dev'}\\ \text{Method}\\ \end{array}} + (1-Z)\underbrace{\mathrm{E[Y]}}_{ELR}\]

Formula is the credibility weighting of the chainladder estimate and ELR estimate
If \(EVPV = 0\) \(\Rightarrow\) Full weight to the chainladder
If \(VHM = 0\) \(\Rightarrow\) Full weight to the \(\mathrm{E}[Y]\)

Proof. Start with the proposition 1.4 and we set \(\dfrac{\mathrm{Cov}(X,Y)}{\mathrm{Var}(X)} = \dfrac{1}{d} \dfrac{VHM}{VHM + EVPV}\)

\(\begin{align} L(x) &= (x - \mathrm{E[X]})\dfrac{Cov(X,Y)}{Var(X)} + \mathrm{E[Y]} \\ &= (x - \mathrm{E[X]}) \left \{ \dfrac{1}{d} \dfrac{VHM}{VHM + EVPV} \right \} + \mathrm{E[Y]} \\ &= (x - \mathrm{E[X]}) \left \{ \dfrac{1}{d} Z \right \} + \mathrm{E[Y]} \\ &= Z \dfrac{x}{d} - Z \dfrac{\mathrm{E}[X]}{d} + \mathrm{E}[Y] \\ &= Z \dfrac{x}{d} - Z \mathrm{E}[Y] + \mathrm{E}[Y] \\ &= Z \dfrac{x}{d} + (1 - Z) \mathrm{E}[Y] \\ \end{align}\)

Which is what we have above in proposition 1.6
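
The identity used in the first step of the proof, and the substitution \(\mathrm{E}[X]/d = \mathrm{E}[Y]\), both follow from the hypothesis \(\mathrm{E}[X \mid Y = y] = dy\). A short sketch, using the tower property and the law of total variance:

\(\begin{align} \mathrm{E}[X] &= \mathrm{E}\left[\mathrm{E}[X \mid Y]\right] = d\,\mathrm{E}[Y] \\ \mathrm{Cov}(X,Y) &= \mathrm{E}\left[Y\,\mathrm{E}[X \mid Y]\right] - \mathrm{E}[X]\mathrm{E}[Y] = d\,\mathrm{E}[Y^2] - d\,\mathrm{E}[Y]^2 = d\,\mathrm{Var}(Y) \\ VHM &= \mathrm{Var}_Y(dY) = d^2\,\mathrm{Var}(Y) \\ \mathrm{Var}(X) &= VHM + EVPV \\ \dfrac{\mathrm{Cov}(X,Y)}{\mathrm{Var}(X)} &= \dfrac{d\,\mathrm{Var}(Y)}{VHM + EVPV} = \dfrac{1}{d} \dfrac{d^2\,\mathrm{Var}(Y)}{VHM + EVPV} = \dfrac{1}{d} \dfrac{VHM}{VHM + EVPV} \\ \end{align}\)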

Proposition 1.7 (Method for Z) Calculate Z to use with formula in proposition 1.6

\(Z = \dfrac{VHM}{VHM + EVPV} = \dfrac{\mathrm{Var_Y(E[X|Y])}}{\mathrm{Var_Y(E[X|Y])}+\mathrm{E_Y[Var(X|Y)]}}\)

\(VHM = d^2 \sigma^2_Y\)
\(EVPV = \sigma^2_d[\sigma^2_Y + \mathrm{E[Y]}^2]\)
\(d =\) % reported

Remark. We use the above \(Z\) when the underlying distribution is not stable (history is not a good predictor)

LS only works when the underlying distributions are stable

We assume the following

\(D \: {\perp\!\!\!\!\perp} \: Y\): reporting speed does not vary with the volume of claims
\(D = \dfrac{X}{Y}\)
- Here we typically assume the \(\sigma_{\frac{X}{Y}}\) does not depend on \(Y\)
Results sensitive to \(\mathrm{E[Y]}\) and \(\mathrm{E[D]}\) but not the \(\sigma\)
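
A minimal sketch of propositions 1.6 and 1.7 together, assuming \(\sigma_d\) denotes the standard deviation of \(D = X/Y\). All numeric inputs are hypothetical, and the function names are illustrative.

```python
def credibility_z(d, sigma_d, mean_y, sigma_y):
    """Z = VHM / (VHM + EVPV) using the formulas in proposition 1.7."""
    vhm = d ** 2 * sigma_y ** 2                           # VHM = d^2 * Var(Y)
    evpv = sigma_d ** 2 * (sigma_y ** 2 + mean_y ** 2)    # EVPV = Var(D) * (Var(Y) + E[Y]^2)
    return vhm / (vhm + evpv)


def development_formula_2(x, d, z, mean_y):
    """L(x) = Z * x / d + (1 - Z) * E[Y] (proposition 1.6)."""
    return z * x / d + (1 - z) * mean_y


# Hypothetical inputs: 50% reported on average, expected ultimate of 100.
z = credibility_z(d=0.5, sigma_d=0.1, mean_y=100.0, sigma_y=20.0)
estimate = development_formula_2(x=60.0, d=0.5, z=z, mean_y=100.0)
print("Z =", z, "L(x) =", estimate)
```

Setting `sigma_d` to zero drives \(EVPV\) to zero and gives full weight to the chainladder estimate \(x/d\); setting `sigma_y` to zero drives \(VHM\) to zero and gives full weight to \(\mathrm{E}[Y]\).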

Remark. Alternatively, you can use:

\[Z = \dfrac{b}{c} \tag{1.1}\]

where \(c\) is the cumulative development factor (CDF) and \(b\) is the slope from the LS fit

This yields the LS results (where we assume there’s no change in the underlying data from year to year)
Assume same \(d\) for any size of \(y\); Not necessarily true for large or small \(y\)

Proof. Start with \(\dfrac{\mathrm{Cov}(X,Y)}{\mathrm{Var}(X)} = \dfrac{1}{d} \dfrac{VHM}{VHM + EVPV}\)

Based on 1.7 we have \(Z = \dfrac{VHM}{VHM + EVPV}\)

And \(b = \dfrac{\mathrm{Cov}(X,Y)}{\mathrm{Var}(X)}\) from 1.1

Then we have \(b = \dfrac{1}{d} Z\)

Next, \(\dfrac{1}{d} = \dfrac{1}{\text{\% Reported}} = \text{Reported CDF} = c\)

Finally we have \(Z = \dfrac{b}{c}\)

Proposition 1.8 (Poisson-Binomial Special Case)

\(L(x) = x + (1-d)\mathrm{E(Y)}\)

\(Z = d\)
Same as BF
BF is optimal when claim counts follow the Poisson-Binomial model

Proof. Start with \(EVPV\) and \(VHM\) under the Poisson-Binomial case as discussed in proposition 1.2

\(EVPV = \mathrm{E}[Yd(1-d)] = \mu d(1-d)\)

\(VHM = \mathrm{Var}(Yd) = d^2\,\mathrm{Var}(Y) = \mu d^2\), since \(\mathrm{Var}(Y) = \mathrm{E}[Y] = \mu\) for a Poisson \(Y\)

Then \(Z = \dfrac{\mu d^2}{\mu d^2 + \mu d(1-d)} = d\)

And we get what we have from proposition 1.8 \(L(x) = x + (1-d)\mathrm{E(Y)}\)

Remark. Note that \(L(x) = Q(x)\) (as in proposition 1.2) since \(Q(x)\) is linear, so the best linear estimate equals the Bayesian estimate

Proposition 1.9 (Negative Bin-Binomial Special Case)

\(L(x) = \dfrac{x}{d + p(1-d)} + \dfrac{\mu p (1-d)}{d + p(1-d)}\)

\(Z = \dfrac{d}{d + p(1-d)}\)

Remark. \(VHM\) is larger here than in the Poisson-Binomial case while the \(EVPV\) is the same

\(\therefore\) \(Z\) is larger \(\Rightarrow\) Chainladder method gets more weight

Also note that since \(Q(x)\) is still linear, \(L(x) = Q(x)\)
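
The two special cases can be checked numerically with a short sketch. It assumes \(X \mid Y = y\) is Binomial\((y, d)\) in both cases, and that the negative binomial is parametrized so that \(\mathrm{Var}(Y) = \mu / p\) with \(\mu = \mathrm{E}[Y]\); that parametrization is an assumption here, chosen because it reproduces the \(Z\) in proposition 1.9. The values of `mu`, `d`, and `p` are hypothetical.

```python
def z_binomial_reporting(d, mean_y, var_y):
    """Z = VHM / (VHM + EVPV) when X | Y=y is Binomial(y, d)."""
    vhm = d ** 2 * var_y             # Var(E[X|Y]) = Var(dY)
    evpv = mean_y * d * (1 - d)      # E[Var(X|Y)] = E[Y] d (1 - d)
    return vhm / (vhm + evpv)


mu, d, p = 40.0, 0.6, 0.5

# Poisson claim counts: Var(Y) = mu.
z_poisson = z_binomial_reporting(d, mu, var_y=mu)

# Negative binomial claim counts (assumed parametrization): Var(Y) = mu / p.
z_negbin = z_binomial_reporting(d, mu, var_y=mu / p)

print("Poisson Z:", z_poisson, "vs d =", d)
print("Neg. binomial Z:", z_negbin, "vs d / (d + p(1-d)) =", d / (d + p * (1 - d)))
```

Because only \(\mathrm{Var}(Y)\) changes between the two calls, the comparison isolates the effect of the larger \(VHM\) on \(Z\).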

1.3.3 Caseload Effect

In proposition 1.6 we assumed the expected number of claims reported is \(\propto\) number of claims incurred

Not necessarily true: e.g. a claim is more likely to be reported in a timely fashion when the caseload (case reserve) is low, so we expect the development ratio \(\dfrac{\mathrm{E}[X \mid Y = y]}{y}\) to be not a constant but a decreasing function of \(y\)

When \(D\) and \(Y\) are not independent, the credibility-based development formula still works (i.e. a constant development ratio is not essential for a credibility-based development formula)

e.g. \(d\) is larger for small \(y\), since small claims are reported sooner and so settle faster
e.g. the opposite situation, where large \(y\) has a larger \(d\): for a property book, when a large weather event happens, claims are reported more quickly

Proposition 1.10 (Development Formula 3) Suppose there are real numbers \(d \neq 0\) and \(x_0\) such that \(\mathrm{E}[X \mid Y= y] = dy + x_0\) for all \(y\)

\[L(x) = Z \dfrac{x - x_0}{d} + (1-Z)\mathrm{E[Y]}\]

\(Z = \dfrac{VHM}{VHM + EVPV}\)

Remark. Assumptions:

\(\mathrm{E[X|Y=y]} = d \cdot y + x_0\)
- \(x_0\) is for the fixed salary
- \(d \neq 0\)
- Development ratio \(= d + \dfrac{x_0}{y}\)
  
  Which does decrease as \(y\) gets larger
- This gives \(\mathrm{E}[X \mid Y = 0] = x_0 > 0\)
  
  Which is reasonable given how a claims department works in real life
It is impossible to determine \(x_0\) and \(d\) in practice, but this shows that the least-squares method still makes sense when the development ratio varies with caseload
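
For illustration only, proposition 1.10 and the implied development ratio can be sketched as below. The inputs are hypothetical, and \(Z\) is passed in directly rather than estimated, in line with the caveat about determining \(x_0\) and \(d\).

```python
def development_formula_3(x, d, x0, z, mean_y):
    """L(x) = Z * (x - x0) / d + (1 - Z) * E[Y] (proposition 1.10)."""
    return z * (x - x0) / d + (1 - z) * mean_y


def development_ratio(y, d, x0):
    """E[X | Y = y] / y = d + x0 / y, which decreases in y when x0 > 0."""
    return d + x0 / y


# Hypothetical parameters: d = 0.5, x0 = 10.
for y in (50.0, 100.0, 1000.0):
    print("y =", y, "development ratio =", development_ratio(y, 0.5, 10.0))

print("L(x) =", development_formula_3(x=70.0, d=0.5, x0=10.0, z=0.4, mean_y=100.0))
```

With `x0 = 0` the function reduces to Development Formula 2, and with `z = 1` it simply inverts \(x = dy + x_0\) to recover \(y\).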