24.2 Pearson’s Correlation \(\rho\)

\[\begin{equation} \rho = \dfrac{\sum_i \tilde{y}_i \tilde{z}_i}{\sqrt{\sum_i \tilde{y}_i^2 \sum_i \tilde{z}_i^2}} = \dfrac{\mathrm{E}[YZ] - \mathrm{E}[Y] \cdot \mathrm{E}[Z]}{\sigma_{y} \cdot \sigma_{z}} \tag{24.1} \end{equation}\]
  • \(\tilde{y}_i = y_i - \bar{y}\) and \(\tilde{z}_i = z_i - \bar{z}\) are the centered observations

  • \(\sigma_y^2 = \mathrm{E}[Y^2] - \mathrm{E}[Y]^2\), and likewise for \(\sigma_z^2\)
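As a quick check that the two forms in (24.1) agree, the following is a minimal sketch in plain Python. The function names `pearson_centered` and `pearson_moments` and the toy data are illustrative assumptions, not part of the text; the expectations are replaced by \(1/n\) sample averages.

```python
import math

def pearson_centered(y, z):
    """Pearson correlation from centered values (first form of Eq. 24.1)."""
    n = len(y)
    y_bar = sum(y) / n
    z_bar = sum(z) / n
    y_t = [v - y_bar for v in y]  # centered y values
    z_t = [v - z_bar for v in z]  # centered z values
    num = sum(a * b for a, b in zip(y_t, z_t))
    den = math.sqrt(sum(a * a for a in y_t) * sum(b * b for b in z_t))
    return num / den

def pearson_moments(y, z):
    """Same quantity from moments (second form), using 1/n sample averages."""
    n = len(y)
    e_y = sum(y) / n
    e_z = sum(z) / n
    e_yz = sum(a * b for a, b in zip(y, z)) / n
    sd_y = math.sqrt(sum(a * a for a in y) / n - e_y ** 2)
    sd_z = math.sqrt(sum(b * b for b in z) / n - e_z ** 2)
    return (e_yz - e_y * e_z) / (sd_y * sd_z)

# Hypothetical toy data, chosen only for illustration.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
z = [2.0, 1.0, 4.0, 3.0, 5.0]
r_centered = pearson_centered(y, z)
r_moments = pearson_moments(y, z)
print(r_centered, r_moments)  # both forms give 0.8 up to floating-point error
```

The moment form subtracts two large quantities and can lose precision on data with a large mean, so the centered form is usually the safer one to compute.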

Pearson's correlation is only appropriate when the distributions are symmetric and thin-tailed.

Caveat: Values far from the mean receive disproportionate weight, because the statistic depends on the magnitudes of \(\tilde{y}_i\) and \(\tilde{z}_i\).
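To illustrate the caveat, here is a small sketch on made-up data (the helper `pearson` and the numbers are assumptions for illustration only). Eight points lie on a perfectly decreasing line, and a single extreme point is then appended.

```python
import math

def pearson(y, z):
    """Pearson correlation computed from centered values."""
    y_bar = sum(y) / len(y)
    z_bar = sum(z) / len(z)
    y_t = [v - y_bar for v in y]
    z_t = [v - z_bar for v in z]
    num = sum(a * b for a, b in zip(y_t, z_t))
    return num / math.sqrt(sum(a * a for a in y_t) * sum(b * b for b in z_t))

# Hypothetical data: a perfectly decreasing relationship.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
z = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
r_clean = pearson(y, z)  # -1 up to rounding

# Append one point far from both means.
r_outlier = pearson(y + [100.0], z + [100.0])

print(r_clean, r_outlier)
```

In this example the single far-away point is enough to flip the sign of the correlation, even though the other eight points are perfectly negatively related.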

Properties:

  • Pearson correlation stays the same under a positive linear transformation of \(Y\) or \(Z\) (e.g., \(Y \mapsto a + bY\) with \(b > 0\))

  • A monotone transformation that is not linear can change the Pearson correlation
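Both properties can be checked numerically. The sketch below uses made-up data and a hypothetical helper `pearson`; the specific transformations (\(3Y + 10\), \(-2Y\), and \(e^{Y}\)) are chosen only as examples.

```python
import math

def pearson(y, z):
    """Pearson correlation computed from centered values."""
    y_bar = sum(y) / len(y)
    z_bar = sum(z) / len(z)
    y_t = [v - y_bar for v in y]
    z_t = [v - z_bar for v in z]
    num = sum(a * b for a, b in zip(y_t, z_t))
    return num / math.sqrt(sum(a * a for a in y_t) * sum(b * b for b in z_t))

# Hypothetical data with an exact linear relationship z = 2y + 1.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
z = [2.0 * v + 1.0 for v in y]

r_original = pearson(y, z)

# Positive linear transformation of Y: correlation is unchanged.
r_linear = pearson([3.0 * v + 10.0 for v in y], z)

# Negative slope: the magnitude is kept but the sign flips.
r_negative = pearson([-2.0 * v for v in y], z)

# Monotone but nonlinear transformation: ordering is preserved,
# yet the Pearson correlation drops below 1.
r_monotone = pearson([math.exp(v) for v in y], z)

print(r_original, r_linear, r_negative, r_monotone)
```

The exponential keeps the ranking of the \(y_i\) intact, so any change in the result comes purely from stretching the spacing between values, which is what Pearson's correlation is sensitive to.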