A Brief Derivation of Linear Regression Coefficient Estimation
Created on July 16, 2023
Written by Some author
Read time: 2 minutes
Summary: A short derivation of the least squares method for estimating the coefficients of a linear regression model.
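The coefficients $b_0$ and $b_1$ of the line $y = b_0 + b_1 x$ are chosen to minimize the sum of squared residuals
$$A(b_0, b_1) = \sum_{i=1}^n (y_i -(b_0 + b_1 x_i))^2$$
At the minimum, both partial derivatives are zero:
$$\frac{\partial}{\partial b_0}A(b_0 , b_1) = -\sum_{i=1}^n2(y_i -(b_0 + b_1 x_i))= 0$$
$$\frac{\partial}{\partial b_1}A(b_0 , b_1) = -\sum_{i=1}^n2x_i(y_i -(b_0 + b_1 x_i)) = 0$$
Dividing by $-2$ and rearranging gives two linear equations in $b_0$ and $b_1$:
$$ \sum_{i=1}^n y_i = nb_0+ \left(\sum_{i=1}^n x_i\right)b_1 $$
$$\sum_{i=1}^nx_i y_i = \left(\sum_{i=1}^nx_i\right)b_0 + \left(\sum_{i=1}^nx_i^2\right)b_1 $$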
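As a quick numerical illustration, the two equations for $\sum y_i$ and $\sum x_i y_i$ can be treated as a 2x2 linear system in $b_0$ and $b_1$ and solved directly. The sketch below does this with Cramer's rule; the function name `solve_normal_equations` and the sample data are assumptions made up for illustration, not part of the derivation.

```python
def solve_normal_equations(xs, ys):
    """Solve the two linear equations for (b0, b1) using Cramer's rule."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # The system being solved:
    #   n * b0     + sum_x * b1  = sum_y
    #   sum_x * b0 + sum_xx * b1 = sum_xy
    det = n * sum_xx - sum_x * sum_x
    if det == 0:
        raise ValueError("all x values are identical, so the slope is undefined")

    b0 = (sum_y * sum_xx - sum_x * sum_xy) / det
    b1 = (n * sum_xy - sum_x * sum_y) / det
    return b0, b1


# Hypothetical sample data, chosen only to make the arithmetic easy to follow.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = solve_normal_equations(xs, ys)
print(b0, b1)  # approximately 2.2 and 0.6
```

For this data set the sums are $n = 5$, $\sum x_i = 15$, $\sum y_i = 20$, $\sum x_i^2 = 55$ and $\sum x_i y_i = 66$, so the result can also be checked by hand.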
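To eliminate $b_0$, multiply the equation for $\sum_{i=1}^n y_i$ by $\sum_{i=1}^n x_i$ and the equation for $\sum_{i=1}^n x_i y_i$ by $n$:
$$ \left(\sum_{i=1}^nx_i\right)\left(\sum_{i=1}^n y_i\right) = \left(n\sum_{i=1}^nx_i\right)b_0+ \left(\sum_{i=1}^n x_i\right)^2b_1 $$
$$n\sum_{i=1}^nx_i y_i = \left(n\sum_{i=1}^nx_i\right)b_0 + \left(n\sum_{i=1}^nx_i^2\right)b_1 $$
Subtracting the second of these from the first cancels the $b_0$ terms:
$$\left(\sum_{i=1}^nx_i\right)\left(\sum_{i=1}^n y_i\right)- n\sum_{i=1}^nx_i y_i = \left(\left(\sum_{i=1}^n x_i\right)^2 - \left(n\sum_{i=1}^nx_i^2\right)\right)b_1$$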
Recall that Pearson's $r$, with $s_x$ and $s_y$ the sample standard deviations of the $x_i$ and $y_i$, is defined as
$$r = \frac{1}{(n-1)s_x s_y} \sum_{i=1}^n (x_i-\bar{x})(y_i-\bar{y}) $$
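Expanding the product inside the sum gives
$$r = \frac{1}{(n-1)s_x s_y} \sum_{i=1}^n\left(x_i y_i-x_i \bar{y}-\bar{x}y_i +\bar{x}\bar{y}\right)$$
and writing the means as $\bar{x} = \frac{1}{n}\sum_{j=1}^n x_j$ and $\bar{y} = \frac{1}{n}\sum_{j=1}^n y_j$,
$$r = \frac{1}{(n-1)s_x s_y} \sum_{i=1}^n\left(x_i y_i-\frac{1}{n}x_i \left(\sum_{j=1}^n y_j\right)-\frac{1}{n}\left(\sum_{j=1}^n x_j\right)y_i +\frac{1}{n^2}\left(\sum_{j=1}^n x_j\right)\left(\sum_{j=1}^n y_j\right)\right)$$
Summing over $i$, the last three terms combine into a single product of sums:
$$r = \frac{1}{(n-1)s_x s_y}\left(\sum_{i=1}^n x_i y_i - \frac{1}{n}\left(\sum_{i=1}^n x_i\right)\left(\sum_{i=1}^n y_i\right)\right)$$
Multiplying both sides by $-n(n-1)s_x s_y$ gives
$$-n(n-1)s_x s_y r= \left(\sum_{i=1}^n x_i\right)\left(\sum_{i=1}^n y_i\right)-n\sum_{i=1}^nx_i y_i $$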
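The identity relating $r$ to the raw sums is easy to get wrong by a sign or a factor of $n$, so a numerical spot check can be reassuring. The following is a minimal sketch using only the Python standard library; the helper `pearson_r` and the sample data are assumptions introduced for illustration.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson's r computed directly from its definition (sample standard deviations)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / ((n - 1) * stdev(xs) * stdev(ys))


# Hypothetical sample data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(xs)

# Left-hand side: -n (n-1) s_x s_y r
lhs = -n * (n - 1) * stdev(xs) * stdev(ys) * pearson_r(xs, ys)
# Right-hand side: (sum x)(sum y) - n * sum(x y)
rhs = sum(xs) * sum(ys) - n * sum(x * y for x, y in zip(xs, ys))

# The two sides should agree up to floating-point rounding.
print(lhs, rhs)
```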
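The quantity $\left(\sum_{i=1}^n x_i\right)\left(\sum_{i=1}^n y_i\right)-n\sum_{i=1}^n x_i y_i$ is also the left-hand side of the equation obtained after eliminating $b_0$, so
$$-n(n-1)s_xs_yr = \left(\left(\sum_{i=1}^n x_i\right)^2 - \left(n\sum_{i=1}^nx_i^2\right)\right)b_1$$
Dividing both sides by $-n$ gives
$$(n-1)s_xs_yr = \left( \sum_{i=1}^nx_i^2 - \frac{1}{n}\left(\sum_{i=1}^n x_i\right)^2\right)b_1$$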
We know that the sample variance is
$$s_x^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x}) ^2$$
so
$$(n-1)s_x^2 = \sum_{i=1}^n (x_i - \bar{x}) ^2=\sum_{i=1}^n\left(x_i^2 -2\bar{x}x_i +\bar{x}^2\right) = \sum_{i=1}^nx_i^2-\frac{1}{n}\left(\sum_{i=1}^nx_i\right)^2$$
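The variance identity can be checked numerically in the same spirit. This is a small sketch using `statistics.variance` from the Python standard library, which uses the $n-1$ denominator; the sample values are made up for illustration.

```python
from statistics import variance

# Hypothetical sample data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
n = len(xs)

# Left-hand side: (n-1) * s_x^2, with s_x^2 the sample variance.
lhs = (n - 1) * variance(xs)
# Right-hand side: sum(x^2) - (1/n) * (sum x)^2
rhs = sum(x * x for x in xs) - sum(xs) ** 2 / n

# Both sides should agree up to floating-point rounding.
print(lhs, rhs)
```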
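Replacing the factor that multiplies $b_1$ with $(n-1)s_x^2$ gives
$$(n-1)s_xs_yr = (n-1)s_x^2b_1$$
Dividing both sides by $(n-1)s_x^2$, which is nonzero as long as the $x_i$ are not all equal, gives the slope:
$$b_1 = \frac{s_y}{s_x}r$$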
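For the intercept, return to the equation obtained from the partial derivative with respect to $b_0$:
$$ \sum_{i=1}^n y_i = nb_0+ \left(\sum_{i=1}^n x_i\right)b_1 $$
Dividing by $n$ and solving for $b_0$ gives
$$b_0 = \bar{y} - \bar{x}b_1$$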
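To see the final formulas at work, the sketch below computes $b_1 = \frac{s_y}{s_x}r$ and $b_0 = \bar{y} - \bar{x}b_1$ and then confirms that both partial derivatives of $A$ vanish at that point. The function name `fit_line` and the data are assumptions for illustration; the code needs at least two points, and the $x_i$ and $y_i$ must each not be all equal (otherwise a standard deviation is zero).

```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Return (b0, b1) using b1 = r * s_y / s_x and b0 = y_bar - x_bar * b1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n-1 denominator)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = cross / ((n - 1) * s_x * s_y)
    b1 = r * s_y / s_x
    b0 = y_bar - x_bar * b1
    return b0, b1


# Hypothetical sample data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = fit_line(xs, ys)

# Partial derivatives of A at (b0, b1); both should be zero up to rounding.
grad_b0 = -2 * sum(y - (b0 + b1 * x) for x, y in zip(xs, ys))
grad_b1 = -2 * sum(x * (y - (b0 + b1 * x)) for x, y in zip(xs, ys))

print(b0, b1)            # approximately 2.2 and 0.6
print(grad_b0, grad_b1)  # both approximately 0
```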