我们将 \(\displaystyle\sum^{m}_{i=1}(y^{i} - ax^{i} - b)^{2}\) 记作为 \(F\),那么有 \(F = \displaystyle\sum^{m}_{i=1}(y^{i} - ax^{i} - b)^{2}\) ,而我们的目的显然是对 \(a\) 和 \(b\) 求偏导。
我们先对 \(b\) 求偏导,过程如下:
\(\frac{\delta F}{\delta b} = \displaystyle\sum^{m}_{i=1}2(y^{i} - ax^{i} - b) * -1\)
\(\frac{\delta F}{\delta b} = -2\displaystyle\sum^{m}_{i=1}(y^{i} - ax^{i} - b)\)
\(\frac{\delta F}{\delta b} = -2(\displaystyle\sum^{m}_{i=1}y^{i} - a\displaystyle\sum^{m}_{i=1}x^{i} - \displaystyle\sum^{m}_{i=1}b)\)
\(\frac{\delta F}{\delta b} = -2(m\overline{y} - am\overline{x} - mb)\)
我们下面再对 \(a\) 求偏导,过程如下:
\(\frac{\delta F}{\delta a} = \displaystyle\sum^{m}_{i=1}2(y^{i} - ax^{i} - b) * -x^{i}\)
\(\frac{\delta F}{\delta a} = -2\displaystyle\sum^{m}_{i=1}(x^{i}y^{i} - a(x^{i} )^{2}- bx^{i})\)
\(\frac{\delta F}{\delta a} = -2(\displaystyle\sum^{m}_{i=1}x^{i}y^{i} - a\displaystyle\sum^{m}_{i=1}(x^{i})^{2} - b\displaystyle\sum^{m}_{i=1}x^{i})\)
然后令 \(\frac{\delta F}{\delta b} = -2(m\overline{y} - am\overline{x} - mb) = 0\),得到 \(\overline{y} - a\overline{x} - b = 0\),解得 \(b = \overline{y} - a\overline{x}\)
此时 \(b\) 我们就求了出来,然后再令 \(\frac{\delta F}{\delta a} = -2(\displaystyle\sum^{m}_{i=1}x^{i}y^{i} - a\displaystyle\sum^{m}_{i=1}(x^{i})^{2} - b\displaystyle\sum^{m}_{i=1}x^{i}) = 0\),得到 \(\displaystyle\sum^{m}_{i=1}x^{i}y^{i} - a\displaystyle\sum^{m}_{i=1}(x^{i})^{2} - mb\overline x = 0\)
然后将 \(b = \overline y - a\overline x\) 带进去,得到 \(\displaystyle\sum^{m}_{i=1}x^{i}y^{i} - a\displaystyle\sum^{m}_{i=1}(x^{i})^{2} - (\overline y - a\overline x)m\overline x = 0\)
\(\displaystyle\sum^{m}_{i=1}x^{i}y^{i} - a\displaystyle\sum^{m}_{i=1}(x^{i})^{2} - m\overline x\overline y + am\overline x^{2} = 0\)
\(\displaystyle\sum^{m}_{i=1}x^{i}y^{i} - m\overline x\overline y = a(\displaystyle\sum_{i=1}^{m}(x^{i})^{2} - m\overline x^{2})\)
\(a = \frac{\displaystyle\sum^{m}_{i=1}x^{i}y^{i} - m\overline x\overline y}{\displaystyle\sum_{i=1}^{m}(x^{i})^{2} - m\overline x^{2}}\)
又因为 \(\displaystyle\sum^{m}_{i=1}(x_{i} - \overline x)(y_{i} - \overline y) = \displaystyle\sum^{m}_{i=1}(x_{i}y_{i} - x_{i}\overline y - \overline {x}y_{i} + \overline x\overline y) = \displaystyle\sum^{m}_{i=1}x^{i}y^{i} - m\overline x\overline y - m\overline x\overline y + m\overline x\overline y = \displaystyle\sum^{m}_{i=1}x^{i}y^{i} - m\overline x\overline y\)
又因为 \(\displaystyle\sum_{i=1}^{m}(x^{i} - \overline x)^{2} = \displaystyle\sum_{i=1}^{m}((x^{i})^{2} - 2\overline xx^{i} + \overline x^{2}) = \displaystyle\sum_{i=1}^{m}(x^{i})^{2} - 2m\overline x^2 + m\overline x^{2} = \displaystyle\sum_{i=1}^{m}(x^{i})^{2} - m\overline x^{2}\)
所以 \(a = \frac{\displaystyle\sum^{m}_{i=1}(x_{i} - \overline x)(y_{i} - \overline y)} {\displaystyle\sum_{i=1}^{m}(x^{i} - \overline x)^{2}}\)
到此,我们便找到了最理想的 \(a\) 和 \(b\) ,其中 \(a = \frac{\displaystyle\sum^{m}_{i=1}(x_{i} - \overline x)(y_{i} - \overline y)} {\displaystyle\sum_{i=1}^{m}(x^{i} - \overline x)^{2}}\), \(b = \overline{y} - a\overline{x}\)
整个过程还是比较简单的,如果你的数学基础还没有忘记的话,主要是最后的那两步替换比较巧妙。
当然这个式子也不用刻意去记,网上一大堆,需要的时候去查找就是了,重要的是理解整个推导过程。
我们上面推导的是简单线性回归,也就是要求样本只有一个特征,但是显然真实的样本,特征肯定不止一个。如果预测的样本有多个特征的话,那么便称之为多元线性回归,我们后面会介绍。
简单线性回归的实现下面我们就使用代码来实现简单线性回归,先来看一下简单的数据集。
import numpy as np import plotly.graph_objs as go x = np.array([1, 2, 3, 4, 5]) y = np.array([1, 3, 2, 3, 5]) trace0 = go.Scatter(x=x, y=y, mode="markers", marker={"size": 10}) figure = go.Figure(data=[trace0], layout={"showlegend": False, "template": "plotly_dark"}) figure.show()