Learning Regression 1-3: Gradient Descent

2022-11-08 13:33:30

Gradient Descent

As an example, take the function g(x) = (x − 1)², whose derivative is

\frac{d}{dx}g(x) = 2x - 2

Setting the derivative equal to zero gives the extremum point. Whether that point is a maximum or a minimum is determined by how the function increases and decreases on either side of it, that is, by the sign of the derivative on each interval. We usually draw up an increase/decrease table for this.

Range of x | Sign of the derivative | To reach the minimum, x should
x < 1      | −                      | increase
x > 1      | +                      | decrease

Moving x step by step against the sign of the derivative gives the gradient descent update rule, where η ("eta") is the learning rate, a small positive number:

x := x - \eta \frac{d}{dx}g(x)

Starting from x = 3 with learning rate η = 1, the updates jump back and forth:

\begin{split} x&:=3-1(2\times 3 - 2) = -1 \\ x&:=-1-1(2\times(-1)-2) = 3 \\ x&:=3 - 1(2\times 3-2) = -1 \\ &\cdots \end{split}

With the smaller learning rate η = 0.1, the updates approach the minimum steadily:

\begin{split} x&:=3-0.1\times(2\times 3 - 2) = 2.6 \\ x&:=2.6-0.1\times(2\times 2.6-2) = 2.28 \\ x&:=2.28-0.1\times(2\times 2.28-2) = 2.024 \\ &\cdots \end{split}

With η = 1 the step overshoots the minimum and x oscillates between 3 and −1, never converging. With η = 0.1 the movement is fast at first and then becomes very slow as x approaches the minimum at x = 1.
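The two runs above can be reproduced with a short script (a minimal sketch; the function g(x) = (x − 1)² and the starting point x = 3 follow the example above, and the helper names are my own):

```python
def grad(x):
    """Derivative of g(x) = (x - 1)**2, i.e. g'(x) = 2x - 2."""
    return 2 * x - 2

def descend(x, eta, steps):
    """Run `steps` iterations of x := x - eta * g'(x) and return the trace."""
    trace = [x]
    for _ in range(steps):
        x = x - eta * grad(x)
        trace.append(x)
    return trace

# Large learning rate: oscillates between 3 and -1 forever.
print([round(v, 3) for v in descend(3.0, eta=1.0, steps=4)])
# → [3.0, -1.0, 3.0, -1.0, 3.0]

# Small learning rate: steady progress toward the minimum at x = 1.
print([round(v, 3) for v in descend(3.0, eta=0.1, steps=4)])
# → [3.0, 2.6, 2.28, 2.024, 1.819]
```
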

Now let's return to our objective function.

E(\theta) = \frac{1}{2}\sum_{i=1}^{n}\left(y^{(i)} - f_{\theta}(x^{(i)})\right)^2
\begin{split} \theta_0 &:= \theta_0 - \eta\frac{\partial E}{\partial \theta_0} \\ \theta_1 &:= \theta_1 - \eta\frac{\partial E}{\partial \theta_1} \end{split}
By the chain rule,

\frac{\partial E}{\partial \theta_0} = \frac{\partial E}{\partial f_{\theta}(x)}\frac{\partial f_{\theta}(x)}{\partial \theta_0}

Differentiating one term of E with respect to f_{\theta}(x^{(i)}) (note the factor of −1 from the inner function):

\begin{split} \frac{\partial E}{\partial f_{\theta}(x^{(i)})} &= \frac{\partial}{\partial f_{\theta}(x^{(i)})} \frac{1}{2}\left(y^{(i)}-f_{\theta}(x^{(i)})\right)^2 \\ &= -\left(y^{(i)} - f_{\theta}(x^{(i)})\right) \\ &= f_{\theta}(x^{(i)}) - y^{(i)} \end{split}

Summing over all the terms:

\frac{\partial E}{\partial f_{\theta}(x)} = \sum_{i=1}^{n}\left(f_{\theta}(x^{(i)}) - y^{(i)}\right)

Now that we have

\frac{\partial E}{\partial f_{\theta}(x)}

the remaining factor

\frac{\partial f_{\theta}(x)}{\partial \theta_0}

is easy to compute.

\frac{\partial f_{\theta}(x)}{\partial \theta_0} = \frac{\partial}{\partial \theta_0}(\theta_0 + \theta_1 x) = 1

\begin{split}\frac{\partial E}{\partial \theta_0} &= \frac{\partial E}{\partial f_{\theta}(x)}\frac{\partial f_{\theta}(x)}{\partial\theta_0} \\ &= \sum_{i=1}^{n}\left(f_{\theta}(x^{(i)}) - y^{(i)}\right) \times 1 \\ &= \sum_{i=1}^{n}\left(f_{\theta}(x^{(i)}) - y^{(i)}\right) \end{split}

Similarly,

\frac{\partial E}{\partial \theta_1} = \frac{\partial E}{\partial f_{\theta}(x)}\frac{\partial f_{\theta}(x)}{\partial\theta_1}

The factor

\frac{\partial E}{\partial f_{\theta}(x)}

is the same as before, so we only need to compute the derivative

\frac{\partial f_{\theta}(x)}{\partial \theta_1}

\frac{\partial f_{\theta}(x)}{\partial \theta_1} = \frac{\partial}{\partial \theta_1}(\theta_0 + \theta_1 x) = x

\begin{split}\frac{\partial E}{\partial \theta_1} &= \frac{\partial E}{\partial f_{\theta}(x)}\frac{\partial f_{\theta}(x)}{\partial\theta_1} \\ &= \sum_{i=1}^{n}\left(f_{\theta}(x^{(i)}) - y^{(i)}\right) \times x^{(i)} \\ &= \sum_{i=1}^{n}\left(f_{\theta}(x^{(i)}) - y^{(i)}\right)x^{(i)} \end{split}
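As a sanity check, the two closed-form partial derivatives can be compared against central finite differences of E on a small data set (the data values, the evaluation point, and the function names here are illustrative assumptions, not from the original):

```python
# Toy data (arbitrary values chosen for illustration).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 3.9, 6.1]

def f(theta0, theta1, x):
    """The linear model f_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

def E(theta0, theta1):
    """Objective: E = (1/2) * sum (y - f(x))**2."""
    return 0.5 * sum((y - f(theta0, theta1, x)) ** 2 for x, y in zip(xs, ys))

def grads(theta0, theta1):
    """Closed-form partials derived above: sum(f - y) and sum((f - y) * x)."""
    d0 = sum(f(theta0, theta1, x) - y for x, y in zip(xs, ys))
    d1 = sum((f(theta0, theta1, x) - y) * x for x, y in zip(xs, ys))
    return d0, d1

# Central finite differences at an arbitrary point (0.5, 0.5).
h = 1e-6
t0, t1 = 0.5, 0.5
num0 = (E(t0 + h, t1) - E(t0 - h, t1)) / (2 * h)
num1 = (E(t0, t1 + h) - E(t0, t1 - h)) / (2 * h)
d0, d1 = grads(t0, t1)
print(abs(num0 - d0) < 1e-4, abs(num1 - d1) < 1e-4)  # → True True
```
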

Putting the above calculations together, the gradient descent update expressions are:

\begin{split} \theta_0 &:= \theta_0 - \eta\sum_{i=1}^{n}\left(f_{\theta}(x^{(i)}) - y^{(i)}\right) \\ \theta_1 &:= \theta_1 - \eta\sum_{i=1}^{n}\left(f_{\theta}(x^{(i)}) - y^{(i)}\right)x^{(i)} \end{split}
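These update rules translate directly into a training loop. Below is a minimal sketch; the learning rate, iteration count, and the toy data (points lying exactly on y = 1 + 2x) are illustrative assumptions:

```python
# Toy data lying exactly on the line y = 1 + 2x (illustrative values).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

def fit(xs, ys, eta=0.01, iters=5000):
    """Gradient descent on E = (1/2) * sum (y - (theta0 + theta1*x))**2."""
    theta0, theta1 = 0.0, 0.0
    for _ in range(iters):
        # Residuals f_theta(x) - y for the current parameters.
        errs = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Gradients derived above: sum(errs) and sum(errs * x).
        g0 = sum(errs)
        g1 = sum(e * x for e, x in zip(errs, xs))
        # Update both parameters simultaneously.
        theta0, theta1 = theta0 - eta * g0, theta1 - eta * g1
    return theta0, theta1

theta0, theta1 = fit(xs, ys)
print(round(theta0, 3), round(theta1, 3))  # converges close to 1.0 and 2.0
```

Note that both parameters are updated simultaneously from the same residuals; updating θ₀ first and then reusing it to update θ₁ would compute the second gradient at the wrong point.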

References:

《白话机器学习的数学》
