I have previously written up several notes on mixed linear models, and this post continues the series. It explains how to write out a likelihood function, using the normal distribution and linear regression as examples, and how to implement it in R. I hope you find it useful.
How to write out a likelihood function and implement it in R:
- the likelihood function of normally distributed data
- the likelihood function of linear regression
- computing the optimum with R's built-in functions
1. The Normal Distribution
1.1 The normal distribution function
2. Deriving the Likelihood Function of the Normal Distribution
2.1 The normal density function
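For reference, the density of a single observation $y_i$ drawn from a $N(\mu, \sigma^2)$ distribution is the standard expression (the same terms appear in the R code later on):

$$
f(y_i \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(y_i - \mu)^2}{2\sigma^2}\right)
$$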
2.2 The likelihood function as a joint density
When the n observations are mutually independent, their likelihood function (equivalent to the joint density function) is:
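In standard notation (with $f$ the density given above), this product is

$$
L(\mu, \sigma^2) = \prod_{i=1}^{n} f(y_i \mid \mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(y_i - \mu)^2}{2\sigma^2}\right)
$$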
2.3 The normal log-likelihood function
Taking the natural logarithm of both sides of the likelihood function and simplifying further gives the log-likelihood shown below.
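In its standard simplified form (the same expression that the R function normal.lik1 implements in the next section):

$$
\ell(\mu, \sigma^2) = \log L(\mu, \sigma^2) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \mu)^2
$$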
3. The Likelihood Function in R
3.1 Writing the likelihood function in R
```r
normal.lik1 <- function(theta, y) {
  mu     <- theta[1]   # mean
  sigma2 <- theta[2]   # variance
  n      <- length(y)
  logl   <- -0.5 * n * log(2 * pi) - 0.5 * n * log(sigma2) -
            (1 / (2 * sigma2)) * sum((y - mu)^2)
  return(-logl)        # negative log-likelihood, because optim() minimizes
}
```
The optimization is done iteratively with R's optim() function. Its available methods are (quoted from the R documentation for optim):
- Method “BFGS” is a quasi-Newton method (also known as a variable metric algorithm), specifically that published simultaneously in 1970 by Broyden, Fletcher, Goldfarb and Shanno. This uses function values and gradients to build up a picture of the surface to be optimized.
- Method “CG” is a conjugate gradients method based on that by Fletcher and Reeves (1964) (but with the option of Polak–Ribiere or Beale–Sorenson updates). Conjugate gradient methods will generally be more fragile than the BFGS method, but as they do not store a matrix they may be successful in much larger optimization problems.
- Method “L-BFGS-B” is that of Byrd et al. (1995) which allows box constraints, that is each variable can be given a lower and/or upper bound. The initial value must satisfy the constraints. This uses a limited-memory modification of the BFGS quasi-Newton method. If non-trivial bounds are supplied, this method will be selected, with a warning. Nocedal and Wright (1999) is a comprehensive reference for the previous three methods.
- Method “SANN” is by default a variant of simulated annealing given in Belisle (1992). Simulated annealing belongs to the class of stochastic global optimization methods. It uses only function values but is relatively slow. It will also work for non-differentiable functions. This implementation uses the Metropolis function for the acceptance probability. By default the next candidate point is generated from a Gaussian Markov kernel with scale proportional to the actual temperature. If a function to generate a new candidate point is given, method “SANN” can also be used to solve combinatorial optimization problems. Temperatures are decreased according to the logarithmic cooling schedule as given in Belisle (1992, p. 890); specifically, the temperature is set to temp / log(((t-1) %/% tmax)*tmax + exp(1)), where t is the current iteration step and temp and tmax are specifiable via control. Note that the “SANN” method depends critically on the settings of the control parameters. It is not a general-purpose method but can be very useful in getting to a good value on a very rough surface.
- Method “Brent” is for one-dimensional problems only, using optimize(). It can be useful in cases where optim() is used inside other functions where only method can be specified, such as in mle from package stats4.
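As a minimal sketch of how a method is chosen (it reuses normal.lik1 from above; y.demo is a made-up sample used only for illustration), the optimizer is selected through the method argument, and “L-BFGS-B” additionally allows a lower bound that keeps the variance positive:

```r
# Sketch: selecting an optimizer explicitly; y.demo is illustrative data, not from the post
set.seed(1)
y.demo <- rnorm(50, mean = 3, sd = 2)
optim(c(0, 1), normal.lik1, y = y.demo, method = "BFGS")       # quasi-Newton
optim(c(0, 1), normal.lik1, y = y.demo, method = "L-BFGS-B",
      lower = c(-Inf, 1e-6))                                   # keep sigma2 > 0
```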
3.2 Simulating normally distributed data
```r
mu = 10
sigma = 5
y = rnorm(100, mu, sigma)   # n = 100 observations
```
3.3 Solving iteratively with R's optim()
```r
optim(c(2, 1), normal.lik1, y = y)
```
From the output, the estimated mean is 9.7 and the estimated variance is 21.8.
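A minimal sketch of going one step further with the same fit (assuming the y simulated above): asking optim() for the Hessian gives approximate standard errors, because the Hessian of the negative log-likelihood at the optimum is the observed information matrix.

```r
# Sketch: store the fit and derive approximate standard errors
fit <- optim(c(2, 1), normal.lik1, y = y, hessian = TRUE)
fit$par                          # estimates of (mu, sigma2)
fit$value                        # minimized negative log-likelihood
sqrt(diag(solve(fit$hessian)))   # approximate standard errors
```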
3.4 Dropping the constant term from the likelihood function
```r
# Drop the constant term -0.5*n*log(2*pi); it does not depend on the parameters
normal.lik2 <- function(theta, y) {
  mu     <- theta[1]
  sigma2 <- theta[2]
  n      <- length(y)
  logl   <- -0.5 * n * log(sigma2) - (1 / (2 * sigma2)) * sum((y - mu)^2)
  return(-logl)
}

optim(c(2, 1), normal.lik2, y = y)
```
The results are the same as above.
4. The Likelihood Function of Linear Regression
Reference: http://www.solinx.co/archives/800
4.1 Setting up a model
y = ax + b
4.2 The linear regression model
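In the notation used by the R code later in this section (b0 for the intercept, b1 for the slope), the model assumed here is

$$
y_i = b_0 + b_1 x_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2),
$$

so that each $y_i \sim N(b_0 + b_1 x_i, \sigma^2)$.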
4.3 The likelihood function
4.4 The joint density function
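For n independent observations, the likelihood is the product of the individual normal densities (a standard form, using the model notation above):

$$
L(b_0, b_1, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(y_i - b_0 - b_1 x_i)^2}{2\sigma^2}\right)
$$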
4.5 Simplifying the likelihood function
The function is therefore:
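Written out as a log-likelihood (with $\sigma$ rather than $\sigma^2$, to match the lm.logl code below), this is

$$
\ell(b_0, b_1, \sigma) = -\frac{n}{2}\log(2\pi) - n\log(\sigma) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - b_0 - b_1 x_i)^2
$$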
4.6 Generating a data set
```r
set.seed(123)
data.x = rnorm(n = 100)          # predictor
b0 = 10                          # true intercept
b1 = 5                           # true slope
true.y = b0 + data.x * b1        # true regression line
noise  = rnorm(100, 0, 2)        # normal noise with sd = 2
data.y = true.y + noise
dat = data.frame(mu = 1, x = data.x, y = data.y)
```
```r
head(dat)
```
4.7 Writing the likelihood function of the regression model in R
```r
lm.logl = function(theta, y, x) {
  b0    = theta[1]    # intercept
  b1    = theta[2]    # slope
  sigma = theta[3]    # residual standard deviation
  n     = length(y)
  e     = y - b0 - b1 * x
  logl  = -0.5 * n * log(2 * pi) - n * log(sigma) - sum(e^2) / (2 * sigma^2)
  return(-logl)
}

optim(c(1, 1, 1), lm.logl, y = dat$y, x = dat$x)
```
From the output, the estimate of b0 is 9.7 and the estimate of b1 is 4.8 (the true values were b0 = 10 and b1 = 5).
5. Computing with Least Squares via the lm Function
```r
summary(lm(y ~ x, data = dat))
```
```
Call:
lm(formula = y ~ x, data = dat)

Residuals:
   Min     1Q Median     3Q    Max
-3.815 -1.367 -0.175  1.161  6.581

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   9.7944     0.1951    50.2   <2e-16 ***
x             4.8951     0.2138    22.9   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.941 on 98 degrees of freedom
Multiple R-squared:  0.8425,    Adjusted R-squared:  0.8409
F-statistic: 524.4 on 1 and 98 DF,  p-value: < 2.2e-16
```
We can see that the least-squares estimates, b0 = 9.7 and b1 = 4.8, agree with the maximum likelihood results above.
6. The Relationship Between Maximum Likelihood and Least Squares
Taking the partial derivatives of the log-likelihood above and setting them to zero gives the same result as least squares:
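A sketch of that step: differentiating $\ell(b_0, b_1, \sigma)$ with respect to $b_0$ and $b_1$ and setting the derivatives to zero eliminates $\sigma$ and leaves exactly the normal equations of ordinary least squares,

$$
\frac{\partial \ell}{\partial b_0} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(y_i - b_0 - b_1 x_i) = 0, \qquad
\frac{\partial \ell}{\partial b_1} = \frac{1}{\sigma^2}\sum_{i=1}^{n} x_i\,(y_i - b_0 - b_1 x_i) = 0,
$$

which is why the two methods give identical estimates of $b_0$ and $b_1$ when the errors are normal.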
7. Steps for Solving a Problem with Maximum Likelihood
1) Determine whether the random variable in the problem is discrete or continuous.
2) Write down the probability distribution for the problem.
3) Turn the probability function into a likelihood function.
4) Take the logarithm of the likelihood function.
5) Take the partial derivative with respect to each parameter.
6) Solve the likelihood equations.
8. References
- https://medium.com/quick-code/maximum-likelihood-estimation-for-regression-65f9c99f815d
- https://www.stat.cmu.edu/~cshalizi/mreg/15/lectures/06/lecture-06.pdf
- http://www.solinx.co/archives/800
- https://www.ime.unicamp.br/~cnaber/optim_1.pdf