Original: A Detailed Guide to Setting Up scikit-learn on Ubuntu

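I had wanted to set up a machine learning framework on Ubuntu for a long time, but I was busy with other things and kept putting it off until now. Last week I finally got a working scikit-learn environment running on Ubuntu.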

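First, a brief introduction to scikit-learn. It is a very popular open-source machine learning library for Python, released under the BSD license. The project was started by David Cournapeau in 2007 and is currently maintained by community volunteers. The official website is http://scikit-learn.org/stable/, where you can find scikit-learn resources, downloads, documentation, examples, and so on.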

Now for the main course:

(1) In a terminal, install the third-party scientific computing package numpy

sudo apt-get install python-numpy

(2) Install the scipy module

sudo apt-get install python-scipy

(3) Install the matplotlib module

sudo apt-get install python-matplotlib

Note: pay attention to the order in which the modules are installed.

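Note: installing the scipy module may fail with an error such as "unable to locate package". There are at least two ways to deal with this:

(a) Set the Ubuntu software sources to a mirror server in China

(b) Update the package lists, for example with sudo apt-get update (this is how I solved it at the time)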

(4) Install the C/C++ compiler

sudo apt-get install build-essential

(5) sudo apt-get install python-dev

(6) sudo apt-get install python-setuptools

(7) sudo apt-get install libatlas-dev

(8) sudo apt-get install libatlas3-base

(9) sudo apt-get install python-pip

(10) sudo pip install -U scikit-learn
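A quick way to check that the steps above worked is to try importing each module from Python. The following is only a minimal sketch: the helper name installed_versions is made up for this post, and it assumes the packages expose a __version__ attribute (numpy, scipy, matplotlib and sklearn do).

```python
import importlib


def installed_versions(names=("numpy", "scipy", "matplotlib", "sklearn")):
    """Map each module name to its version string, or None if it cannot be imported.

    installed_versions is a hypothetical helper written for this post,
    not part of scikit-learn.
    """
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            # Fall back to "unknown" for modules without a __version__ attribute.
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = None
    return versions


for name, version in sorted(installed_versions().items()):
    print("%-12s %s" % (name, version or "NOT INSTALLED"))
```

If any line reports NOT INSTALLED, go back to the corresponding step before continuing.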

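At this point the installation is actually complete, but to make things more convenient I also installed an editor for writing Python: the text editor vim, which is very powerful.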

(11) sudo apt-get install vim-gtk

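The default vim interface is not very friendly, so it needs to be configured with the steps below. If you don't mind the defaults you can skip this step, but if you want the best possible experience it is worth the small effort.

First make sure syntax on is set, so that syntax highlighting is enabled.

Then add the following lines at the end of your vim configuration file (vimrc):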

set nu

set tabstop=4 " tab width; 4 is an assumed value, adjust to taste

set nobackup

set cursorline

set ruler

set autoindent

OK, compared with installing MongoDB this was fairly simple. To try it out, here is a simple machine learning example, taken from the scikit-learn example collection:

print(__doc__)

# code source: GuoDongwei

#licence: BSD 3 clause

import matplotlib.pyplot as plt

import numpy as np

from sklearn import linear_model, datasets

#Load the diabetes dataset

diabetes = datasets.load_diabetes()

#Use only one feature

diabetes_x = diabetes.data[:, np.newaxis, 2]

#Split the data into training and testing sets

diabetes_x_train = diabetes_x[:-20]

diabetes_x_test = diabetes_x[-20:]

#Split the target into training and testing set

diabetes_y_train = diabetes.target[:-20]

diabetes_y_test = diabetes.target[-20:]

#create linear regression object

regr = linear_model.LinearRegression()

# train the model using the training set

regr.fit(diabetes_x_train, diabetes_y_train)

# the coefficients

print('coefficients:\n{}'.format(regr.coef_))

# the mean squared error

print('mean squared error: %.2f' % np.mean((regr.predict(diabetes_x_test) - diabetes_y_test)**2))

# R^2 score (coefficient of determination): 1 is a perfect prediction

print('variance score: %.2f' % regr.score(diabetes_x_test, diabetes_y_test))

#plot output

plt.scatter(diabetes_x_test, diabetes_y_test, color = 'black')

plt.plot(diabetes_x_test, regr.predict(diabetes_x_test), color = 'blue', linewidth = 3)

#plt.xticks(())

#plt.yticks(())

plt.show()
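To make the three numbers the script prints less mysterious (the coefficient, the mean of the squared errors, and the score), here is a rough pure-Python sketch of the same calculations for a single feature. The helper names fit_line, mse and r_squared and the toy data are invented for illustration. This is not how scikit-learn implements LinearRegression internally, only the textbook least-squares formulas.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = float(len(xs))
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(ys, predictions):
    """Mean of the squared differences between targets and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / float(len(ys))


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(ys) / float(len(ys))
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Toy data invented for illustration (not the diabetes dataset): y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]

print("coefficient: %.2f" % slope)
print("mean squared error: %.2f" % mse(ys, predictions))
print("score: %.2f" % r_squared(ys, predictions))
```

Because the toy points lie exactly on a line, the fit is perfect here. On real data such as the diabetes set, the error is larger than zero and the score is below 1.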
