机器学习简单小例子:19行代码预测鸢尾花品种

2020-07-14 16:32:24 浏览数 (1)

参考

https://www.youtube.com/watch?v=_3xj9B0qqps&t=1372s

导入需要用到的模块
代码语言:javascript复制
import pandas as pd
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
读入数据
代码语言:javascript复制
df = pd.read_csv(r"irisYT-Django-Iris-App-3xj9B0qqps-masteriris.csv")
将数据拆分成训练集和测试集
代码语言:javascript复制
x = ['sepal_length','sepal_width','petal_length','petal_width']

X = df[x]
y = df['classification']

X_train, X_test, Y_train, Y_test = train_test_split(X,y,test_size=0.2,random_state=1)

训练数据集合测试数据集的比例是8:2

训练模型并预测
代码语言:javascript复制
model = SVC(gamma='auto')
model.fit(X_train,Y_train)
predictions = model.predict(X_test)

输入数据预测

代码语言:javascript复制
iris = [1,1,1,1]
results = model.predict([iris])
print(results)

结果results是一个列表

输出模型准确性
代码语言:javascript复制
print(accuracy_score(Y_test,predictions))

运行代码得到结果为 0.966666666667

保存模型
代码语言:javascript复制
pd.to_pickle(model,r"new_model.pickle")

如果需要用这个模型可以直接读入

代码语言:javascript复制
model = pd.read_pickle(r"new_model.pickle")

完整代码

代码语言:javascript复制
import pandas as pd
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
df = pd.read_csv(r"irisYT-Django-Iris-App-3xj9B0qqps-masteriris.csv")
print(df.head())
x = ['sepal_length','sepal_width','petal_length','petal_width']
X = df[x]
y = df['classification']
X_train, X_test, Y_train, Y_test = train_test_split(X,y,test_size=0.2,random_state=1)
model = SVC(gamma='auto')
model.fit(X_train,Y_train)
predictions = model.predict(X_test)
print(accuracy_score(Y_test,predictions))
pd.to_pickle(model,r"new_model.pickle")
model = pd.read_pickle(r"new_model.pickle")
iris = [1,1,1,1]
results = model.predict([iris])
print(results)

重复这个例子主要是因为找到了一个视频教程是利用Django搭建一个简易的web应用预测鸢尾花的品种。

欢迎大家关注我的公众号

小明的数据分析笔记本

0 人点赞