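Introduction to MNIST
The MNIST dataset comes from the National Institute of Standards and Technology (NIST) in the United States. The training set consists of digits handwritten by 250 different people, 50% of them high school students and 50% employees of the Census Bureau. The test set contains handwritten digits in the same proportion.
Each image consists of 28×28 pixels, and the label is the digit shown in the image.
Our task is to write an algorithm that lets the computer recognize the digit in each image, much like recognizing a CAPTCHA. This post covers two approaches:
- softmax regression
- convolutional neural network (CNN)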
Softmax regression
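- Reading the data. Each image in the MNIST dataset is 28×28 pixels; flattening it gives a 784-dimensional vector. The label is a digit between 0 and 9, encoded as a 10-dimensional one-hot vector, so the code looks like this. Note that input_x and input_y are only placeholders here, not the actual MNIST data.
Tip: TensorFlow can download the MNIST dataset automatically, but the download fails easily, so I recommend downloading the dataset yourself and then loading it.
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
input_x = tf.placeholder(tf.float32, [None, 784], name='input_x')
input_y = tf.placeholder(tf.float32, [None, 10], name='input_y')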
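To make the two shapes concrete, here is a minimal pure-Python sketch of flattening and one-hot encoding. The helper names flatten and one_hot are mine, not part of TensorFlow, and a blank image stands in for a real MNIST sample.

```python
def flatten(image):
    """Flatten a 2-D image (a list of rows) into a 1-D list, row by row."""
    return [pixel for row in image for pixel in row]


def one_hot(label, num_classes=10):
    """Encode an integer label as a one-hot list of length num_classes."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec


# A blank 28x28 image stands in for a real MNIST sample (assumption).
image = [[0.0] * 28 for _ in range(28)]
x = flatten(image)
y = one_hot(3)
print(len(x))  # 784
print(y)       # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

This is roughly what one_hot=True does to the labels when the data is loaded: the digit 3 becomes a vector with a single 1 at index 3.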
- Initializing variables. Define the weights (W) and the bias (b). This is only initialization, so the starting values do not matter much; here both are set to 0.
weight = tf.Variable(tf.zeros([784, 10]))
bias = tf.Variable(tf.zeros([10]))
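- Defining the model. tf.matmul is TensorFlow's built-in matrix multiplication. Applying softmax to the result of tf.matmul(input_x, weight) + bias gives the final classification output y. The code is a single line:
y = tf.nn.softmax(tf.matmul(input_x, weight) + bias)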
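For readers who want to see what that one line computes, the following is a rough pure-Python sketch for a single sample. The functions softmax and linear are illustrative stand-ins I wrote for tf.nn.softmax and tf.matmul plus bias, not the TensorFlow implementations.

```python
import math


def softmax(logits):
    """Return softmax probabilities for a list of raw scores."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(x, weight, bias):
    """Compute x @ weight + bias for one sample; weight is [len(x)][n_out]."""
    return [sum(x[i] * weight[i][j] for i in range(len(x))) + bias[j]
            for j in range(len(bias))]


# With all-zero weights and bias, every class gets the same probability.
x = [0.5] * 784
weight = [[0.0] * 10 for _ in range(784)]
bias = [0.0] * 10
y = softmax(linear(x, weight, bias))
print(y[0])  # 0.1
```

So with the zero initialization used here, the untrained model assigns probability 0.1 to each of the ten digits, which is consistent with the low training accuracy printed at step 0.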
- Loss function and optimizer. We use cross-entropy as the loss function and gradient descent as the optimizer:
cross_entropy = - tf.reduce_sum(input_y * tf.log(y))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(input_y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
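As a sanity check on those four lines, here is a small pure-Python sketch of the same two quantities on a made-up batch of two samples and three classes. The function names and the toy numbers are my own; the sketch skips terms whose label is 0, which gives the same sum as long as no predicted probability is exactly zero.

```python
import math


def cross_entropy_sum(labels, probs):
    """Sum of -label * log(prob) over all samples and classes."""
    return -sum(t * math.log(p)
                for row_t, row_p in zip(labels, probs)
                for t, p in zip(row_t, row_p) if t > 0)


def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(labels, probs):
    """Fraction of samples whose predicted class matches the label."""
    hits = [argmax(t) == argmax(p) for t, p in zip(labels, probs)]
    return sum(hits) / float(len(hits))


# Toy batch: the first prediction is right, the second is wrong.
labels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
probs = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]
loss = cross_entropy_sum(labels, probs)  # -(log 0.7 + log 0.3)
acc = accuracy(labels, probs)
print(acc)  # 0.5
```

Note that the loss is a sum over the batch rather than a mean, so its scale grows with the batch size.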
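- Iterating. With the model fully defined, the next step is to run it. I ran 10,000 iterations with 50 images per batch, and the final accuracy on the test set was about 92%.
- Full code: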
# -*- coding: utf-8 -*-
# @author: Awesome_Tang
# @date: 2018-12-16
# @version: python2.7
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
class constant(object):
    classes = 10  # number of classes
    alpha = 0.01  # learning rate
    steps = 10000  # number of iterations
    batch_size = 50  # training samples per batch
    print_per_batch = 100  # print results every N steps


class SoftMax():
    def __init__(self, constant):
        self.constant = constant
        self.mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
        self.input_x = tf.placeholder(tf.float32, [None, 784], name='input_x')
        self.input_y = tf.placeholder(
            tf.float32, [None, self.constant.classes], name='input_y')
        self.run_model()

    def run_model(self):
        # define variables: weights and biases
        weight = tf.Variable(tf.zeros([784, 10]))
        bias = tf.Variable(tf.zeros([10]))
        # define model
        y = tf.nn.softmax(tf.matmul(self.input_x, weight) + bias)
        # define loss function
        cross_entropy = - tf.reduce_sum(self.input_y * tf.log(y))
        train_step = tf.train.GradientDescentOptimizer(
            self.constant.alpha).minimize(cross_entropy)
        correct_prediction = tf.equal(
            tf.argmax(y, 1), tf.argmax(self.input_y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
        # initialize variables
        sess = tf.Session()
        sess.run(tf.global_variables_initializer())
        for i in range(self.constant.steps):
            batch = self.mnist.train.next_batch(self.constant.batch_size)
            if i % self.constant.print_per_batch == 0:
                train_accuracy = accuracy.eval(
                    session=sess,
                    feed_dict={self.input_x: batch[0], self.input_y: batch[1]})
                print("step %d, train_accuracy %g" % (i, train_accuracy))
            train_step.run(session=sess, feed_dict={self.input_x: batch[0], self.input_y: batch[1]})
        print("test accuracy %g" % accuracy.eval(
            session=sess,
            feed_dict={self.input_x: self.mnist.test.images, self.input_y: self.mnist.test.labels}))
        sess.close()


if __name__ == "__main__":
    SoftMax(constant)
'''
===============================
out:
step 0, train_accuracy 0.14
step 100, train_accuracy 0.84
step 200, train_accuracy 0.96
step 300, train_accuracy 0.9
step 400, train_accuracy 0.94
step 500, train_accuracy 0.9
···
step 9400, train_accuracy 0.92
step 9500, train_accuracy 0.94
step 9600, train_accuracy 0.92
step 9700, train_accuracy 0.9
step 9800, train_accuracy 0.94
step 9900, train_accuracy 0.88
test accuracy 0.9222
===============================
'''
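One caveat about the loss in the listing above: it takes the log of a softmax output, and if a probability ever underflows to exactly zero the log is no longer finite. The sketch below, in pure Python with function names of my own choosing, shows the usual workaround of computing the log-softmax directly from the raw scores with the log-sum-exp trick. TensorFlow 1.x also ships tf.nn.softmax_cross_entropy_with_logits, which takes the pre-softmax scores; I have not swapped it into the listing, so treat this only as an illustration of the idea.

```python
import math


def log_softmax(logits):
    """log(softmax(logits)) computed without forming the probabilities."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]


def cross_entropy_from_logits(label, logits):
    """Cross-entropy for one sample, given a one-hot label and raw scores."""
    return -sum(t * lp for t, lp in zip(label, log_softmax(logits)))


# Extreme scores: the softmax probability of the true class underflows
# to 0.0 in floating point, so log(probability) would not be finite.
logits = [1000.0, 0.0, -1000.0]
label = [0.0, 0.0, 1.0]
loss = cross_entropy_from_logits(label, logits)
print(loss)  # 2000.0
```

With inputs scaled to [0, 1] and a small learning rate the original formulation trained fine in my run, so this matters mainly if the loss turns into NaN.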
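Convolutional neural network (CNN)
Softmax regression gave us 92% accuracy, which seems decent but is actually fairly poor: the best reported accuracy on MNIST should by now be above 99.7%. So after softmax, let's try a CNN and see how it does. Reading the data is the same as above, so I won't repeat it. Since we will apply convolution and pooling twice, we wrap the common operations in functions to keep the code concise: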
def weight_variable(self, shape):
    # small random weights from a truncated normal distribution
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(self, shape):
    # small positive constant bias
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(self, x, W):
    # stride-1 convolution; "SAME" padding keeps height and width
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding="SAME")

def max_pool_2x2(self, x):
    # 2x2 max pooling with stride 2 halves height and width
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")
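To show what the conv2d and max_pool_2x2 helpers do to a single-channel image, here is a rough pure-Python sketch. It is my own simplified version (one channel, odd-sized kernel, even image size), not TensorFlow's implementation, and the 4x4 test image and identity kernel are made up for the example.

```python
def conv2d_same(image, kernel):
    """Stride-1 2-D cross-correlation with zero padding ("SAME")."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2  # padding on each side for odd-sized kernels
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + i - ph, c + j - pw
                    if 0 <= rr < h and 0 <= cc < w:  # outside counts as zero
                        acc += image[rr][cc] * kernel[i][j]
            out[r][c] = acc
    return out


def max_pool_2x2(image):
    """2x2 max pooling with stride 2; assumes even height and width."""
    h, w = len(image), len(image[0])
    return [[max(image[r][c], image[r][c + 1],
                 image[r + 1][c], image[r + 1][c + 1])
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]


image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
conv = conv2d_same(image, identity)  # same 4x4 shape, values unchanged
pooled = max_pool_2x2(conv)
print(pooled)  # [[5.0, 7.0], [13.0, 15.0]]
```

The convolution keeps the 4x4 size because of the zero padding, and the pooling step halves each side, which is the same pattern the network applies to the 28×28 input.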
Convolution and pooling
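- First comes the first convolution and pooling layer. The input is a single 28×28 image. Convolving it with a layer of 32 kernels of size 3×3 yields 32 images (feature maps) of size 28×28, and the pooling layer then downsamples them to 32 images of size 14×14.
# Layer 1: convolution
x_image = tf.reshape(self.input_x, [-1, 28, 28, 1])
w_cv1 = self.weight_variable([3, 3, 1, 32])
b_cv1 = self.bias_variable([32])
h_cv1 = tf.nn.relu(self.conv2d(x_image, w_cv1) + b_cv1)
h_mp1 = self.max_pool_2x2(h_cv1)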
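- After the first layer, the input is now 32 images of size 14×14. A second convolution with a layer of 64 kernels of size 3×3 outputs 64 images of size 14×14 (one might expect 32×64 of them, but each kernel's responses are summed over the 32 input images, which keeps the complexity down), and the pooling layer then downsamples them to 64 images of size 7×7.
# Layer 2: convolution
w_cv2 = self.weight_variable([3, 3, 32, 64])
b_cv2 = self.bias_variable([64])
h_cv2 = tf.nn.relu(self.conv2d(h_mp1, w_cv2) + b_cv2)
h_mp2 = self.max_pool_2x2(h_cv2)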
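The spatial sizes quoted for the two layers can be checked with a few lines of Python. With "SAME" padding the output size along each axis is ceil(input size / stride); the helper name same_output_size is my own.

```python
import math


def same_output_size(size, stride):
    """Spatial output size under "SAME" padding: ceil(size / stride)."""
    return int(math.ceil(size / float(stride)))


size = 28
trace = [size]
for _ in range(2):
    size = same_output_size(size, 1)  # convolution, stride 1
    trace.append(size)
    size = same_output_size(size, 2)  # 2x2 max pooling, stride 2
    trace.append(size)
print(trace)  # [28, 28, 14, 14, 7]
```

Each convolution keeps the size and each pooling step halves it, so two rounds take 28 down to 7.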
- After the two convolution layers we have 64 images of size 7×7. Flattening them gives a 7×7×64 = 3136-dimensional vector, which is fed into a 128-unit fully connected layer and then into a 10-unit softmax layer. This part is similar to the softmax model above; the code is:
# Layer 3: fully connected
W_fc1 = self.weight_variable([7*7*64, 128])
b_fc1 = self.bias_variable([128])
h_mp2_flat = tf.reshape(h_mp2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_mp2_flat, W_fc1) + b_fc1)
# Layer 4: dropout
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
# Layer 5: softmax output
W_fc2 = self.weight_variable([128, 10])
b_fc2 = self.bias_variable([10])
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
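The dropout layer in this block gets no explanation in the text, so here is a minimal pure-Python sketch of the idea as I understand it: each activation is kept with probability keep_prob and scaled by 1 / keep_prob, and the rest are set to zero. The function below and the seeded random generator are my own illustration, not the code behind tf.nn.dropout.

```python
import random


def dropout(values, keep_prob, rng):
    """Inverted dropout: keep each value with probability keep_prob and
    scale kept values by 1 / keep_prob; dropped values become 0."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0
            for v in values]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0] * 1000
dropped = dropout(activations, 0.5, rng)
kept = sum(1 for v in dropped if v != 0.0)
unchanged = dropout(activations, 1.0, rng)  # keep_prob = 1.0 disables dropout
print(kept)  # roughly half of the 1000 activations survive
```

This is why keep_prob is fed as 0.5 during training and as 1.0 when measuring accuracy: at evaluation time every unit should contribute.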
- For the optimizer and loss function, we choose the Adam optimizer and cross-entropy.
cross_entropy = -tf.reduce_sum(self.input_y * tf.log(y_conv))
train_step = tf.train.AdamOptimizer(self.constant.alpha).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(self.input_y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
loss = tf.reduce_mean(cross_entropy)
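For a feel of what Adam does differently from plain gradient descent, here is a one-parameter sketch of the update rule as given in the Adam paper, using the commonly cited defaults (beta1 = 0.9, beta2 = 0.999, eps = 1e-8). The function adam_step and the toy objective f(theta) = theta^2 are my own; TensorFlow's AdamOptimizer may differ in details such as how epsilon is applied.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-4,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step."""
    m = beta1 * m + (1 - beta1) * grad         # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad  # running mean of squares
    m_hat = m / (1 - beta1 ** t)               # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# The very first step moves by about lr, whatever the gradient's scale.
first, _, _ = adam_step(1.0, 2.0, 0.0, 0.0, 1, lr=0.01)

# Minimize f(theta) = theta^2, whose gradient is 2 * theta.
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 1001):
    theta, m, v = adam_step(theta, 2.0 * theta, m, v, t, lr=0.01)
print(abs(theta) < 0.1)  # True
```

Because the step is normalized by the running gradient magnitude, a small learning rate such as the 1e-4 used for the CNN is typical with Adam.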
- Finally, let's run it. The full code:
# -*- coding: utf-8 -*-
# @author: Awesome_Tang
# @date: 2018-12-15
# @version: python2.7
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
class constant(object):
    """
    CNN model parameters
    """
    classes = 10  # number of classes
    alpha = 1e-4  # learning rate
    keep_prob = 0.5  # dropout keep probability
    steps = 10000  # number of iterations
    batch_size = 50  # training samples per batch
    tensorboard_dir = 'tensorboard/CNN'  # log output path
    print_per_batch = 100  # print results every N steps
    save_per_batch = 10  # write to TensorBoard every N steps


class CNN():
    def __init__(self, constant):
        self.constant = constant
        self.mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
        self.input_x = tf.placeholder(tf.float32, [None, 784], name='input_x')
        self.input_y = tf.placeholder(
            tf.float32, [None, self.constant.classes], name='input_y')
        self.CNN_model()

    def weight_variable(self, shape):
        initial = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(initial)

    def bias_variable(self, shape):
        initial = tf.constant(0.1, shape=shape)
        return tf.Variable(initial)

    def conv2d(self, x, W):
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding="SAME")

    def max_pool_2x2(self, x):
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")

    def CNN_model(self):
        # Layer 1: convolution
        x_image = tf.reshape(self.input_x, [-1, 28, 28, 1])
        w_cv1 = self.weight_variable([3, 3, 1, 32])
        b_cv1 = self.bias_variable([32])
        h_cv1 = tf.nn.relu(self.conv2d(x_image, w_cv1) + b_cv1)
        h_mp1 = self.max_pool_2x2(h_cv1)
        # Layer 2: convolution
        w_cv2 = self.weight_variable([3, 3, 32, 64])
        b_cv2 = self.bias_variable([64])
        h_cv2 = tf.nn.relu(self.conv2d(h_mp1, w_cv2) + b_cv2)
        h_mp2 = self.max_pool_2x2(h_cv2)
        # Layer 3: fully connected
        W_fc1 = self.weight_variable([7*7*64, 128])
        b_fc1 = self.bias_variable([128])
        h_mp2_flat = tf.reshape(h_mp2, [-1, 7*7*64])
        h_fc1 = tf.nn.relu(tf.matmul(h_mp2_flat, W_fc1) + b_fc1)
        # Layer 4: dropout
        keep_prob = tf.placeholder("float")
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
        # Layer 5: softmax output
        W_fc2 = self.weight_variable([128, 10])
        b_fc2 = self.bias_variable([10])
        y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
        # optimizer and loss function
        cross_entropy = -tf.reduce_sum(self.input_y * tf.log(y_conv))
        train_step = tf.train.AdamOptimizer(
            self.constant.alpha).minimize(cross_entropy)
        correct_prediction = tf.equal(
            tf.argmax(y_conv, 1), tf.argmax(self.input_y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
        loss = tf.reduce_mean(cross_entropy)
        # TensorBoard setup
        tf.summary.scalar("loss", loss)
        tf.summary.scalar("accuracy", accuracy)
        merged_summary = tf.summary.merge_all()
        writer = tf.summary.FileWriter(self.constant.tensorboard_dir)
        # initialize variables
        sess = tf.Session()
        sess.run(tf.global_variables_initializer())
        for i in range(self.constant.steps):
            batch = self.mnist.train.next_batch(
                self.constant.batch_size)
            if i % self.constant.print_per_batch == 0:
                train_accuracy = accuracy.eval(
                    session=sess,
                    feed_dict={self.input_x: batch[0], self.input_y: batch[1], keep_prob: 1.0})
                print("step %d, train_accuracy %g" % (i, train_accuracy))
            train_step.run(session=sess, feed_dict={self.input_x: batch[0], self.input_y: batch[1],
                                                    keep_prob: self.constant.keep_prob})
            if i % self.constant.save_per_batch == 0:
                s = sess.run(merged_summary, feed_dict={
                    self.input_x: batch[0], self.input_y: batch[1], keep_prob: 1.0})
                writer.add_summary(s, i)
        print("test accuracy %g" % accuracy.eval(
            session=sess,
            feed_dict={self.input_x: self.mnist.test.images, self.input_y: self.mnist.test.labels,
                       keep_prob: 1.0}))
        sess.close()


if __name__ == "__main__":
    CNN(constant)
'''
================================
out:
step 0, train_accuracy 0.04
step 100, train_accuracy 0.68
step 200, train_accuracy 0.74
step 300, train_accuracy 0.72
step 400, train_accuracy 0.78
step 500, train_accuracy 0.84
···
step 9000, train_accuracy 1
step 9100, train_accuracy 0.98
step 9200, train_accuracy 1
step 9300, train_accuracy 1
step 9400, train_accuracy 0.98
step 9500, train_accuracy 0.94
step 9600, train_accuracy 1
step 9700, train_accuracy 0.98
step 9800, train_accuracy 0.96
step 9900, train_accuracy 1
test accuracy 0.9826
================================
'''
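As a side note on model size, the layer shapes in the listing are enough to count the trainable parameters of both models. The short sketch below does the arithmetic; the helper names are mine, and it counts only weights and biases.

```python
def conv_params(kh, kw, in_ch, out_ch):
    """Weights plus biases of a convolution layer."""
    return kh * kw * in_ch * out_ch + out_ch


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


layers = [
    ("conv1", conv_params(3, 3, 1, 32)),
    ("conv2", conv_params(3, 3, 32, 64)),
    ("fc1", dense_params(7 * 7 * 64, 128)),
    ("fc2", dense_params(128, 10)),
]
cnn_total = sum(n for _, n in layers)
softmax_total = dense_params(784, 10)
print(cnn_total)      # 421642
print(softmax_total)  # 7850
```

Most of the CNN's parameters sit in the first fully connected layer, and the whole network has roughly 54 times as many parameters as the softmax model.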
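- After 10,000 iterations the accuracy is roughly 98%, a considerable improvement over softmax regression. TensorBoard shows how accuracy and loss change over the course of training, and the curves somehow look almost too perfect.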
tensorboard
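Tip: the code that writes the training results to TensorBoard scalars is already included above. After running it, the results are saved under tensorboard/CNN in the project root; run tensorboard --logdir=tensorboard/CNN in a terminal to view them.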