Hello TensorFlow: MNIST Dataset Recognition

2018-12-27 15:49:53

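Introduction to MNIST

The MNIST dataset comes from the National Institute of Standards and Technology (NIST) in the United States. The training set consists of digits handwritten by 250 different people, 50% of them high school students and 50% employees of the Census Bureau. The test set contains handwritten digits in the same proportion.

Each image consists of 28 × 28 pixels, and the label is the digit shown in the image.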

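Our task is to have the computer recognize the digit in each image by means of an algorithm, much like recognizing a CAPTCHA. This article covers two approaches:

  • Softmax regression
  • Convolutional neural network (CNN)

Softmax Regression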

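  • Reading the data. First we read the data. Each image in the MNIST dataset is 28 × 28 pixels; flattening it gives a 784-dimensional vector. The label is a digit between 0 and 9, one-hot encoded as a 10-dimensional vector. The code is shown below. Note that input_x and input_y are only placeholders here, not the actual MNIST data.

Tip: TensorFlow can download the MNIST dataset automatically, but the download fails easily, so it is better to download the dataset yourself and load it from disk.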

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
input_x = tf.placeholder(tf.float32, [None, 784], name='input_x')
input_y = tf.placeholder(tf.float32, [None, 10], name='input_y')
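To make the two data shapes concrete, here is a minimal pure-Python sketch of flattening and one-hot encoding. The helper names `flatten` and `one_hot` are hypothetical and not part of TensorFlow; the MNIST loader with one_hot=True already performs the label encoding for you. A dummy all-zero image stands in for a real digit.

```python
# Illustrative helpers (hypothetical names, not TensorFlow APIs).

def flatten(image):
    """Flatten a 2-D image (a list of rows) into a 1-D list, row by row."""
    return [pixel for row in image for pixel in row]


def one_hot(label, num_classes=10):
    """Encode an integer label as a one-hot vector of length num_classes."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


# A dummy all-zero 28 x 28 "image" stands in for a real MNIST digit.
image = [[0.0] * 28 for _ in range(28)]
vector = flatten(image)

print(len(vector))  # length of the flattened vector
print(one_hot(3))   # 1.0 at index 3, 0.0 elsewhere
```

The flattened vector matches the second dimension of input_x (784), and the one-hot label matches the second dimension of input_y (10).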
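  • Initializing variables. Define the weights (W) and the bias (b). This is only initialization, so the starting values do not matter much; here we set them all to 0.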

weight = tf.Variable(tf.zeros([784, 10]))
bias = tf.Variable(tf.zeros([10]))
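  • Defining the model. tf.matmul is TensorFlow's built-in matrix multiplication. Applying softmax to the result xW + b gives the final classification output y. The code is a single line: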

y = tf.nn.softmax(tf.matmul(input_x, weight) + bias)
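What softmax(xW + b) computes can be sketched in plain Python as follows. The toy sizes (3 features, 2 classes) and the helper names `matmul_vec` and `softmax` are assumptions made for illustration; the model in this article uses 784 features and 10 classes.

```python
import math


def matmul_vec(x, W):
    """Multiply a row vector x (length n) by a matrix W (n rows, k columns)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy sizes: 3 input features, 2 classes (MNIST would use 784 and 10).
x = [1.0, 2.0, 3.0]
W = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.0]]
b = [0.0, 0.1]

logits = [z + bias for z, bias in zip(matmul_vec(x, W), b)]
y = softmax(logits)
print(logits, y)

# With all-zero weights and bias, every class gets the same probability.
uniform = softmax([0.0] * 10)
```

The last line shows why zero initialization is a harmless starting point here: the untrained model simply predicts every digit with equal probability.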
  • Loss function and optimizer. We use cross-entropy as the loss function and gradient descent as the optimizer. The code is as follows:
cross_entropy = - tf.reduce_sum(input_y * tf.log(y))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
correct_prediction = tf.equal( tf.argmax(y, 1), tf.argmax(input_y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
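The loss and accuracy defined above can be checked by hand with a small pure-Python sketch. The function names and the two-sample batch are made up for illustration. Note that, like tf.reduce_sum in the code above, this version sums the loss over the batch rather than averaging it.

```python
import math


def cross_entropy(targets, probs):
    """Summed cross-entropy over a batch: -sum(t * log(p)).

    Terms with t == 0 contribute nothing, so they are skipped.
    """
    return -sum(t * math.log(p)
                for t_row, p_row in zip(targets, probs)
                for t, p in zip(t_row, p_row) if t > 0)


def argmax(row):
    """Index of the largest entry in a list."""
    return max(range(len(row)), key=lambda i: row[i])


def accuracy(targets, probs):
    """Fraction of samples whose predicted class matches the label."""
    hits = sum(argmax(t) == argmax(p) for t, p in zip(targets, probs))
    return hits / float(len(targets))


# Two samples, three classes; the second sample is misclassified.
targets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
probs = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]

loss = cross_entropy(targets, probs)
acc = accuracy(targets, probs)
print(loss, acc)
```

Only the probability assigned to the true class enters the loss, so here it equals -(log 0.7 + log 0.3).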
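  • Iteration. With the model fully defined, all that is left is to run it. I ran 10,000 iterations with 50 images per batch, and the final accuracy on the test set was about 92%.
  • Full code: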
# -*- coding: utf-8 -*-

# @author: Awesome_Tang
# @date: 2018-12-16
# @version: python2.7

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data


class constant(object):
    classes = 10  # number of classes
    alpha = 0.01  # learning rate
    steps = 10000  # number of iterations
    batch_size = 50  # training samples per batch
    print_per_batch = 100  # print results every N steps



class SoftMax():

    def __init__(self, constant):
        self.constant = constant
        self.mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
        self.input_x = tf.placeholder(tf.float32, [None, 784], name='input_x')
        self.input_y = tf.placeholder(
            tf.float32, [None, self.constant.classes], name='input_y')

        self.run_model()

    def run_model(self):
        # define variables: weights and biases
        weight = tf.Variable(tf.zeros([784, 10]))
        bias = tf.Variable(tf.zeros([10]))

        # define model
        y = tf.nn.softmax(tf.matmul(self.input_x, weight) + bias)

        # define loss function
        cross_entropy = - tf.reduce_sum(self.input_y * tf.log(y))
        train_step = tf.train.GradientDescentOptimizer(
            self.constant.alpha).minimize(cross_entropy)
        correct_prediction = tf.equal(
            tf.argmax(y, 1), tf.argmax(self.input_y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

        # initial variables
        sess = tf.Session()
        sess.run(tf.global_variables_initializer())

        for i in range(self.constant.steps):
            batch = self.mnist.train.next_batch(self.constant.batch_size)
            if i % self.constant.print_per_batch == 0:
                train_accuracy = accuracy.eval(session=sess,
                                               feed_dict={self.input_x: batch[0], self.input_y: batch[1]})
                print("step %d, train_accuracy %g" % (i, train_accuracy))
            train_step.run(session=sess, feed_dict={self.input_x: batch[0], self.input_y: batch[1]})

        print("test accuracy %g" % accuracy.eval(session=sess,
                                                 feed_dict={self.input_x: self.mnist.test.images, self.input_y: self.mnist.test.labels}))
        sess.close()

if __name__ == "__main__":
    SoftMax(constant)

'''
===============================
out:
step 0, train_accuracy 0.14
step 100, train_accuracy 0.84
step 200, train_accuracy 0.96
step 300, train_accuracy 0.9
step 400, train_accuracy 0.94
step 500, train_accuracy 0.9
···
step 9400, train_accuracy 0.92
step 9500, train_accuracy 0.94
step 9600, train_accuracy 0.92
step 9700, train_accuracy 0.9
step 9800, train_accuracy 0.94
step 9900, train_accuracy 0.88
test accuracy 0.9222
===============================
'''
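For readers who want to see what the optimizer does in each step, below is a minimal pure-Python sketch of mini-batch gradient descent for softmax regression. It is not the TensorFlow implementation: the toy data, the function names, and the sampling with replacement (a simplification of next_batch) are all assumptions. It relies on the fact that the gradient of the summed cross-entropy with respect to the logits is y minus the one-hot label.

```python
import math
import random


def softmax(z):
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x, W, b):
    """Class probabilities softmax(xW + b) for one sample."""
    logits = [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
              for j in range(len(b))]
    return softmax(logits)


def train(data, n_features, n_classes, alpha=0.1, steps=300, batch_size=4, seed=0):
    rng = random.Random(seed)
    W = [[0.0] * n_classes for _ in range(n_features)]  # zero init, as in the article
    b = [0.0] * n_classes
    for _ in range(steps):
        batch = [rng.choice(data) for _ in range(batch_size)]
        grad_W = [[0.0] * n_classes for _ in range(n_features)]
        grad_b = [0.0] * n_classes
        for x, label in batch:
            y = predict(x, W, b)
            for j in range(n_classes):
                err = y[j] - (1.0 if j == label else 0.0)  # d(loss)/d(logit_j)
                grad_b[j] += err
                for i in range(n_features):
                    grad_W[i][j] += x[i] * err
        # One gradient descent step on the summed batch loss.
        for j in range(n_classes):
            b[j] -= alpha * grad_b[j]
            for i in range(n_features):
                W[i][j] -= alpha * grad_W[i][j]
    return W, b


def classify(x, W, b):
    probs = predict(x, W, b)
    return max(range(len(probs)), key=lambda j: probs[j])


# Toy data: two linearly separable classes with two features each.
data = [([0.0, 1.0], 0), ([0.1, 0.9], 0), ([1.0, 0.0], 1), ([0.9, 0.2], 1)]
W, b = train(data, n_features=2, n_classes=2)
train_accuracy = sum(classify(x, W, b) == label for x, label in data) / float(len(data))
print(train_accuracy)
```

The structure mirrors the loop in the full code: draw a batch, compute the loss gradient, and move the weights a small step against it.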

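Convolutional Neural Network (CNN)

Softmax regression gave us 92% accuracy, which may seem decent but is actually a fairly poor result; the best accuracy achieved so far should be above 99.7%. So after softmax, let's try a CNN and see how it does. Reading the data is the same as above, so I won't repeat it. Since we plan to apply convolution and pooling twice, we write these operations as functions to keep the code concise: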

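def weight_variable(self, shape):
    # small random weights (truncated normal, stddev 0.1) break symmetry
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(self, shape):
    # a slightly positive bias helps avoid dead ReLU units
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(self, x, W):
    # stride 1 with SAME padding keeps the spatial size unchanged
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding="SAME")

def max_pool_2x2(self, x):
    # 2x2 max pooling with stride 2 halves the width and height
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")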
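The effect of the strides and padding arguments on image size can be worked out with a small helper. The function `output_size` is a hypothetical name; the formulas are the output-size rules TensorFlow documents for "SAME" and "VALID" padding.

```python
import math


def output_size(n, kernel, stride, padding):
    """Spatial output size along one dimension for a conv or pooling op."""
    if padding == "SAME":
        return int(math.ceil(n / float(stride)))
    if padding == "VALID":
        return int(math.ceil((n - kernel + 1) / float(stride)))
    raise ValueError("padding must be 'SAME' or 'VALID'")


# Trace the image size through two conv + pool stages, as in this article.
size = 28
trace = [size]
for _ in range(2):
    size = output_size(size, kernel=3, stride=1, padding="SAME")  # conv keeps the size
    trace.append(size)
    size = output_size(size, kernel=2, stride=2, padding="SAME")  # pool halves it
    trace.append(size)

print(trace)
```

With "VALID" padding the same 3 × 3 convolution would shrink a 28-pixel side to 26, which is why "SAME" is the convenient choice here.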
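Convolution & Pooling
  1. First comes the first convolution and pooling layer. The input is a single 28 × 28 image. Convolving it with a layer of 32 kernels of size 3 × 3 outputs 32 images of 28 × 28, and the pooling layer then downsamples them to 32 images of 14 × 14.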

# Layer 1: convolution
x_image = tf.reshape(self.input_x, [-1, 28, 28, 1])
w_cv1 = self.weight_variable([3, 3, 1, 32])
b_cv1 = self.bias_variable([32])
h_cv1 = tf.nn.relu(self.conv2d(x_image, w_cv1) + b_cv1)
h_mp1 = self.max_pool_2x2(h_cv1)
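A single-channel version of this convolution, ReLU, and pooling pipeline can be sketched in pure Python. The function names and the 4 × 4 toy image are illustrative assumptions. Like tf.nn.conv2d, the sketch computes a cross-correlation, meaning the kernel is not flipped.

```python
def conv2d_same(image, kernel):
    """Stride-1 convolution with zero padding; expects an odd-sized square kernel."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    rr, cc = r + i - pad, c + j - pad
                    if 0 <= rr < h and 0 <= cc < w:  # outside the image counts as 0
                        total += image[rr][cc] * kernel[i][j]
            out[r][c] = total
    return out


def relu(feature_map):
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2; assumes even height and width."""
    return [[max(feature_map[r][c], feature_map[r][c + 1],
                 feature_map[r + 1][c], feature_map[r + 1][c + 1])
             for c in range(0, len(feature_map[0]), 2)]
            for r in range(0, len(feature_map), 2)]


image = [[1.0, 2.0, 0.0, 1.0],
         [0.0, 1.0, 3.0, 1.0],
         [2.0, 0.0, 1.0, 0.0],
         [1.0, 1.0, 0.0, 2.0]]
# An identity kernel leaves the image unchanged, which makes the result easy to check.
identity = [[0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0]]

conv = conv2d_same(image, identity)
pooled = max_pool_2x2(relu(conv))
print(pooled)
```

The convolution output keeps the 4 × 4 size, and pooling halves each side to 2 × 2, the same pattern as 28 × 28 to 14 × 14 in the first layer.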
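  2. After the first layer, the input is now 32 images of 14 × 14. A second convolution, through a layer of 64 kernels of size 3 × 3, outputs 64 images of 14 × 14. (One might expect 32 × 64 outputs, but each kernel spans all 32 input images and sums its responses across them, which reduces complexity and leaves 64 outputs.) The pooling layer then downsamples these to 64 images of 7 × 7.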

# Layer 2: convolution
w_cv2 = self.weight_variable([3, 3, 32, 64])
b_cv2 = self.bias_variable([64])
h_cv2 = tf.nn.relu(self.conv2d(h_mp1, w_cv2) + b_cv2)
h_mp2 = self.max_pool_2x2(h_cv2)
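The summation across input channels in the second layer can be illustrated with a deliberately simplified sketch. It assumes 1 × 1 kernels to stay short (this article uses 3 × 3), and the name `conv1x1` and the toy data are made up. The channel summation works the same way for larger kernels.

```python
def conv1x1(feature_maps, filters):
    """Multi-channel 1x1 convolution.

    feature_maps: list of C_in maps, each H x W.
    filters: list of C_out filters, each holding one weight per input channel.
    Each filter produces ONE output map: the weighted sum of all input maps.
    """
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    outputs = []
    for weights in filters:
        out = [[sum(wt * fm[r][c] for wt, fm in zip(weights, feature_maps))
                for c in range(w)]
               for r in range(h)]
        outputs.append(out)
    return outputs


# 2 input maps and 3 filters give 3 output maps, not 2 x 3 = 6.
maps = [[[1.0, 2.0], [3.0, 4.0]],
        [[10.0, 20.0], [30.0, 40.0]]]
filters = [[1.0, 0.0],   # copies input map 0
           [0.0, 1.0],   # copies input map 1
           [1.0, 1.0]]   # sums both input maps
outputs = conv1x1(maps, filters)
print(len(outputs), outputs[2])

# Weight count of the second layer's kernel tensor with shape [3, 3, 32, 64].
n_weights = 3 * 3 * 32 * 64
```

In the same way, each of the 64 kernels in the second layer holds a 3 × 3 slice for every one of the 32 input images, which is exactly the [3, 3, 32, 64] shape passed to weight_variable.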
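  3. After two convolutional layers we have 64 images of 7 × 7. Flattening them gives a vector of 7 × 7 × 64 = 3136 dimensions, which is fed into a 128-unit fully connected layer (followed by dropout) and then into a 10-unit softmax layer. This part is similar to the softmax model above. The code is as follows: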

# Layer 3: fully connected
W_fc1 = self.weight_variable([7*7*64, 128])
b_fc1 = self.bias_variable([128])

h_mp2_flat = tf.reshape(h_mp2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_mp2_flat, W_fc1) + b_fc1)

# Layer 4: dropout
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# Layer 5: softmax output
W_fc2 = self.weight_variable([128, 10])
b_fc2 = self.bias_variable([10])
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
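The dropout layer is the only part of this network that behaves differently during training and evaluation. Here is a minimal pure-Python sketch of the idea; the function name `dropout` and the toy activations are illustrative. As tf.nn.dropout is documented to do, kept values are scaled by 1 / keep_prob so that the expected sum stays the same.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob, scaled by 1 / keep_prob; zero the rest."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)
activations = [1.0] * 10000

# Training: roughly half of the units are zeroed, the rest are doubled.
dropped = dropout(activations, 0.5, rng)
mean_after_dropout = sum(dropped) / len(dropped)

# Evaluation: keep_prob = 1.0 turns dropout into a no-op.
unchanged = dropout(activations, 1.0, rng)
print(mean_after_dropout)
```

This is why the full code feeds keep_prob 0.5 during training steps and 1.0 when measuring accuracy.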
  4. For the optimizer and loss function, we choose the Adam optimizer and cross-entropy.
cross_entropy = -tf.reduce_sum(self.input_y * tf.log(y_conv))
train_step = tf.train.AdamOptimizer(self.constant.alpha).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1),tf.argmax(self.input_y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
loss = tf.reduce_mean(cross_entropy)
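To give a feel for how Adam differs from plain gradient descent, here is a sketch of the textbook Adam update rule applied to a one-dimensional toy problem. It is not TensorFlow's internal code; the function name `adam_minimize`, the toy objective, and the step size 0.1 are assumptions, while beta1, beta2 and eps use the commonly cited defaults.

```python
import math


def adam_minimize(grad_fn, x0, alpha=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a 1-D function given its gradient, using the standard Adam update."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        x -= alpha * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3); the minimum is at x = 3.
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)
```

Because each step is normalized by the running gradient magnitude, Adam is less sensitive to the scale of the loss than plain gradient descent, which may be one reason it works with the small learning rate of 1e-4 used in the full code.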
  5. Finally, let's run it. The full code is as follows:
# -*- coding: utf-8 -*-

# @author: Awesome_Tang
# @date: 2018-12-15
# @version: python2.7

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data


class constant(object):
    """
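    CNN model parameters
    """
    classes = 10  # number of classes
    alpha = 1e-4  # learning rate
    keep_prob = 0.5  # dropout keep probability
    steps = 10000  # number of iterations
    batch_size = 50  # training samples per batch
    tensorboard_dir = 'tensorboard/CNN'  # log output path
    print_per_batch = 100  # print results every N steps
    save_per_batch = 10  # write to TensorBoard every N steps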


class CNN():

    def __init__(self, constant):
        self.constant = constant
        self.mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
        self.input_x = tf.placeholder(tf.float32, [None, 784], name='input_x')
        self.input_y = tf.placeholder(
            tf.float32, [None, self.constant.classes], name='input_y')

        self.CNN_model()

    def weight_variable(self, shape):
        initial = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(initial)

    def bias_variable(self, shape):
        initial = tf.constant(0.1, shape=shape)
        return tf.Variable(initial)

    def conv2d(self, x, W):
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding="SAME")

    def max_pool_2x2(self, x):
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")

    def CNN_model(self):
        # Layer 1: convolution
        x_image = tf.reshape(self.input_x, [-1, 28, 28, 1])
        w_cv1 = self.weight_variable([3, 3, 1, 32])
        b_cv1 = self.bias_variable([32])
        h_cv1 = tf.nn.relu(self.conv2d(x_image, w_cv1) + b_cv1)
        h_mp1 = self.max_pool_2x2(h_cv1)

        # Layer 2: convolution
        w_cv2 = self.weight_variable([3, 3, 32, 64])
        b_cv2 = self.bias_variable([64])
        h_cv2 = tf.nn.relu(self.conv2d(h_mp1, w_cv2) + b_cv2)
        h_mp2 = self.max_pool_2x2(h_cv2)

        # Layer 3: fully connected
        W_fc1 = self.weight_variable([7*7*64, 128])
        b_fc1 = self.bias_variable([128])

        h_mp2_flat = tf.reshape(h_mp2, [-1, 7*7*64])
        h_fc1 = tf.nn.relu(tf.matmul(h_mp2_flat, W_fc1) + b_fc1)

        # Layer 4: dropout
        keep_prob = tf.placeholder("float")
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

        # Layer 5: softmax output
        W_fc2 = self.weight_variable([128, 10])
        b_fc2 = self.bias_variable([10])
        y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

        # optimizer & loss function
        cross_entropy = -tf.reduce_sum(self.input_y * tf.log(y_conv))
        train_step = tf.train.AdamOptimizer(
            self.constant.alpha).minimize(cross_entropy)
        correct_prediction = tf.equal(
            tf.argmax(y_conv, 1), tf.argmax(self.input_y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
        loss = tf.reduce_mean(cross_entropy)

        # TensorBoard setup
        tf.summary.scalar("loss", loss)
        tf.summary.scalar("accuracy", accuracy)
        merged_summary = tf.summary.merge_all()
        writer = tf.summary.FileWriter(self.constant.tensorboard_dir)

        # initialize variables
        sess = tf.Session()
        sess.run(tf.global_variables_initializer())

        for i in range(self.constant.steps):
            batch = self.mnist.train.next_batch(
                self.constant.batch_size)
            if i % self.constant.print_per_batch == 0:
                train_accuracy = accuracy.eval(session=sess,
                                               feed_dict={self.input_x: batch[0], self.input_y: batch[1], keep_prob: 1.0})
                print("step %d, train_accuracy %g" % (i, train_accuracy))
            train_step.run(session=sess, feed_dict={self.input_x: batch[0], self.input_y: batch[1],
                                                    keep_prob: self.constant.keep_prob})

            if i % self.constant.save_per_batch == 0:
                s = sess.run(merged_summary, feed_dict={
                             self.input_x: batch[0], self.input_y: batch[1], keep_prob: 1.0})
                writer.add_summary(s, i)

        print("test accuracy %g" % accuracy.eval(session=sess,
                                                 feed_dict={self.input_x: self.mnist.test.images, self.input_y: self.mnist.test.labels,
                                                            keep_prob: 1.0}))
        writer.close()  # flush any pending summaries to disk
        sess.close()

if __name__ == "__main__":
    CNN(constant)

'''
================================
out:
step 0, train_accuracy 0.04
step 100, train_accuracy 0.68
step 200, train_accuracy 0.74
step 300, train_accuracy 0.72
step 400, train_accuracy 0.78
step 500, train_accuracy 0.84
···
step 9000, train_accuracy 1
step 9100, train_accuracy 0.98
step 9200, train_accuracy 1
step 9300, train_accuracy 1
step 9400, train_accuracy 0.98
step 9500, train_accuracy 0.94
step 9600, train_accuracy 1
step 9700, train_accuracy 0.98
step 9800, train_accuracy 0.96
step 9900, train_accuracy 1
test accuracy 0.9826
================================
'''
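  6. After 10,000 iterations the accuracy is about 98%, a clear improvement over softmax regression. With TensorBoard we can follow how accuracy and loss change over the whole training run; the curves look almost too perfect.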

Figure: accuracy and loss curves in TensorBoard.

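Tip: the code that writes the training results to TensorBoard scalars is already included above. After a run, the results are saved under tensorboard/CNN in the project's root directory; run tensorboard --logdir=tensorboard/CNN in a terminal to view them.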
