In a previous article we implemented multi-class classification on the MNIST dataset with Logistic Regression; this article implements the same task with a neural network in TensorFlow.
A typical neural-network training graph looks like this:

The only difference is that MNIST is a ten-class problem, so the two outputs y1 and y2 become y1, ..., y10. The network implemented in this article is shown below:

It uses two hidden layers, the first with 256 neurons and the second with 128, and finally outputs 10 classes. The corresponding network structure is shown in the following figure:
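To get a concrete feel for the model's size before building it, here is a small sketch (added for illustration, not part of the original article) that computes the weight shapes and total parameter count of the 784-256-128-10 architecture:

# Parameter count of the 784 -> 256 -> 128 -> 10 MLP
layer_sizes = [784, 256, 128, 10]
total = 0
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    params = n_in * n_out + n_out  # weight matrix plus bias vector
    print("{:>4} -> {:>4}: {} parameters".format(n_in, n_out, params))
    total += params
print("Total: {} parameters".format(total))  # 235,146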
Next we create a MutilClass class and initialize the parameters needed for the neural-network multi-class classifier.
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

class MutilClass:
    def __init__(self):
        # Load the dataset
        self.Mnsit = input_data.read_data_sets("./data/", one_hot=True)
        # Network layer sizes
        self.n_hidden_1 = 256
        self.n_hidden_2 = 128
        self.n_input = 784
        self.n_classes = 10
        # Placeholders must be tf.float32 to match the float32 weights below
        self.x = tf.placeholder(dtype=tf.float32, shape=[None, self.n_input], name="x")
        self.y = tf.placeholder(dtype=tf.float32, shape=[None, self.n_classes], name="y")
        # random_normal: initialize weights from a Gaussian distribution
        self.weights = {
            "w1": tf.Variable(tf.random_normal([self.n_input, self.n_hidden_1], stddev=0.1)),
            "w2": tf.Variable(tf.random_normal([self.n_hidden_1, self.n_hidden_2], stddev=0.1)),
            "out": tf.Variable(tf.random_normal([self.n_hidden_2, self.n_classes], stddev=0.1))
        }
        self.bias = {
            "b1": tf.Variable(tf.random_normal([self.n_hidden_1])),
            "b2": tf.Variable(tf.random_normal([self.n_hidden_2])),
            "out": tf.Variable(tf.random_normal([self.n_classes]))
        }
        print("Parameters initialized!")
On the first pass the network forward-propagates: starting from the initialized weights and biases, the input flows through the two hidden layers to produce a score for each class. Backpropagation then minimizes the loss function to solve for the parameters. We therefore define a forward-propagation function and a backward-propagation function:
    # Forward pass of the multilayer perceptron (MLP)
    def _multilayer_perceptron(self, _X, _weights, _bias):
        # Shapes: [None, 784] -> [None, 256] -> [None, 128] -> [None, 10]
        layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(_X, _weights["w1"]), _bias["b1"]))
        layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, _weights["w2"]), _bias["b2"]))
        return tf.matmul(layer_2, _weights["out"]) + _bias["out"]

    # Backward pass: build the loss, optimizer, and accuracy ops
    def _back_propagation(self):
        pred = self._multilayer_perceptron(self.x, self.weights, self.bias)
        # logits are unnormalized scores; softmax is applied inside the loss op
        cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=self.y))
        optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001).minimize(cost)
        corr = tf.equal(tf.argmax(pred, 1), tf.argmax(self.y, 1))
        accr = tf.reduce_mean(tf.cast(corr, dtype=tf.float32))
        init = tf.global_variables_initializer()
        return init, optimizer, cost, accr
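As the training log below shows, plain gradient descent with a learning rate of 0.001 converges slowly, reaching only about 84% test accuracy after 100 epochs. A common alternative is an adaptive optimizer such as Adam (tf.train.AdamOptimizer in TF 1.x); as a sketch, the swap is a one-line change inside _back_propagation:

        # Alternative to plain gradient descent: Adam adapts per-parameter
        # learning rates and typically converges much faster on this task
        optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)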
Next we train the network, for 100 epochs with a batch size of 100. The corresponding function is:
    # Train the model
    def _train_model(self, _init, _optimizer, _cost, _accr):
        epochs = 100
        batch_size = 100
        display_steps = 1
        sess = tf.Session()
        sess.run(_init)
        for epoch in range(epochs):
            avg_cost = 0
            total_batch = int(self.Mnsit.train.num_examples / batch_size)
            for i in range(total_batch):
                batch_xs, batch_ys = self.Mnsit.train.next_batch(batch_size)
                feeds = {self.x: batch_xs, self.y: batch_ys}
                sess.run(_optimizer, feed_dict=feeds)
                avg_cost += sess.run(_cost, feed_dict=feeds)
            avg_cost = avg_cost / total_batch
            if (epoch + 1) % display_steps == 0:
                print("Epoch: {} / {}, cost: {}".format(epoch, epochs, avg_cost))
                # Accuracy on the last training batch of this epoch
                feeds = {self.x: batch_xs, self.y: batch_ys}
                train_acc = sess.run(_accr, feed_dict=feeds)
                print("Train Accuracy: {}".format(train_acc))
                # Accuracy on the full test set
                feeds = {self.x: self.Mnsit.test.images, self.y: self.Mnsit.test.labels}
                test_acc = sess.run(_accr, feed_dict=feeds)
                print("Test Accuracy: {}".format(test_acc))
                print("-" * 50)
        sess.close()
Finally, create the main entry point and run the training:
if __name__ == "__main__":
    network = MutilClass()
    init, optimizer, cost, accr = network._back_propagation()
    network._train_model(init, optimizer, cost, accr)
The final training output is:
Epoch: 0 / 100, cost: 2.4407591546665537
Train Accuracy: 0.12999999523162842
Test Accuracy: 0.12960000336170197
--------------------------------------------------
Epoch: 1 / 100, cost: 2.290777679356662
Train Accuracy: 0.12999999523162842
Test Accuracy: 0.1469999998807907
--------------------------------------------------
Epoch: 2 / 100, cost: 2.2774649468335237
Train Accuracy: 0.17000000178813934
Test Accuracy: 0.21799999475479126
--------------------------------------------------
.......
--------------------------------------------------
Epoch: 98 / 100, cost: 0.7186844098567963
Train Accuracy: 0.8299999833106995
Test Accuracy: 0.8371999859809875
--------------------------------------------------
Epoch: 99 / 100, cost: 0.7124480505423112
Train Accuracy: 0.8100000023841858
Test Accuracy: 0.8377000093460083
--------------------------------------------------
The results show the cost decreasing steadily while accuracy on both the training and test sets keeps improving, ending at roughly 84% on the test set. The model can of course be retrained with different values of epochs and batch_size.
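As a sketch of such an experiment, the hard-coded values in _train_model could be lifted into keyword arguments and swept over (this assumes the method is extended accordingly; it is not in the original code):

# Sketch: retrain under different hyperparameters (assumes _train_model
# is extended to accept epochs and batch_size as keyword arguments)
for epochs, batch_size in [(50, 64), (100, 100), (200, 256)]:
    tf.reset_default_graph()  # start each run from a fresh graph
    network = MutilClass()
    init, optimizer, cost, accr = network._back_propagation()
    network._train_model(init, optimizer, cost, accr,
                         epochs=epochs, batch_size=batch_size)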