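From multiple regression to logistic regression:
Multiple linear regression:
Given a dependent variable y and a set of independent variables x1, x2, x3, ..., xn, where y is a continuous variable, we can fit a linear equation:
y = θ0 + θ1*x1 + θ2*x2 + θ3*x3 + ... + θn*xn, with range (-∞, +∞)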
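As a minimal sketch of the linear equation above, the snippet below evaluates θ0 + θ1*x1 + ... + θn*xn for one sample. The helper name linear_predict and the coefficient values are made up for illustration; it is written in plain Python so it runs without NumPy.

```python
def linear_predict(theta, x):
    """Return theta[0] + theta[1]*x[0] + ... + theta[n]*x[n-1]."""
    if len(theta) != len(x) + 1:
        raise ValueError("theta needs one more entry than x (the intercept)")
    return theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))


# Hypothetical coefficients and one sample with three features.
theta = [1.0, 2.0, -1.0, 0.5]
x = [3.0, 4.0, 2.0]
y_hat = linear_predict(theta, x)  # 1 + 2*3 - 1*4 + 0.5*2
print(y_hat)
```

The output is unbounded: scaling the features far enough pushes the result towards -∞ or +∞, which is exactly the range problem the next section deals with.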
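Logistic Regression (LR)
Definition:
Now suppose y can only take values within a limited range [ ].
Say y can only be +1 or 0. The right-hand side of the equation above is a continuous value with range (-∞, +∞), while the left-hand side is either 0 or +1, so the two sides cannot be matched directly.
This is in fact a classification problem (binary classification):
y = +1: positive instance
y = 0: negative instance
To make the ranges of the two sides of this multiple linear model match, we look for a function f(x) with the following properties:
1: f(x) maps any real number into (0, 1), i.e. its range is (0, 1)
2: f(x) is infinitely differentiable
This leads to the well-known Sigmoid function:
f(z) = 1 / (1 + e^(-z))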
Properties of f(z):
- Range: (0, 1)
- Derivative: f'(z) = f(z) * (1 - f(z)), i.e. f' = f * (1 - f), which is very cheap to compute
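The two properties above can be checked numerically. The sketch below is a plain-Python illustration, not the code used later in this post; the helper names sigmoid_grad and numeric_grad are my own. It compares f * (1 - f) with a central finite difference.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def sigmoid_grad(z):
    """Analytic derivative f'(z) = f(z) * (1 - f(z))."""
    f = sigmoid(z)
    return f * (1.0 - f)


def numeric_grad(z, h=1e-6):
    """Central finite-difference approximation of f'(z)."""
    return (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)


for z in (-5.0, -1.0, 0.0, 2.0, 6.0):
    print(z, sigmoid(z), sigmoid_grad(z), numeric_grad(z))
```

The last two printed columns should agree to many decimal places, and every value in the second column stays strictly between 0 and 1.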
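Logistic regression is simply linear regression with a logistic function applied on top; it is that simple:
y = θ0 + θ1*x1 + θ2*x2 + θ3*x3 + ... + θn*xn, with range (-∞, +∞)
f(y) = 1 / (1 + e^(-y)), with range (0, 1), read as the probability that the label is +1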
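To make the "linear model plus logistic function" idea concrete, here is a small sketch that chains the two steps and then thresholds the probability at 0.5. The function names, the threshold parameter, and the coefficients are hypothetical and chosen only for illustration.

```python
import math


def predict_proba(theta, x):
    """Linear score theta0 + theta1*x1 + ... squashed by the sigmoid."""
    z = theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(theta, x, threshold=0.5):
    """Class 1 if the predicted probability reaches the threshold, else class 0."""
    return 1 if predict_proba(theta, x) >= threshold else 0


theta = [-1.0, 2.0, 0.5]          # hypothetical intercept and two weights
sample_a = [1.0, 2.0]             # linear score z = -1 + 2 + 1 = 2
sample_b = [-1.0, 0.0]            # linear score z = -1 - 2 + 0 = -3
print(predict_proba(theta, sample_a), predict_label(theta, sample_a))
print(predict_proba(theta, sample_b), predict_label(theta, sample_b))
```

A threshold of 0.5 on the probability is the same as checking whether the linear score is non-negative, so the decision boundary is still a hyperplane.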
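Why not use the mean squared error loss?
With the Sigmoid function substituted in, the squared-error loss J(θ) is non-convex in θ and can have multiple local minima.
Gradient descent can then easily get stuck in a local optimum. Logistic regression therefore uses the cross-entropy (log) loss, which is convex in θ.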
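A quick numerical illustration of the non-convexity claim, under simplifying assumptions of my own: a single training sample with x = 1 and y = 1, and a scalar θ. The sketch estimates the second derivative of each loss with a finite difference. A negative value anywhere means the loss is not convex. It only demonstrates non-convexity; it does not by itself show multiple minima.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mse_loss(theta, x=1.0, y=1.0):
    """Squared error of the sigmoid output for one sample."""
    return (sigmoid(theta * x) - y) ** 2


def cross_entropy_loss(theta, x=1.0, y=1.0):
    """Log loss of the sigmoid output for one sample."""
    p = sigmoid(theta * x)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def second_difference(loss, theta, h=1e-3):
    """Finite-difference estimate of the second derivative of loss at theta."""
    return (loss(theta + h) - 2.0 * loss(theta) + loss(theta - h)) / (h * h)


for theta in (-3.0, -1.0, 0.0, 2.0):
    print(theta,
          second_difference(mse_loss, theta),
          second_difference(cross_entropy_loss, theta))
```

For this sample the squared-error curvature turns negative once the prediction is badly wrong (θ well below zero), while the cross-entropy curvature stays positive at every θ tried.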
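Building the logistic regression model:
-----------------------------------------------------------------------------
Example: classifying the Iris dataset:
It has 4 features: Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
All feature values are positive floats, in centimeters
Label (species): 0: setosa, 1: versicolor, 2: virginica
For binary classification we take the first 100 samples, which are species 0 and 1. The example below classifies setosa vs. versicolor.
-----------------------------------------------------------------------------
Setup:
A single-neuron model fed the whole training set in every epoch, implemented with hand-written functions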
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.datasets import load_iris  # Iris dataset loader
from sklearn.model_selection import train_test_split
from time import time
iris = load_iris()  # load the Iris dataset
x = iris.data  # features
y = iris.target  # labels
# Binary classification: keep species 0 and 1
x = x[:100]  # first 100 samples of the dataset
y = y[:100]
# Split into training and test sets; test_size sets the test share to 20%
traing_data, test_data, traing_label, test_label = train_test_split(x, y, test_size=0.2)
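# Standardization
def Data_Standard1(data):
    """
    data: the dataset to standardize
    """
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    res = (data - mean) / std
    return res
# Standardize the feature values
# Note: the test set is standardized with its own mean and std here;
# reusing the training-set statistics is the more common practice.
traing_data_st = Data_Standard1(traing_data)
test_data_st = Data_Standard1(test_data)
traing_label = traing_label.reshape([traing_label.shape[0], 1])
test_label = test_label.reshape([test_label.shape[0], 1])
n_traing = traing_data_st.shape[0]  # number of training samples
n_feature = traing_data_st.shape[1]  # number of features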
# Hyperparameters:
learing_rate = 0.01
traing_epoch = 2000  # number of training epochs
loss_histroy = []  # loss history
# Trainable parameters
w = tf.Variable(np.zeros([n_feature, 1]))
def sigmoid(x):
    # 1 / (1 + e^(-x)); the exponent must be negated
    return 1.0 / (1 + np.exp(-x))
def model(data, w):
    res = sigmoid(tf.matmul(data, w))
    return res
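# Start training
start_Time = time()
for epoch in range(traing_epoch + 1):
    predict = model(traing_data_st, w)
    # Cross-entropy loss: note the parentheses around (1 - traing_label)
    loss = -(1 / n_traing) * np.sum(traing_label * np.log(predict) + (1 - traing_label) * np.log(1 - predict))
    loss_histroy.append(loss)
    error = predict - traing_label  # prediction error for each sample
    # Gradient of the summed loss: X^T * error. Use the transpose here;
    # reshape([4, n_traing]) would scramble samples and features.
    gradiet = np.dot(traing_data_st.T, error)
    w = w - learing_rate * gradiet  # parameter update
# Show the run time
duration = time() - start_Time
print("Train Finished take %5f " % (duration))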
# Prediction
def get_accuracy(data, lable):
    '''
    data: test data
    lable: test labels
    '''
    pred = []
    pre = model(data, w)  # predicted probabilities
    for i in range(len(data)):
        if pre[i] >= 0.5:  # class 1
            pred.append(1)
        else:
            pred.append(0)  # class 0
    t = np.array(pred, dtype=lable.dtype)  # match the label dtype so tf.equal can compare them
    pred = t.reshape([t.shape[0], 1])
    predciton_result = tf.equal(pred, lable)
    accuracy = tf.reduce_mean(tf.cast(predciton_result, tf.float32))
    return accuracy, pred
test_accuracy, pred = get_accuracy(test_data_st, test_label)
print(test_accuracy.numpy())
print("Labels:", test_label.T)
print('Predictions:', pred.T)
import matplotlib.pyplot as plt
plt.plot(loss_histroy)
This was the best run I got. The program has a small bug in the prediction function, in how classes 0 and 1 are assigned.
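Because sign and shape mistakes are easy to make in a hand-written training loop like the one above, a gradient check is a cheap safeguard. The sketch below is my own plain-Python illustration with a tiny made-up dataset, not part of the original program. It compares the analytic gradient of the mean cross-entropy loss, X^T (p - y) / n, against a finite-difference estimate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(X, w):
    """Predicted probability for each row of X."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, row))) for row in X]


def bce_loss(X, y, w):
    """Mean binary cross-entropy."""
    p = predict(X, w)
    return -sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
                for yi, pi in zip(y, p)) / len(y)


def bce_grad(X, y, w):
    """Analytic gradient of the mean loss: X^T (p - y) / n."""
    p = predict(X, w)
    return [sum((pi - yi) * row[j] for pi, yi, row in zip(p, y, X)) / len(y)
            for j in range(len(w))]


def numeric_grad(X, y, w, h=1e-6):
    """Central finite-difference gradient, one coordinate at a time."""
    grad = []
    for j in range(len(w)):
        w_plus = list(w)
        w_plus[j] += h
        w_minus = list(w)
        w_minus[j] -= h
        grad.append((bce_loss(X, y, w_plus) - bce_loss(X, y, w_minus)) / (2 * h))
    return grad


# Tiny made-up dataset: four samples, two features.
X = [[0.5, -1.2], [1.5, 0.3], [-0.7, 0.8], [-1.1, -0.4]]
y = [1, 1, 0, 0]
w = [0.1, -0.2]
print(bce_grad(X, y, w))
print(numeric_grad(X, y, w))

# One gradient-descent step should lower the loss.
w_next = [wj - 0.1 * gj for wj, gj in zip(w, bce_grad(X, y, w))]
print(bce_loss(X, y, w), bce_loss(X, y, w_next))
```

If the sigmoid used exp(x) instead of exp(-x), or the loss had misplaced parentheses, the two printed gradients would no longer agree.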
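Multi-class logistic regression with softmax:
-----------------------------------------------------------------------------
Example: handwritten digit image dataset
The dataset contains:
traing_image: 60000 grayscale images, 28*28
traing_label: 60000 digits, 0-9
test_image: 10000 grayscale images, 28*28
test_label: 10000 digits, 0-9
The label is the digit n actually shown in the image, n in [0-9]
Implemented in several different ways:
1: Single-neuron model, fed the whole training set in every epoch (hand-written functions)
2: Each epoch's data fed in batch by batch (hand-written functions)
3: A neural network with one hidden layer, fed in batch by batch
4: A multi-layer network (2 hidden layers), fed the whole training set each time; more layers can be stacked the same way
5: Keras implementation (the highest accuracy: my best hand-written version reached a little over 93%, while Keras went straight to over 99%. Impressive.)
----------------------------------------------------------------------------------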
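Before the full listings, here is a minimal sketch of what the softmax step computes for one sample. The implementations below call tf.nn.softmax; this plain-Python version is only an illustration, and the max-subtraction trick for numerical stability is a common choice on my part rather than something the listings do explicitly.

```python
import math


def softmax(logits):
    """Map a list of scores to probabilities that sum to 1."""
    m = max(logits)                           # subtracting the max avoids overflow
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))

# With two classes and scores [z, 0], softmax reduces to the sigmoid of z.
z = 1.3
print(softmax([z, 0.0])[0], 1.0 / (1.0 + math.exp(-z)))
```

The predicted class is the index of the largest probability, which is what tf.argmax(pred, 1) does for a whole batch in the listings.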
1: Single-neuron model, fed the whole training set in every epoch
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
from tensorflow import keras
from time import time
# Download the dataset
"""
The dataset contains:
Training data traing_image: 60000 grayscale images, 28*28    test_image: 10000, 28*28
traing_label: 60000 actual digits n in [0-9]    test_label: 10000, n in [0-9]
"""
mnist = tf.keras.datasets.mnist
(train_image, train_lable), (test_image, test_lable) = mnist.load_data()  # load the dataset
num_train = train_image.shape[0]  # number of training samples
num_test = test_image.shape[0]  # number of test samples
# Normalization: each of the 28 * 28 = 784 pixels is treated as a feature, so one image has 784 features
train_image_reg = (train_image / 255.0).reshape([num_train, 784])
test_image_reg = (test_image / 255.0).reshape([num_test, 784])
# One-hot encoding
train_lable_ohot = tf.one_hot(train_lable, depth=10).numpy()
test_lable_ohot = tf.one_hot(test_lable, depth=10).numpy()
# Hyperparameters
train_epochs = 2000  # number of training epochs
learning_rate = 0.01  # learning rate
# Initialize the trainable parameters
#w = tf.Variable(np.random.normal(0,1,[784,10]))
w = tf.Variable(np.zeros([784, 10]))
b = tf.Variable(np.zeros([10]))
optimizer = tf.keras.optimizers.SGD(learning_rate)  # stochastic gradient descent (SGD) optimizer
# Cross-entropy
def loss_Function(pred, y_lable):
    tmp = y_lable * tf.math.log(pred)
    t = tf.reduce_sum(tmp, axis=1)
    res = tf.reduce_mean(-t)
    return res
# Loss history
loss_history = []
# Start training
start_Time = time()
for epoch in range(train_epochs + 1):
    with tf.GradientTape() as tape:
        tape.watch([w, b])
        forward = tf.matmul(train_image_reg, w) + b
        pred = tf.nn.softmax(forward)  # softmax classifier
        loss = loss_Function(pred, train_lable_ohot)
    loss_history.append(loss)
    gradients = tape.gradient(target=loss, sources=[w, b])
    optimizer.apply_gradients(zip(gradients, [w, b]))  # apply the computed gradients
    if epoch % 20 == 0:
        correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(train_lable_ohot, 1))  # whether each prediction is correct
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))  # accuracy
        print('epoch: %d loss : %f accuracy : %f' % (epoch, loss, accuracy))
duration = time() - start_Time
print("Train Finished take %5f " % (duration))
# Check the predictions
def get_accuracy(image, lable):
    forward = tf.matmul(image, w) + b
    pred = tf.nn.softmax(forward)
    predciton = tf.argmax(pred, 1)
    predciton_result = tf.equal(predciton, tf.argmax(lable, 1))
    accuracy = tf.reduce_mean(tf.cast(predciton_result, tf.float32))
    return accuracy, predciton
accuracy, predciton = get_accuracy(test_image_reg, test_lable_ohot)
print('Actual digits:', test_lable[0:20])
print('Predicted digits:', predciton[0:20].numpy())
print("Accuracy:", accuracy.numpy())
# Visualize the results
def plot_image_prediction(image,
                          lable,
                          prediction,
                          index, num=10):
    fig = plt.gcf()
    fig.set_size_inches(10, 12)
    if num > 10:  # the 2 x 5 grid below holds at most 10 images
        num = 10
    for i in range(0, num):
        ax = plt.subplot(2, 5, i + 1)
        ax.imshow(np.reshape(image[index], (28, 28)), cmap='binary')
        title = "lable =" + str(np.argmax(lable[index]))
        if len(prediction) > 0:
            title += ",predict =" + str(prediction[index].numpy())
        ax.set_title(title, fontsize=10)
        ax.set_xticks([])
        ax.set_yticks([])
        index += 1
    plt.show()
plot_image_prediction(test_image, test_lable_ohot, predciton, 5, 10)
plt.plot(loss_history)
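One caveat about the loss_Function above: it takes the log of the softmax output, so a probability that underflows to 0 gives log(0). The later listings switch to tf.nn.softmax_cross_entropy_with_logits, which works from the raw scores instead. The sketch below is my own plain-Python illustration of that idea using the log-sum-exp trick; the function names are hypothetical and it is not how TensorFlow is implemented internally.

```python
import math


def log_softmax(logits):
    """log(softmax(logits)) computed without ever forming a zero probability."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]


def cross_entropy(logits, one_hot):
    """Cross-entropy of one sample from raw scores and a one-hot label."""
    return -sum(t * l for t, l in zip(one_hot, log_softmax(logits)))


def mean_cross_entropy(batch_logits, batch_one_hot):
    """Average cross-entropy over a batch, like tf.reduce_mean in the listing."""
    losses = [cross_entropy(l, t) for l, t in zip(batch_logits, batch_one_hot)]
    return sum(losses) / len(losses)


# A confident wrong prediction: softmax would round the true class to 0.
extreme = cross_entropy([0.0, 800.0], [1, 0])
uniform = cross_entropy([0.0, 0.0, 0.0, 0.0], [0, 0, 1, 0])
print(extreme, uniform, math.log(4))
```

With all-zero weights, as in the listing's initialization, every class gets probability 1/10, so the first reported loss should be close to log(10).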
2: Each epoch's data fed in batch by batch
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
from tensorflow import keras
from time import time
# Download the dataset
"""
The dataset contains:
Training data traing_image: 60000 grayscale images, 28*28    test_image: 10000, 28*28
traing_label: 60000 actual digits n in [0-9]    test_label: 10000, n in [0-9]
"""
mnist = tf.keras.datasets.mnist
(train_image, train_lable), (test_image, test_lable) = mnist.load_data()  # load the dataset
num_train = train_image.shape[0]  # number of training samples
num_test = test_image.shape[0]  # number of test samples
# Normalization: each of the 28 * 28 = 784 pixels is treated as a feature, so one image has 784 features
train_image_reg = (train_image / 255.0).reshape([num_train, 784])
test_image_reg = (test_image / 255.0).reshape([num_test, 784])
# One-hot encoding
train_lable_ohot = tf.one_hot(train_lable, depth=10).numpy()
test_lable_ohot = tf.one_hot(test_lable, depth=10).numpy()
# Hyperparameters
train_epochs = 2000  # number of training epochs
batch_size = 10000  # number of samples per batch within an epoch
learning_rate = 0.02
# Trainable parameters
w = tf.Variable(np.zeros([784, 10]))
b = tf.Variable(np.zeros([10]))
optimizer = tf.keras.optimizers.SGD(learning_rate)  # stochastic gradient descent (SGD) optimizer
# Cross-entropy
def loss_Function(pred, y_lable):
    tmp = y_lable * tf.math.log(pred)
    t = tf.reduce_sum(tmp, axis=1)
    res = tf.reduce_mean(-t)
    return res
loss_history = []
# Start training
start_Time = time()
for epoch in range(train_epochs):
    index = list(range(num_train))
    for i in range(0, num_train, batch_size):
        j = np.array(index[i:min(i + batch_size, num_train)])
        x, y = train_image_reg[j], train_lable_ohot[j]
        with tf.GradientTape() as tape:
            forward = tf.matmul(x, w) + b
            pred = tf.nn.softmax(forward)  # softmax classifier
            loss = loss_Function(pred, y)
        gradients = tape.gradient(target=loss, sources=[w, b])
        optimizer.apply_gradients(zip(gradients, [w, b]))  # apply the computed gradients
    if epoch % 20 == 0:
        # Evaluate on the full training set
        forward_total = tf.matmul(train_image_reg, w) + b
        pred_total = tf.nn.softmax(forward_total)  # softmax classifier
        loss = loss_Function(pred_total, train_lable_ohot)
        loss_history.append(loss)
        correct_prediction = tf.equal(tf.argmax(pred_total, 1), tf.argmax(train_lable_ohot, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        print('epoch: %d loss : %f accuracy : %f' % (epoch, loss, accuracy))
duration = time() - start_Time
print("Train Finished take %5f " % (duration))
# Check the predictions
def get_accuracy(image, lable):
    forward = tf.matmul(image, w) + b
    pred = tf.nn.softmax(forward)
    predciton = tf.argmax(pred, 1)
    predciton_result = tf.equal(predciton, tf.argmax(lable, 1))
    accuracy = tf.reduce_mean(tf.cast(predciton_result, tf.float32))
    return accuracy, predciton
accuracy, predciton = get_accuracy(test_image_reg, test_lable_ohot)
print('Actual digits:', test_lable[0:20])
print('Predicted digits:', predciton[0:20].numpy())
print("Accuracy:", accuracy.numpy())
# Visualize the predictions
import matplotlib.pyplot as plt
def plot_image_prediction(image,
                          lable,
                          prediction,
                          index, num=10):
    fig = plt.gcf()
    fig.set_size_inches(10, 12)
    if num > 10:  # the 2 x 5 grid below holds at most 10 images
        num = 10
    for i in range(0, num):
        ax = plt.subplot(2, 5, i + 1)
        ax.imshow(np.reshape(image[index], (28, 28)), cmap='binary')
        title = "lable =" + str(np.argmax(lable[index]))
        if len(prediction) > 0:
            title += ",predict =" + str(prediction[index].numpy())
        ax.set_title(title, fontsize=10)
        ax.set_xticks([])
        ax.set_yticks([])
        index += 1
    plt.show()
plot_image_prediction(test_image, test_lable_ohot, predciton, 5, 10)
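The batch loop in this listing walks through the training indices in the same fixed order every epoch. A common variation, which is an assumption on my part and not what the listing does, is to shuffle the indices at the start of each epoch. A minimal sketch of such a batch generator in plain Python, with a hypothetical name and a small sample count for illustration:

```python
import random


def iterate_minibatches(num_samples, batch_size, rng):
    """Yield lists of sample indices covering the data once, in shuffled order."""
    indices = list(range(num_samples))
    rng.shuffle(indices)
    for start in range(0, num_samples, batch_size):
        # Slicing past the end is safe, so the last batch may be smaller.
        yield indices[start:start + batch_size]


rng = random.Random(0)  # fixed seed so runs are repeatable
batches = list(iterate_minibatches(10, 4, rng))
for batch in batches:
    print(batch)
```

In the listing, each yielded index list would take the place of j, for example x, y = train_image_reg[j], train_lable_ohot[j] after converting j with np.array.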
3: A neural network with one hidden layer, fed in batch by batch
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
from tensorflow import keras
from time import time
# Download the dataset
"""
The dataset contains:
Training data traing_image: 60000 grayscale images, 28*28    test_image: 10000, 28*28
traing_label: 60000 actual digits n in [0-9]    test_label: 10000, n in [0-9]
"""
mnist = tf.keras.datasets.mnist
(train_image, train_lable), (test_image, test_lable) = mnist.load_data()  # load the dataset
num_train = train_image.shape[0]  # number of training samples
num_test = test_image.shape[0]  # number of test samples
# Normalization: each of the 28 * 28 = 784 pixels is treated as a feature, so one image has 784 features
train_image_reg = (train_image / 255.0).reshape([num_train, 784])
test_image_reg = (test_image / 255.0).reshape([num_test, 784])
# One-hot encoding
train_lable_ohot = tf.one_hot(train_lable, depth=10).numpy()
test_lable_ohot = tf.one_hot(test_lable, depth=10).numpy()
def loss_Function2(forward, y):
    # Takes raw logits; the softmax is applied inside the TensorFlow op
    res = tf.nn.softmax_cross_entropy_with_logits(logits=forward, labels=y)
    return tf.reduce_mean(res)
# Hyperparameters
train_epochs = 1000  # number of training epochs
batch_size = 30000  # number of samples per batch within an epoch
learning_rate = 0.01
optimizer = tf.keras.optimizers.SGD(learning_rate)  # stochastic gradient descent (SGD) optimizer
# Hidden layer
H1_NN = 256
w1 = tf.Variable(np.random.normal(0, 1, [784, H1_NN]))
#w1 = tf.Variable(np.zeros([784,H1_NN]))
b1 = tf.Variable(np.zeros([H1_NN]))
# Output layer
w2 = tf.Variable(np.zeros([H1_NN, 10]))
b2 = tf.Variable(np.zeros([10]))
loss_history = []
# Start training
start_Time = time()
for epoch in range(train_epochs + 1):
    index = list(range(num_train))
    for i in range(0, num_train, batch_size):
        j = np.array(index[i:min(i + batch_size, num_train)])
        x, y = train_image_reg[j], train_lable_ohot[j]
        with tf.GradientTape() as tape:
            tape.watch([w1, b1, w2, b2])
            y1 = tf.nn.relu(tf.matmul(x, w1) + b1)
            forward = tf.matmul(y1, w2) + b2
            loss = loss_Function2(forward, y)  # the loss takes raw logits (no softmax applied)
            pred = tf.nn.softmax(forward)
        gradients = tape.gradient(target=loss, sources=[w1, b1, w2, b2])
        optimizer.apply_gradients(zip(gradients, [w1, b1, w2, b2]))  # apply the computed gradients
    if epoch % 10 == 0:
        # Evaluate on the full training set
        y1_total = tf.nn.relu(tf.matmul(train_image_reg, w1) + b1)
        forward_total = tf.matmul(y1_total, w2) + b2
        loss = loss_Function2(forward_total, train_lable_ohot)
        pred_total = tf.nn.softmax(forward_total)
        loss_history.append(loss)
        correct_prediction = tf.equal(tf.argmax(pred_total, 1), tf.argmax(train_lable_ohot, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        print('epoch: %d loss : %f accuracy : %f' % (epoch, loss, accuracy))
# Show the run time
duration = time() - start_Time
print("Train Finished take %5f " % (duration))
# Check the predictions
def get_accuracy(image, lable):
    # Apply the same ReLU as in training; without it the test-time network differs from the trained one
    y1 = tf.nn.relu(tf.matmul(image, w1) + b1)
    forward = tf.matmul(y1, w2) + b2
    pred = tf.nn.softmax(forward)
    predciton = tf.argmax(pred, 1)
    predciton_result = tf.equal(predciton, tf.argmax(lable, 1))
    accuracy = tf.reduce_mean(tf.cast(predciton_result, tf.float32))
    return accuracy, predciton
accuracy, predciton = get_accuracy(test_image_reg, test_lable_ohot)
print('Actual values, first 20:', test_lable[0:20])
print('Predicted values, first 20:', predciton[0:20].numpy())
4: A multi-layer network (2 hidden layers), fed the whole training set each time
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
from tensorflow import keras
from time import time
# Download the dataset
"""
The dataset contains:
Training data traing_image: 60000 grayscale images, 28*28    test_image: 10000, 28*28
traing_label: 60000 actual digits n in [0-9]    test_label: 10000, n in [0-9]
"""
mnist = tf.keras.datasets.mnist
(train_image, train_lable), (test_image, test_lable) = mnist.load_data()  # load the dataset
num_train = train_image.shape[0]  # number of training samples
num_test = test_image.shape[0]  # number of test samples
# Normalization: each of the 28 * 28 = 784 pixels is treated as a feature, so one image has 784 features
train_image_reg = (train_image / 255.0).reshape([num_train, 784])
test_image_reg = (test_image / 255.0).reshape([num_test, 784])
# One-hot encoding
train_lable_ohot = tf.one_hot(train_lable, depth=10).numpy()
test_lable_ohot = tf.one_hot(test_lable, depth=10).numpy()
train_epochs = 50  # number of training epochs
learning_rate = 0.02  # learning rate
# Build the model
H1_NN = 256  # number of neurons in the first hidden layer
H2_NN = 128  # number of neurons in the second hidden layer
#input_layer
"""
w1 = tf.Variable(tf.random.truncated_normal([784,H1_NN],stddev=0.1))
w1 =tf.cast(w1,dtype=np.float64)
b1 = tf.Variable(tf.zeros([H1_NN]))
b1 = tf.cast(b1,dtype=np.float64)
#first layer
#print(w1.shape,w1.dtype,b1.dtype)
w2 = tf.Variable(tf.random.truncated_normal([H1_NN,H2_NN],stddev=0.1))
w2 =tf.cast(w2,dtype=np.float64)
b2 = tf.Variable(tf.zeros([H2_NN]))
b2 = tf.cast(b2,dtype=np.float64)
#second layer
w3 = tf.Variable(tf.random.truncated_normal([H2_NN,10],stddev=0.1))
w3 =tf.cast(w3,dtype=np.float64)
b3 = tf.Variable(np.zeros([10]))
b3 = tf.cast(b3,dtype=np.float64)
"""
w1 = tf.Variable(np.random.normal(0, 1, [784, H1_NN]))
b1 = tf.Variable(np.zeros([H1_NN]))
w2 = tf.Variable(np.random.normal(0, 1, [H1_NN, H2_NN]))
b2 = tf.Variable(np.zeros([H2_NN]))
w3 = tf.Variable(np.zeros([H2_NN, 10]))
b3 = tf.Variable(np.zeros([10]))
optimizer = tf.keras.optimizers.SGD(learning_rate)  # stochastic gradient descent (SGD) optimizer
# Cross-entropy
def loss_Function3(forward, y):
    # Takes raw logits; the softmax is applied inside the TensorFlow op
    res = tf.nn.softmax_cross_entropy_with_logits(logits=forward, labels=y)
    return tf.reduce_mean(res)
# Loss history
loss_history = []
# Start training
start_Time = time()
for epoch in range(train_epochs):
    with tf.GradientTape() as tape:
        tape.watch([w1, b1, w2, b2, w3, b3])
        y1 = tf.nn.relu(tf.matmul(train_image_reg, w1) + b1)
        y2 = tf.nn.relu(tf.matmul(y1, w2) + b2)
        forward = tf.matmul(y2, w3) + b3
        loss = loss_Function3(forward, train_lable_ohot)  # the loss takes raw logits (no softmax applied)
        pred = tf.nn.softmax(forward)
    gradients = tape.gradient(target=loss, sources=[w1, b1, w2, b2, w3, b3])
    optimizer.apply_gradients(zip(gradients, [w1, b1, w2, b2, w3, b3]))  # apply the computed gradients
    if epoch % 5 == 0:
        loss_history.append(loss)
        correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(train_lable_ohot, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        print('epoch: %d loss : %f accuracy : %f' % (epoch, loss, accuracy))
# Show the run time
duration = time() - start_Time
print("Train Finished take %5f " % (duration))
# Check the predictions
def get_accuracy(image, lable):
    y1 = tf.nn.relu(tf.matmul(image, w1) + b1)
    y2 = tf.nn.relu(tf.matmul(y1, w2) + b2)
    forward = tf.matmul(y2, w3) + b3
    loss = loss_Function3(forward, lable)  # the loss takes raw logits (no softmax applied)
    pred = tf.nn.softmax(forward)
    predciton = tf.argmax(pred, 1)
    predciton_result = tf.equal(predciton, tf.argmax(lable, 1))
    accuracy = tf.reduce_mean(tf.cast(predciton_result, tf.float32))
    return accuracy, predciton
accuracy, predciton = get_accuracy(test_image_reg, test_lable_ohot)
print('Predicted values, first 20:', predciton[0:20].numpy())
# Compare against the test labels, since the predictions come from the test set
print('Actual values, first 20:', test_lable[0:20])
print(accuracy.numpy())
5: Keras implementation
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import datasets, layers, optimizers
# Load the dataset
mnist = tf.keras.datasets.mnist
(train_image, train_label), (test_image, test_label) = mnist.load_data()
train_image, test_image = train_image / 255.0, test_image / 255.0
# Create the model
def create_model():
    return tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation='relu'),
        tf.keras.layers.Dropout(0.1),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
model = create_model()
# Optimizer and loss function
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# Train the model (the variable names must match those returned by load_data above)
model.fit(x=train_image,
          y=train_label,
          epochs=10,
          )
# Evaluate the model
loss, acc = model.evaluate(test_image, test_label)
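Unlike the hand-written versions, the Keras listing passes integer labels and uses the 'sparse_categorical_crossentropy' loss, so no one-hot encoding step is needed. The sketch below is a plain-Python illustration with hypothetical helper names, not the Keras implementation. It shows that for one sample both forms give the same value: the one-hot sum simply picks out the log-probability of the true class.

```python
import math


def one_hot(label, depth):
    """Encode an integer class label as a one-hot list."""
    return [1.0 if i == label else 0.0 for i in range(depth)]


def categorical_ce(probs, target_one_hot):
    """Cross-entropy with a one-hot target, as in the hand-written listings."""
    return -sum(t * math.log(p) for t, p in zip(target_one_hot, probs))


def sparse_categorical_ce(probs, label):
    """Cross-entropy with an integer target: minus the log-probability of the true class."""
    return -math.log(probs[label])


probs = [0.1, 0.7, 0.2]   # made-up softmax output for one sample
label = 1
dense_loss = categorical_ce(probs, one_hot(label, 3))
sparse_loss = sparse_categorical_ce(probs, label)
print(dense_loss, sparse_loss)
```

Both printed values equal -log(0.7), so the choice between the two forms is about label format and memory, not about the quantity being minimized.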
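References:
I am a beginner myself, not an expert. This is not modesty: China is full of talented people, and I sincerely feel I fall short of them. Still, there is a group of people who share selflessly and write really good blog posts. I found that reading blogs teaches me a lot, which I then summarize for myself. A new way of learning! This content is purely a summary of what I have learned, meant to deepen my understanding. Experts, please skip it!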