Hung-yi Lee (2020) Homework 3 - hw3_CNN

Dataset link: https://pan.baidu.com/s/1OYqpIQ4N57RY2UjTf8OvOg

Extraction code: csdn

Dataset introduction


The dataset contains a training set, a validation set, and a test set.

The training and validation sets are labeled; the test set is not.


Test images are named only by index; the names carry no label information.


Training and validation images are named following the pattern "class_index".

This is an 11-class classification problem.
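As a quick illustration of the "class_index" naming rule, the class label can be recovered by splitting the filename on the underscore. A minimal sketch (the filenames here are hypothetical examples):

```python
# Recover the integer class label from a "class_index.jpg" filename.
def label_from_filename(filename: str) -> int:
    return int(filename.split("_")[0])

print(label_from_filename("3_0012.jpg"))   # class 3
print(label_from_filename("10_0457.jpg"))  # class 10
```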

Homework 3 - Convolutional Neural Network

!gdown --id '19CzXudqN58R3D-1G8KeFWk8UDQwlb8is' --output food-11.zip # download the dataset
!unzip food-11.zip # unzip it
Downloading...
From: https://drive.google.com/uc?id=19CzXudqN58R3D-1G8KeFWk8UDQwlb8is
To: /content/food-11.zip
100% 1.16G/1.16G [00:06<00:00, 184MB/s]
Archive: food-11.zip
replace food-11/testing/0071.jpg? [y]es, [n]o, [A]ll, [N]one, [r]ename:
# Import the required packages
import os
import time
import numpy as np
import cv2
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import pandas as pd
from torch.utils.data import DataLoader, Dataset

Read image

Use OpenCV (cv2) to read the images and store them in a numpy array.

def readfile(path, label):
    # "label" is a boolean indicating whether to return the labels y
    image_dir = sorted(os.listdir(path))
    x = np.zeros((len(image_dir), 128, 128, 3), dtype=np.uint8)
    y = np.zeros((len(image_dir)), dtype=np.uint8)
    for i, file in enumerate(image_dir):
        img = cv2.imread(os.path.join(path, file))
        x[i, :, :] = cv2.resize(img, (128, 128))
        if label:
            y[i] = int(file.split("_")[0])  # the label comes from the filename
    if label:
        return x, y
    else:
        return x
# Read in the training set, validation set, and testing set with readfile
workspace_dir = './food-11'
print("Reading data")
train_x, train_y = readfile(os.path.join(workspace_dir, "training"), True)
print("Size of training data = {}".format(len(train_x)))
val_x, val_y = readfile(os.path.join(workspace_dir, "validation"), True)
print("Size of validation data = {}".format(len(val_x)))
test_x = readfile(os.path.join(workspace_dir, "testing"), False)
print("Size of Testing data = {}".format(len(test_x)))
Reading data
Size of training data = 9866
Size of validation data = 3430
Size of Testing data = 3347





A peek at train_y:

array([0, 0, 0, ..., 9, 9, 9], dtype=uint8)

Dataset

In PyTorch we can use torch.utils.data's Dataset and DataLoader to wrap the data, which makes the subsequent training and testing more convenient.

A Dataset subclass needs to override two functions: __len__ and __getitem__.

__len__ must return the size of the dataset, while __getitem__ defines how the dataset returns data when it is indexed with [ ].

In practice we never call these two functions directly ourselves, but DataLoader uses them when enumerating the Dataset; if they are not implemented, an error is raised at runtime.
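The same protocol can be seen in plain Python, independent of PyTorch: defining __len__ and __getitem__ is exactly what makes len() and [ ] indexing (and hence DataLoader-style enumeration) work. A minimal sketch with a hypothetical Squares class:

```python
class Squares:
    """A minimal dataset-like object: holds the first n square numbers."""
    def __init__(self, n):
        self.n = n

    def __len__(self):
        # len(obj) calls this
        return self.n

    def __getitem__(self, index):
        # obj[index] calls this; raising IndexError ends iteration
        if index >= self.n:
            raise IndexError(index)
        return index * index

ds = Squares(5)
print(len(ds))   # 5
print(ds[3])     # 9
print(list(ds))  # iteration also works via __getitem__: [0, 1, 4, 9, 16]
```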

# Data augmentation at training time
train_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomHorizontalFlip(),  # randomly flip the image horizontally
    transforms.RandomRotation(15),      # randomly rotate the image
    transforms.ToTensor(),              # convert to a Tensor and normalize the values to [0, 1]
])
# No data augmentation at testing time
test_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.ToTensor(),
])
class ImgDataset(Dataset):
    def __init__(self, x, y=None, transform=None):
        self.x = x
        # the label is required to be a LongTensor
        self.y = y
        if y is not None:
            self.y = torch.LongTensor(y)
        self.transform = transform

    def __len__(self):
        return len(self.x)

    def __getitem__(self, index):
        X = self.x[index]
        if self.transform is not None:
            X = self.transform(X)
        if self.y is not None:
            Y = self.y[index]
            return X, Y
        else:
            return X

batch_size = 128
train_set = ImgDataset(train_x, train_y, train_transform)
val_set = ImgDataset(val_x, val_y, test_transform)
train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_set, batch_size=batch_size, shuffle=False)

Model

To better understand the network, its structure is visualized below.

[Figure: CNN architecture diagram]

The dimensions at each stage of the network:

[Figure: layer dimensions]
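The layer shapes annotated in the code can be verified with the standard conv/pool size formula, out = floor((in + 2*padding - kernel) / stride) + 1. A quick sketch in plain Python (no torch needed):

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output spatial size of a Conv2d / MaxPool2d layer."""
    return (size + 2 * padding - kernel) // stride + 1

size = 128  # input is [3, 128, 128]
for _ in range(5):                  # five conv + pool blocks
    size = conv_out(size, 3, 1, 1)  # 3x3 conv, stride 1, padding 1: size unchanged
    size = conv_out(size, 2, 2, 0)  # 2x2 max-pool, stride 2: size halved
print(size)  # 4, matching the [512, 4, 4] input to the fully connected layers
```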

class Classifier(nn.Module):
    def __init__(self):
        super(Classifier, self).__init__()
        # torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)
        # torch.nn.MaxPool2d(kernel_size, stride, padding)
        # input dimensions [3, 128, 128]
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 64, 3, 1, 1),    # [64, 128, 128]
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),        # [64, 64, 64]

            nn.Conv2d(64, 128, 3, 1, 1),  # [128, 64, 64]
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),        # [128, 32, 32]

            nn.Conv2d(128, 256, 3, 1, 1), # [256, 32, 32]
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),        # [256, 16, 16]

            nn.Conv2d(256, 512, 3, 1, 1), # [512, 16, 16]
            nn.BatchNorm2d(512),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),        # [512, 8, 8]

            nn.Conv2d(512, 512, 3, 1, 1), # [512, 8, 8]
            nn.BatchNorm2d(512),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),        # [512, 4, 4]
        )
        self.fc = nn.Sequential(
            nn.Linear(512*4*4, 1024),
            nn.ReLU(),
            nn.Linear(1024, 512),
            nn.ReLU(),
            nn.Linear(512, 11)
        )

    def forward(self, x):
        out = self.cnn(x)
        out = out.view(out.size()[0], -1)
        return self.fc(out)

Training

Train on the training set, and use the validation set to find good hyperparameters.

model = Classifier().cuda()
loss = nn.CrossEntropyLoss()  # this is a classification task, so use cross-entropy loss
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # use Adam as the optimizer
num_epoch = 30

for epoch in range(num_epoch):
    epoch_start_time = time.time()
    train_acc = 0.0
    train_loss = 0.0
    val_acc = 0.0
    val_loss = 0.0

    model.train()  # make sure the model is in train mode (enables Dropout etc.)
    for i, data in enumerate(train_loader):
        optimizer.zero_grad()  # zero the gradients of the model parameters
        train_pred = model(data[0].cuda())  # forward pass; this calls model's forward function
        batch_loss = loss(train_pred, data[1].cuda())  # compute loss (prediction and label must both be on CPU or both on GPU)
        batch_loss.backward()  # compute each parameter's gradient via backpropagation
        optimizer.step()  # update the parameters with the gradients

        train_acc += np.sum(np.argmax(train_pred.cpu().data.numpy(), axis=1) == data[1].numpy())
        train_loss += batch_loss.item()

    model.eval()
    with torch.no_grad():  # no gradients are needed for validation, so skip building the computation graph
        for i, data in enumerate(val_loader):
            val_pred = model(data[0].cuda())
            batch_loss = loss(val_pred, data[1].cuda())

            val_acc += np.sum(np.argmax(val_pred.cpu().data.numpy(), axis=1) == data[1].numpy())
            val_loss += batch_loss.item()

    # print the results
    print('[%03d/%03d] %2.2f sec(s) Train Acc: %3.6f Loss: %3.6f | Val Acc: %3.6f loss: %3.6f' % \
          (epoch + 1, num_epoch, time.time()-epoch_start_time, \
           train_acc/train_set.__len__(), train_loss/train_set.__len__(), val_acc/val_set.__len__(), val_loss/val_set.__len__()))


[001/030] 44.33 sec(s) Train Acc: 0.233732 Loss: 0.017737 | Val Acc: 0.300875 loss: 0.015240
[002/030] 44.51 sec(s) Train Acc: 0.340564 Loss: 0.014769 | Val Acc: 0.349271 loss: 0.015372
[003/030] 45.06 sec(s) Train Acc: 0.401885 Loss: 0.013530 | Val Acc: 0.347522 loss: 0.015401
[004/030] 45.24 sec(s) Train Acc: 0.430874 Loss: 0.012935 | Val Acc: 0.328863 loss: 0.016378
[005/030] 45.30 sec(s) Train Acc: 0.475269 Loss: 0.012048 | Val Acc: 0.413120 loss: 0.012980
[006/030] 45.33 sec(s) Train Acc: 0.507095 Loss: 0.011106 | Val Acc: 0.454519 loss: 0.012729
[007/030] 45.49 sec(s) Train Acc: 0.530103 Loss: 0.010717 | Val Acc: 0.416910 loss: 0.015084
[008/030] 45.43 sec(s) Train Acc: 0.541557 Loss: 0.010377 | Val Acc: 0.440816 loss: 0.013960
[009/030] 45.46 sec(s) Train Acc: 0.571458 Loss: 0.009788 | Val Acc: 0.487755 loss: 0.011974
[010/030] 45.50 sec(s) Train Acc: 0.598115 Loss: 0.009207 | Val Acc: 0.455977 loss: 0.014011
[011/030] 45.37 sec(s) Train Acc: 0.612406 Loss: 0.008679 | Val Acc: 0.512245 loss: 0.011949
[012/030] 45.34 sec(s) Train Acc: 0.638658 Loss: 0.008318 | Val Acc: 0.518659 loss: 0.012352
[013/030] 45.46 sec(s) Train Acc: 0.651429 Loss: 0.007931 | Val Acc: 0.581924 loss: 0.010126
[014/030] 45.38 sec(s) Train Acc: 0.666329 Loss: 0.007601 | Val Acc: 0.584840 loss: 0.010021
[015/030] 45.23 sec(s) Train Acc: 0.683661 Loss: 0.007158 | Val Acc: 0.587464 loss: 0.010242
[016/030] 45.01 sec(s) Train Acc: 0.698561 Loss: 0.006867 | Val Acc: 0.505831 loss: 0.012230
[017/030] 44.88 sec(s) Train Acc: 0.710521 Loss: 0.006546 | Val Acc: 0.638192 loss: 0.008705
[018/030] 44.97 sec(s) Train Acc: 0.739003 Loss: 0.005980 | Val Acc: 0.616910 loss: 0.009190
[019/030] 44.85 sec(s) Train Acc: 0.736874 Loss: 0.006077 | Val Acc: 0.587755 loss: 0.010634
[020/030] 44.80 sec(s) Train Acc: 0.728563 Loss: 0.006111 | Val Acc: 0.617201 loss: 0.009349
[021/030] 44.84 sec(s) Train Acc: 0.759072 Loss: 0.005414 | Val Acc: 0.632653 loss: 0.009644
[022/030] 44.93 sec(s) Train Acc: 0.773971 Loss: 0.005038 | Val Acc: 0.595627 loss: 0.010954
[023/030] 44.90 sec(s) Train Acc: 0.792824 Loss: 0.004670 | Val Acc: 0.622449 loss: 0.010215
[024/030] 45.00 sec(s) Train Acc: 0.800932 Loss: 0.004595 | Val Acc: 0.606997 loss: 0.011122
[025/030] 44.93 sec(s) Train Acc: 0.803365 Loss: 0.004412 | Val Acc: 0.649563 loss: 0.009798
[026/030] 44.91 sec(s) Train Acc: 0.825968 Loss: 0.003959 | Val Acc: 0.605248 loss: 0.011836
[027/030] 44.94 sec(s) Train Acc: 0.818265 Loss: 0.004115 | Val Acc: 0.665598 loss: 0.009107
[028/030] 44.84 sec(s) Train Acc: 0.846442 Loss: 0.003419 | Val Acc: 0.602332 loss: 0.012661
[029/030] 44.99 sec(s) Train Acc: 0.836104 Loss: 0.003747 | Val Acc: 0.609913 loss: 0.012081
[030/030] 44.84 sec(s) Train Acc: 0.857389 Loss: 0.003166 | Val Acc: 0.655977 loss: 0.010065

After finding good hyperparameters, we train on the training set and validation set combined (more data generally yields a better model).

train_val_x = np.concatenate((train_x, val_x), axis=0)
train_val_y = np.concatenate((train_y, val_y), axis=0)
train_val_set = ImgDataset(train_val_x, train_val_y, train_transform)
train_val_loader = DataLoader(train_val_set, batch_size=batch_size, shuffle=True)
model_best = Classifier().cuda()
loss = nn.CrossEntropyLoss()  # cross-entropy loss for classification
optimizer = torch.optim.Adam(model_best.parameters(), lr=0.001)  # use Adam as the optimizer
num_epoch = 30

for epoch in range(num_epoch):
    epoch_start_time = time.time()
    train_acc = 0.0
    train_loss = 0.0

    model_best.train()
    for i, data in enumerate(train_val_loader):
        optimizer.zero_grad()
        train_pred = model_best(data[0].cuda())
        batch_loss = loss(train_pred, data[1].cuda())
        batch_loss.backward()
        optimizer.step()

        train_acc += np.sum(np.argmax(train_pred.cpu().data.numpy(), axis=1) == data[1].numpy())
        train_loss += batch_loss.item()

    # print the results
    print('[%03d/%03d] %2.2f sec(s) Train Acc: %3.6f Loss: %3.6f' % \
          (epoch + 1, num_epoch, time.time()-epoch_start_time, \
           train_acc/train_val_set.__len__(), train_loss/train_val_set.__len__()))
[001/030] 53.31 sec(s) Train Acc: 0.236688 Loss: 0.017507
[002/030] 53.80 sec(s) Train Acc: 0.348827 Loss: 0.014559
[003/030] 53.70 sec(s) Train Acc: 0.419073 Loss: 0.012875
[004/030] 53.53 sec(s) Train Acc: 0.477662 Loss: 0.011746
[005/030] 53.68 sec(s) Train Acc: 0.521585 Loss: 0.010800
[006/030] 53.66 sec(s) Train Acc: 0.557461 Loss: 0.009822
[007/030] 53.72 sec(s) Train Acc: 0.594690 Loss: 0.009144
[008/030] 53.69 sec(s) Train Acc: 0.630114 Loss: 0.008431
[009/030] 53.63 sec(s) Train Acc: 0.652978 Loss: 0.007837
[010/030] 53.60 sec(s) Train Acc: 0.671631 Loss: 0.007462
[011/030] 53.53 sec(s) Train Acc: 0.685545 Loss: 0.007101
[012/030] 53.64 sec(s) Train Acc: 0.708559 Loss: 0.006587
[013/030] 53.58 sec(s) Train Acc: 0.726760 Loss: 0.006227
[014/030] 53.56 sec(s) Train Acc: 0.742705 Loss: 0.005844
[015/030] 53.45 sec(s) Train Acc: 0.743156 Loss: 0.005692
[016/030] 53.41 sec(s) Train Acc: 0.771059 Loss: 0.005136
[017/030] 53.52 sec(s) Train Acc: 0.781363 Loss: 0.004918
[018/030] 53.51 sec(s) Train Acc: 0.799037 Loss: 0.004530
[019/030] 53.41 sec(s) Train Acc: 0.808439 Loss: 0.004312
[020/030] 53.43 sec(s) Train Acc: 0.814681 Loss: 0.004152
[021/030] 53.36 sec(s) Train Acc: 0.829798 Loss: 0.003810
[022/030] 53.45 sec(s) Train Acc: 0.836041 Loss: 0.003567
[023/030] 53.53 sec(s) Train Acc: 0.854392 Loss: 0.003179
[024/030] 53.56 sec(s) Train Acc: 0.866802 Loss: 0.002981
[025/030] 53.68 sec(s) Train Acc: 0.873872 Loss: 0.002810
[026/030] 53.65 sec(s) Train Acc: 0.874925 Loss: 0.002740
[027/030] 53.62 sec(s) Train Acc: 0.880415 Loss: 0.002611
[028/030] 53.68 sec(s) Train Acc: 0.901850 Loss: 0.002217
[029/030] 53.67 sec(s) Train Acc: 0.898240 Loss: 0.002253
[030/030] 53.72 sec(s) Train Acc: 0.912154 Loss: 0.001937

Testing

Use the model we just trained to make predictions.

test_set = ImgDataset(test_x, transform=test_transform)
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False)
model_best.eval()
prediction = []
with torch.no_grad():
    for i, data in enumerate(test_loader):
        test_pred = model_best(data.cuda())
        test_label = np.argmax(test_pred.cpu().data.numpy(), axis=1)
        for y in test_label:
            prediction.append(y)

# write the results to a csv file
with open("predict.csv", 'w') as f:
    f.write('Id,Category\n')
    for i, y in enumerate(prediction):
        f.write('{},{}\n'.format(i, y))
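The submission format can be checked without running the model: the loop above writes an Id,Category header followed by one row per test image. A standalone sketch with a few hypothetical predictions, writing to an in-memory buffer instead of predict.csv:

```python
import io

predictions = [3, 0, 10]  # hypothetical class labels for test images 0, 1, 2

buf = io.StringIO()  # stand-in for open("predict.csv", "w")
buf.write("Id,Category\n")
for i, y in enumerate(predictions):
    buf.write("{},{}\n".format(i, y))

print(buf.getvalue())
# Id,Category
# 0,3
# 1,0
# 2,10
```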