💥1 Overview
Manual classification varies widely in speed and consistency. When there are many image categories and many images, sorting them by hand costs labor and time, so it makes sense to replace manual work with the processing speed and stability of a computer. Image classification is a fundamental task in computer vision [1]. Combined with deep learning, image classification proceeds in three steps: image preprocessing, feature extraction, and classification. First, one or more preprocessing methods are applied to the image; then the relevant features are extracted with corresponding algorithms; finally, after a series of transformations of the feature vectors, the classifier performs binary or multi-class classification and outputs a result.
(1) Image preprocessing [2] plays a very important role, because image quality affects the tasks the downstream model must complete, such as classification, recognition, or segmentation. Preprocessing mainly consists of three parts:
1. Grayscale conversion. When an RGB image is converted to grayscale, the resulting pixel value is called the gray value (also the intensity or brightness value) and lies in the range [0, 255]. In the RGB color space, a pixel with R = G = B = 0 is black, one with R = G = B = 255 is white, and intermediate equal values (for example, R = G = B = 100) are shades of gray; whenever R, G, and B are equal, the image falls somewhere on the black-gray-white scale. Common grayscale methods include the maximum-value method, the average method, and the weighted-average method.
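As a small illustration, the weighted-average method computes a luminance-weighted sum of the three channels. A minimal Python sketch (the 0.299/0.587/0.114 weights follow the common ITU-R BT.601 convention, which the text above does not specify; the function name and sample pixels are illustrative):

```python
def rgb_to_gray(pixels):
    """Convert a list of (R, G, B) tuples in [0, 255] to gray values
    using the weighted-average method (ITU-R BT.601 weights)."""
    return [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in pixels]

# equal R, G, B values map to the same gray value, as described above
print(rgb_to_gray([(0, 0, 0), (100, 100, 100), (255, 255, 255)]))  # [0, 100, 255]
```

Because the three weights sum to 1, a pixel with equal channels keeps its value unchanged, matching the black-gray-white transition described above.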
2. Geometric transformation, also called spatial transformation, applies operations such as translation, mirroring, and transposition to an image. When samples are insufficient, these operations augment the dataset, which improves the accuracy of the trained classification model and reduces error. Image interpolation methods are then applied. They fall into two families: linear interpolation (nearest-neighbor, bilinear, and bicubic) and nonlinear interpolation, which divides into wavelet-coefficient-based and edge-information-based methods; the edge-based methods split into explicit and implicit approaches, where the implicit approaches include NEDI, LMMSE, SAI, and CGI.
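Of the linear methods listed above, bilinear interpolation is a representative example: the value at a fractional coordinate is a distance-weighted average of the four surrounding pixels. A minimal Python sketch (the function name and the tiny 2x2 test image are illustrative, not from the original):

```python
def bilinear(img, x, y):
    """Sample a 2-D grayscale image (list of rows) at fractional (x, y)."""
    x0, y0 = int(x), int(y)
    x1 = min(x0 + 1, len(img[0]) - 1)
    y1 = min(y0 + 1, len(img) - 1)
    dx, dy = x - x0, y - y0
    top = img[y0][x0] * (1 - dx) + img[y0][x1] * dx       # blend along x, upper row
    bottom = img[y1][x0] * (1 - dx) + img[y1][x1] * dx    # blend along x, lower row
    return top * (1 - dy) + bottom * dy                   # blend along y

img = [[0, 100],
       [100, 200]]
print(bilinear(img, 0.5, 0.5))  # 100.0, the average of all four neighbors
```

Nearest-neighbor interpolation simply picks the closest of the four pixels instead of blending them, which is faster but blockier; bicubic uses a 4x4 neighborhood for smoother results.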
📚2 Results
Partial code:
function [gradients,loss] = modelGradients(dlnet,dlX,Y)
    % a convolutional forward pass is done with the function forward
    dlYPred = forward(dlnet,dlX);
    % the result is normalized with the function softmax, as a softmax
    % layer would do
    dlYPred = softmax(dlYPred);
    % cross-entropy loss is calculated with the function crossentropy;
    % if you would like the network to solve a regression problem, squared
    % loss is used in many cases
    loss = crossentropy(dlYPred,Y);
    % the gradient is calculated with the function dlgradient
    gradients = dlgradient(loss,dlnet.Learnables);
end
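The forward, softmax, and crossentropy steps in modelGradients can be mirrored in plain Python to show what the loss measures (the logits below are made-up numbers, not outputs of the network above):

```python
import math

def softmax(logits):
    """Normalize a list of logits into probabilities, as the softmax call above does."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target_onehot):
    """Cross-entropy loss between predicted probabilities and a one-hot label."""
    return -sum(t * math.log(p) for p, t in zip(probs, target_onehot) if t)

probs = softmax([2.0, 1.0, 0.1])          # hypothetical logits for 3 classes
loss = cross_entropy(probs, [1, 0, 0])    # small when the true class gets high probability
```

The loss shrinks toward zero as the probability assigned to the true class approaches 1, which is exactly what gradient descent on this loss pushes the network to do.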
% This function resizes the images to the proper size for the pre-trained
% network
function Iout = readAndPreproc(inFilename,imgSize)
    % read the target image
    I = imread(inFilename);
    % replicate a grayscale image into three identical channels
    if size(I,3) == 1
        I = cat(3,I,I,I);
    end
    % resize to the input size of the pre-trained model
    Iout = imresize(I,imgSize(1:2));
end
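The cat(3,I,I,I) step above turns a single-channel image into three identical channels, so a grayscale input matches the 3-channel shape the pre-trained network expects. A Python sketch of the same idea (the tiny test image is illustrative):

```python
def gray_to_rgb(img):
    """Replicate a 2-D grayscale image into three identical channels,
    analogous to cat(3, I, I, I) in the MATLAB function above."""
    return [[(v, v, v) for v in row] for row in img]

print(gray_to_rgb([[0, 255]]))  # [[(0, 0, 0), (255, 255, 255)]]
```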
function [XTrainX,YY,idxS] = MixUpPreProc(XTrain,YTrain,numMixUp)
    % first, the composition ratio for each image is randomly determined;
    % in the explanation above, this fixes the values of alpha, beta and gamma
    lambda = rand([numel(YTrain),numMixUp]);
    lambda = lambda./sum(lambda,2); % the lambda values for each image must sum to 1
    lambda = reshape(lambda,[1 1 1 numel(YTrain) numMixUp]);
    idxS = []; XTrainK = []; YTrainK = []; XTrainX = zeros(size(XTrain));
    numClasses = numel(countcats(YTrain));
    classes = categories(YTrain);
    % after this loop, idxS is an array of size
    % 1 x (number of training images) x numMixUp;
    % numMixUp is 2 in many cases, but you can specify it as you want.
    % idxS(1,N,1:end) holds the indices of the training images mixed into
    % sample N; this means images of the same class can be mixed together.
    % The images are mixed with the weights in lambda, and the variable
    % XTrainX holds the mixed-up images.
    for k = 1:numMixUp
        idxK = randperm(numel(YTrain));
        idxS = cat(3,idxS,idxK);
        XTrainK = cat(5,XTrainK,double(XTrain(:,:,:,idxK)));
        YTrainK = cat(2,YTrainK,YTrain(idxK)); % YTrainK: (miniBatchSize) x (numMixUp)
        XTrainX = XTrainX + double(XTrain(:,:,:,idxK)).*lambda(1,1,1,:,k);
    end
    % next, the vectors corresponding to the label information are built.
    % If the classes in the task are dog, cat and bird, and one image is
    % synthesized from 50 % dog and 50 % bird,
    % the label for the synthesized image should be [0.5 0 0.5].
    % However, in the loop above the weights and the images to pick were
    % collected randomly, so the labels are prepared as follows:
    lambda = squeeze(lambda);
    Y = zeros(numClasses,numel(YTrain),numMixUp,'single');
    for j = 1:numMixUp
        lambdaJ = lambda(:,j);
        for c = 1:numClasses
            idxC = YTrain(idxS(1,:,j)) == classes(c);
            Y(c,idxC,j) = lambdaJ(idxC);
        end
    end
    YY = sum(Y,3);
end
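The dog/cat/bird example from the comments above reduces, for two images, to blending both the pixels and the one-hot labels with the same weight, so a 50/50 dog-and-bird mix gets the label [0.5, 0, 0.5]. A minimal Python sketch (the toy images, class order, and function name are illustrative, not from the MATLAB code):

```python
def mixup(x1, y1, x2, y2, lam):
    """Blend two flattened images and their one-hot labels with weight lam,
    the two-image special case of the mix-up preprocessing above."""
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y

# classes: [dog, cat, bird]; mix a "dog" image with a "bird" image, 50/50
x, y = mixup([10, 20], [1, 0, 0], [30, 40], [0, 0, 1], lam=0.5)
print(y)  # [0.5, 0.0, 0.5]
```

In the MATLAB function, numMixUp images are blended instead of two, and the per-image weights are drawn randomly and normalized to sum to 1, but the pixel and label mixing follows the same rule.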
🎉3 References
Part of the theory comes from online sources; in case of infringement, please contact us for removal.
[1] 张雪晴. 基于CNN的图像分类 [J]. 电子技术与软件工程, 2022(07): 182-185.
[2] 何明智, 朱华生, 李永健, 唐树银, 孙占鑫. 基于融合CNN和Transformer的图像分类模型 [J]. 南昌工程学院学报, 2022, 41(04): 52-57+78.