Table of Contents

  • Preface
  • 1. Preparing the dataset
  • 2. Processing the dataset
  • 1. Filtering out the person and motorbike images and their annotation files
  • 2. Generating the TXT files
  • 3. Training the YOLO model with darknet
  • 1. Download the darknet source code
  • 2. Download the yolov4-tiny pre-trained weights
  • 3. Modify the configuration parameters
  • References



Preface

After browsing some material online, I taught myself how to build a yolov4-tiny model and wrote this article as a record.


1. Preparing the dataset

The first thing to prepare is a dataset. I used the publicly shared VOC2012 dataset; the link is below.

Link: https://pan.baidu.com/s/1rpGj5-iRcZlIjMdSvvQnSw

Extraction code: pbr7

The VOC2012 dataset contains 20 object classes (21 if the background class is counted), grouped as follows:

-Person: person

-Animal: bird, cat, cow, dog, horse, sheep

-Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train

-Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor

After downloading and extracting the dataset, the archive contains two folders:

The Annotations folder contains the XML annotation files (the basic information for each image).

The JPEGImages folder contains the images themselves.
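For reference, a trimmed VOC annotation roughly has the following structure (the file name and box numbers here are made up; the filtering script in the next section relies on the <object> blocks and the <name> tag inside them):

<annotation>
    <filename>2012_000123.jpg</filename>
    <size>
        <width>500</width>
        <height>375</height>
        <depth>3</depth>
    </size>
    <object>
        <name>person</name>
        <difficult>0</difficult>
        <bndbox>
            <xmin>48</xmin>
            <ymin>240</ymin>
            <xmax>195</xmax>
            <ymax>371</ymax>
        </bndbox>
    </object>
</annotation>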

2. Processing the dataset

1. Filtering out the person and motorbike images and their annotation files

The VOC dataset contains a lot of images we do not need, so we have to pick out only the person and motorbike data. Create a Python file named SelectVOC2012Person.py in the VOCdevkit folder:

import os
import shutil
import bs4  # imported here but not actually used anywhere below

ann_filepath = './VOC2012/Annotations/'
img_filepath = './VOC2012/JPEGImages/'

img_savepath = './VOCPersonbike/JPEGImages/'
ann_savepath = './VOCPersonbike/Annotations/'
if not os.path.exists(img_savepath):
    os.makedirs(img_savepath)

if not os.path.exists(ann_savepath):
    os.makedirs(ann_savepath)

names = locals()  # used as a dict to hold the object blocks (block0, block1, ...)
classes = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle',
           'bus', 'car', 'cat', 'chair', 'cow', 'diningtable',
           'dog', 'horse', 'motorbike', 'pottedplant',
           'sheep', 'sofa', 'train', 'tvmonitor', 'person']

for file in os.listdir(ann_filepath):
    print(file)

    fp = open(ann_filepath + '/' + file)  # open the annotation xml
    ann_savefile = ann_savepath + file
    fp_w = open(ann_savefile, 'w')
    lines = fp.readlines()

    ind_start = []
    ind_end = []
    lines_id_start = lines[:]

    lines_id_end = lines[:]

    #     classes1 = '\t\t<name>bicycle</name>\n'
    #     classes2 = '\t\t<name>bus</name>\n'
    #     classes3 = '\t\t<name>car</name>\n'
    classes4 = '\t\t<name>motorbike</name>\n'
    classes5 = '\t\t<name>person</name>\n'

    # locate every <object> block in the xml and record its line range
    while "\t<object>\n" in lines_id_start:
        a = lines_id_start.index("\t<object>\n")
        ind_start.append(a)  # ind_start holds the line numbers of <object>
        lines_id_start[a] = "delete"

    while "\t</object>\n" in lines_id_end:
        b = lines_id_end.index("\t</object>\n")
        ind_end.append(b)  # ind_end holds the line numbers of </object>
        lines_id_end[b] = "delete"

    # names stores all the object blocks
    i = 0
    for k in range(0, len(ind_start)):
        names['block%d' % k] = []
        for j in range(0, len(classes)):
            if classes[j] in lines[ind_start[i] + 1]:
                a = ind_start[i]
                for o in range(ind_end[i] - ind_start[i] + 1):
                    names['block%d' % k].append(lines[a + o])
                break
        i += 1
        # print(names['block%d' % k])

    # xml header (everything before the first <object>)
    string_start = lines[0:ind_start[0]]

    # xml tail
    if ((file[2:4] == '09') | (file[2:4] == '10') | (file[2:4] == '11')):
        string_end = lines[(len(lines) - 11):(len(lines))]
    else:
        string_end = [lines[len(lines) - 1]]

    # search the object blocks for the wanted classes; if found, keep the block
    a = 0
    for k in range(0, len(ind_start)):
        #         if classes1 in names['block%d' % k]:
        #             a += 1
        #             string_start += names['block%d' % k]
        #         if classes2 in names['block%d' % k]:
        #             a += 1
        #             string_start += names['block%d' % k]
        #         if classes3 in names['block%d' % k]:
        #             a += 1
        #             string_start += names['block%d' % k]
        if classes4 in names['block%d' % k]:
            a += 1
            string_start += names['block%d' % k]
        if classes5 in names['block%d' % k]:
            a += 1
            string_start += names['block%d' % k]

    string_start += string_end
    # print(string_start)
    for c in range(0, len(string_start)):
        fp_w.write(string_start[c])
    fp_w.close()
    # if none of the wanted classes was found, delete the new xml; otherwise copy the image
    if a == 0:
        os.remove(ann_savepath + file)
    else:
        name_img = img_filepath + os.path.splitext(file)[0] + ".jpg"
        shutil.copy(name_img, img_savepath)
    fp.close()

Create a folder named VOCPersonbike in the VOCdevkit directory.


Then, inside VOCPersonbike, create the Annotations, JPEGImages, ImageSets, and labels folders.


Finally, run SelectVOC2012Person.py from the VOCdevkit directory:

python SelectVOC2012Person.py

If the script fails because of a missing module, for example ModuleNotFoundError: No module named 'bs4', you can install it with pip install bs4.

If SelectVOC2012Person.py runs successfully, the person and motorbike images will have been filtered into VOCPersonbike/JPEGImages, and the corresponding xml files will be in VOCPersonbike/Annotations.
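As a quick sanity check (a minimal sketch, run from the VOCdevkit directory with the paths used above), the number of filtered xml files should equal the number of copied images:

import os

ann = os.listdir('./VOCPersonbike/Annotations')
img = os.listdir('./VOCPersonbike/JPEGImages')
print(len(ann), 'xml files,', len(img), 'jpg files')  # the two counts should match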

2. Generating the TXT files

Create shengcheng.py and voc_label.py in the VOCPersonbike folder.


The code for shengcheng.py is as follows:

import os
import random
import sys

if len(sys.argv) < 2:
    print("no directory specified, please input target directory")
    exit()

root_path = sys.argv[1]

xmlfilepath = root_path + '/Annotations'

txtsavepath = root_path + '/ImageSets/Main'

if not os.path.exists(root_path):
    print("cannot find such directory: " + root_path)
    exit()

if not os.path.exists(txtsavepath):
    os.makedirs(txtsavepath)

trainval_percent = 0.9  # 90% of the images go to trainval, the rest to test
train_percent = 0.8     # 80% of trainval goes to train, the rest to val
total_xml = os.listdir(xmlfilepath)
num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

print("train and val size:", tv)
print("train size:", tr)

ftrainval = open(txtsavepath + '/trainval.txt', 'w')
ftest = open(txtsavepath + '/test.txt', 'w')
ftrain = open(txtsavepath + '/train.txt', 'w')
fval = open(txtsavepath + '/val.txt', 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'  # strip the .xml extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()

The code for voc_label.py is as follows:

import xml.etree.ElementTree as ET
import os
from PIL import Image

sets = ['train', 'val', 'test', 'trainval']

#classes=["person","staff","cat","dog","cow","sheep","pig","bird","bicycle","car", "motorbike","aeroplane","bus","train","truck","boat","traffic light","fire hydrant","stop sign","parking meter","bench","horse","elephant","bear", "zebra","giraffe","backpack","umbrella","handbag","tie","suitcase","frisbee","skis", "snowboard","sportsball","kite","baseballbat","baseballglove","skateboard","surfboard","tennis racket","bottle","wine glass","cup","fork","knife","spoon","bowl","banana","apple", "sandwich","orange","broccoli","carrot","hot dog","pizza","donut","cake","chair","sofa", "pottedplant","bed","diningtable","toilet","tvmonitor","laptop","mouse","remote","keyboard", "cell phone","microwave","oven","toaster","sink","refrigerator","book","clock","vase", "scissors","teddy bear","hair drier","toothbrush"]
classes = ["motorbike", "person"]

def convert(size, box):
    # convert a VOC box (xmin, xmax, ymin, ymax) in pixels into the
    # normalized YOLO format (x_center, y_center, width, height)
    dw = 1./(size[0])
    dh = 1./(size[1])
    x = (box[0] + box[1])/2.0 - 1
    y = (box[2] + box[3])/2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x, y, w, h)

def convert_annotation(image_id):
    in_file = open('Annotations/%s.xml' % (image_id))
    out_file = open('labels/%s.txt' % (image_id), 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    # handle non-standard xml files with no size field: open the
    # corresponding image and read w, h from it instead
    if size is None:
        print('{} has no size field'.format(image_id))
        img = Image.open('JPEGImages/' + image_id + '.jpg')
        w, h = img.size
        print('{}.xml is missing the size field, read w:{} h:{} from the image instead'.format(image_id, w, h))
    else:
        w = int(size.find('width').text)
        h = int(size.find('height').text)

    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')


for image_set in sets:
    if not os.path.exists('labels/'):
        os.makedirs('labels/')
    image_ids = open('ImageSets/Main/%s.txt' % (image_set)).read().strip().split()
    list_file = open('%s.txt' % (image_set), 'w')
    for image_id in image_ids:
        # print(image_id)
        # absolute path to the images; change it to your own JPEGImages path
        list_file.write('/home/zyg/traindata/VOCdevkit/VOCPersonbike/JPEGImages/%s.jpg\n' % (image_id))
        convert_annotation(image_id)
    list_file.close()

print('LABEL OVER')

Then run shengcheng.py from the VOCPersonbike directory:

python shengcheng.py /home/zyg/traindata/VOCdevkit/VOCPersonbike

(Here /home/zyg/traindata/VOCdevkit/VOCPersonbike is the path to the VOCPersonbike folder; replace it with your own path.)

Check that trainval.txt, test.txt, train.txt, and val.txt have been generated under ImageSets/Main.

Next, run voc_label.py from the VOCPersonbike directory:

python voc_label.py

If it succeeds, you will see a TXT label file for each image in the labels folder.
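Each line of a label file has the form class_id x_center y_center width height, with the coordinates normalized by the image width and height. As a rough worked example with made-up numbers (using the convert() function from voc_label.py above), a person box with xmin=48, xmax=195, ymin=240, ymax=371 in a 500x375 image becomes:

# person is class 1 in classes = ["motorbike", "person"]
print(convert((500, 375), (48.0, 195.0, 240.0, 371.0)))
# -> roughly (0.241, 0.812, 0.294, 0.349), written to the txt as: 1 0.241 0.812 0.294 0.349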

3. Training the YOLO model with darknet

1. Download the darknet source code

Download the latest AlexeyAB darknet source code from: https://github.com/AlexeyAB/darknet
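For example, it can be cloned with git (or downloaded as a ZIP from the same page):

git clone https://github.com/AlexeyAB/darknet.git
cd darknet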

2. Download the yolov4-tiny pre-trained weights

Download the yolov4-tiny weights: https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.weights

Place the downloaded weights in the darknet directory.
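For example, the file can be fetched with wget. Note that the training command in step 3 below uses yolov4-tiny.conv.29 (only the pre-trained convolutional layers); as far as I know it can either be downloaded from the same releases page or, once darknet has been compiled in the next step, generated from the full weights with darknet's partial command, roughly like this:

wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.weights
# after running make (next step), extract the first 29 convolutional layers:
./darknet partial cfg/yolov4-tiny.cfg yolov4-tiny.weights yolov4-tiny.conv.29 29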

3. Modify the configuration parameters

1. Open the Makefile. Before compiling, you can adjust the optional build flags to suit your own machine:

My computer is a company machine without a dedicated GPU, so I plan to train on the CPU.

(1) GPU=0: do not use GPU acceleration, i.e. do not compile with CUDA;

(2) CUDNN=0: do not compile with cuDNN (v5-v7) for accelerated training;

(3) CUDNN_HALF=0: do not use half-precision acceleration (only relevant on Titan V, Tesla V100, DGX-2 and newer);

(4) OPENCV=1: build with OpenCV (4.x/3.x/2.4.x are all supported), which makes it possible to run detection on video files and streams from network cameras or web-cams.

These four flags are the main ones to modify; the others are rarely needed and can be left alone for now.
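After editing, the top of the Makefile should look roughly like this (the remaining flags can stay at their defaults):

----------------------------------------------------------------
GPU=0
CUDNN=0
CUDNN_HALF=0
OPENCV=1
----------------------------------------------------------------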


2. Run make in the darknet directory to compile and produce the darknet executable.
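For example:

cd darknet
make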

3. Go to the darknet/cfg directory, make a copy of yolov4-custom.cfg, rename it to yolov4-tiny-personbike.cfg, open it, and make the following six modifications:

(1) Lines 1-7 of yolov4-tiny-personbike.cfg are as follows:
----------------------------------------------------------------
[net]
# Testing
#batch=1
#subdivisions=1
# Training
batch=64
subdivisions=16
----------------------------------------------------------------
Note: since we are training, nothing needs to be changed here. If a CUDA out of memory error appears during training, increase subdivisions (for example to 32 or 64); the larger the value, the longer training takes, so there is a trade-off;

(2) On lines 8-9 of yolov4-tiny-personbike.cfg, change 608 to 416:

----------------------------------------------------------------

width=416

height=416

----------------------------------------------------------------

Note: this can also be left unchanged, but the original value of 608 may trigger CUDA out of memory errors. The value must be a multiple of 32, and again, larger values mean longer training;

(3) Line 20, the max_batches parameter, also needs to be changed. Its original value is 500500. The rule I followed is max_batches = classes*10000, and max_batches should not be lower than the number of training images; since training for 20000 iterations would have taken far too long, I reduced max_batches to 10000 (a GPU is really needed to train a model efficiently).

Therefore max_batches = 10000;

(4) Line 22, the steps parameter, becomes steps=8000,9000; these two values are 80% and 90% of max_batches;


(5) Continue editing yolov4-tiny-personbike.cfg: press Ctrl+F and search for "classes"; it appears in three places. At the first occurrence, change classes=80 to classes=2, and change the filters value in the [convolutional] layer closest above it to 21, from the formula (classes+5)*3 = (2+5)*3 = 21 (the full set of changes is sketched after this list);

(6) Still in yolov4-tiny-personbike.cfg, make the same change at the second and third occurrences of classes (and the filters above each of them);
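Putting (3) to (6) together, the edited lines should look roughly like this (max_batches and steps sit in the [net] section at the top; the filters/classes pair is repeated for each [yolo] block and the [convolutional] layer directly above it):

----------------------------------------------------------------
max_batches = 10000
steps=8000,9000
...
[convolutional]
filters=21
...
[yolo]
classes=2
----------------------------------------------------------------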

4. Go to the darknet/data folder and create a file named personbike.names (modeled on the voc.names file in the same folder), then add the class names to it; for this experiment only motorbike and person are needed;
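A sketch of personbike.names: one class name per line, and the order has to match the classes list in voc_label.py, so motorbike is class 0 and person is class 1:

----------------------------------------------------------------
motorbike
person
----------------------------------------------------------------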

5. Go to the darknet/cfg folder and create a file named personbike.data containing the following five lines (the example voc.data file can be used as a reference, with classes changed to 2):

----------------------------------------------------------------

classes= 2

train =/home/zyg/traindata/VOCdevkit/VOCPersonbike/train.txt

valid = /home/zyg/traindata/VOCdevkit/VOCPersonbike/test.txt

names = /home/zyg/Desktop/darknet-master/data/personbike.names

backup = backup

----------------------------------------------------------------

(1) The second and third lines are the paths to train.txt and test.txt respectively, and the fourth line points to the personbike.names file created above;

(2) The train.txt and test.txt here are the files generated by voc_label.py in section 2 above;

(3) The fifth line should stay backup = backup; writing any other path here caused errors for me;


6. Go to the darknet directory, right-click and choose Open in Terminal, and enter the following command to start training:

./darknet detector train cfg/personbike.data cfg/yolov4-tiny-personbike.cfg yolov4-tiny.conv.29
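Optionally, the AlexeyAB fork can also plot the mAP on the validation set during training by appending the -map flag:

./darknet detector train cfg/personbike.data cfg/yolov4-tiny-personbike.cfg yolov4-tiny.conv.29 -map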

7. During training, the generated weight files are saved in the darknet/backup folder; a new .weights file is written there every so often;


8. Once a .weights file has been generated, you can already run a test (training keeps going; open another terminal in the darknet directory), or wait until training has finished completely before testing. The test command is: ./darknet detector test cfg/personbike.data cfg/yolov4-tiny-personbike.cfg backup/yolov4-tiny-personbike_final.weights followed by the path of the image to test.
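For example, using one of the sample images that ships with darknet (assuming data/person.jpg is present; any image path works):

./darknet detector test cfg/personbike.data cfg/yolov4-tiny-personbike.cfg backup/yolov4-tiny-personbike_final.weights data/person.jpg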


At this point, the basic model setup is complete.