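Getting started
Create a directory to hold the project code.
Create a file named darknet.py. Darknet is the name of the underlying architecture of YOLO, and this file will contain all the code that builds the YOLO network. A second file, util.py, will hold various helper functions that support darknet.py. Place both files in the project directory.
Configuration file
The official code (written in C) uses a configuration file to build the network. The cfg file describes the network layout block by block, including how the blocks are connected.
Download the network configuration with the following commands: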
mkdir cfg
cd cfg
wget https://raw.githubusercontent.com/pjreddie/darknet/master/cfg/yolov3.cfg
[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
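The excerpt above contains 4 blocks: three convolutional layers followed by a shortcut layer. A shortcut layer is a skip connection. YOLO uses five types of layers:
Convolutional layer: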
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
Shortcut layer:
[shortcut]
from=-3
activation=linear
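from=-3 means that the output of the shortcut layer is obtained by adding the feature map of the previous layer to the feature map of the layer three positions back from the shortcut layer.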
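As a rough illustration of what from=-3 means, the sketch below adds the output of the previous layer to the output of the layer three positions back. The outputs list and the tensor shapes are made up for this example; they are not taken from the tutorial's code.

```python
import torch

# Hypothetical feature maps produced by layers 0..3 of some network
# (batch of 1, 64 channels, 8x8 spatial size; shapes chosen for illustration).
outputs = [torch.full((1, 64, 8, 8), float(i)) for i in range(4)]

index = 4          # position of the shortcut layer itself
from_offset = -3   # value of "from" in the cfg block

# A shortcut adds the previous layer's output to the output of the layer
# "from" steps back, element by element (both must have the same shape).
shortcut_out = outputs[index - 1] + outputs[index + from_offset]

print(shortcut_out.shape)         # torch.Size([1, 64, 8, 8])
print(shortcut_out[0, 0, 0, 0])   # tensor(4.)
```

Because the addition is element-wise, the two feature maps must agree in every dimension.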
Upsample layer:
[upsample]
stride=2
This layer upsamples the feature map of the previous layer by a factor of stride, here 2.
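To see what a stride of 2 does to the spatial size, here is a minimal sketch using nn.Upsample in bilinear mode, the same mode the module-building code in this tutorial uses. The input tensor is made up, and align_corners is an extra argument chosen here, not something the cfg file specifies.

```python
import torch
import torch.nn as nn

# A tiny 2x2 feature map (batch 1, 1 channel), values chosen for illustration.
x = torch.tensor([[[[1.0, 2.0],
                    [3.0, 4.0]]]])

# align_corners is set explicitly here only to avoid a warning in some
# PyTorch versions; it is not part of the cfg file.
upsample = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
y = upsample(x)

print(x.shape)   # torch.Size([1, 1, 2, 2])
print(y.shape)   # torch.Size([1, 1, 4, 4])
```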
Route layer:
[route]
layers = -4
[route]
layers = -1, 61
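The layers attribute can have either one or two values.
With a single value, the route layer outputs the feature maps of the layer indexed by that value. Here -4 means the layer four positions before the route layer.
With two values, for example -1, 61, the layer outputs the feature maps of the previous layer (-1) and of layer 61, concatenated along the depth dimension.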
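The two cases can be sketched with plain tensors. The shapes below are assumptions picked for the example; the point is only that concatenation along the depth (channel) dimension adds up the channel counts while the spatial size stays the same.

```python
import torch

# Made-up feature maps: same batch and spatial size, different depth.
prev_layer = torch.zeros(1, 256, 26, 26)   # stands in for layer -1
layer_61 = torch.zeros(1, 512, 26, 26)     # stands in for layer 61

# One value: the route layer simply forwards the indexed feature map.
single = layer_61

# Two values: the feature maps are concatenated along the channel (depth) axis.
merged = torch.cat((prev_layer, layer_61), dim=1)

print(single.shape)   # torch.Size([1, 512, 26, 26])
print(merged.shape)   # torch.Size([1, 768, 26, 26])
```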
YOLO layer:
[yolo]
mask = 0,1,2
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=80
num=9
jitter=.3
ignore_thresh = .5
truth_thresh = 1
random=1
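The anchors attribute describes 9 anchors, but only the anchors indexed by mask are used. Here mask = 0,1,2 means that the first, second and third anchors are used, so each cell of the detection layer predicts 3 boxes.
In total there are detection layers at three scales, which accounts for all 9 anchors.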
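The selection of anchors by mask can be sketched in a few lines of plain Python. The helper name select_anchors is hypothetical; the anchor string is copied from the [yolo] block above.

```python
anchors_str = "10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326"

# Turn the flat list of numbers into (width, height) pairs.
values = [int(v) for v in anchors_str.split(",")]
anchors = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]


def select_anchors(mask_str):
    """Return the anchors picked out by a cfg-style mask such as '0,1,2'."""
    return [anchors[int(m)] for m in mask_str.split(",")]


print(len(anchors))              # 9
print(select_anchors("0,1,2"))   # [(10, 13), (16, 30), (33, 23)]
print(select_anchors("6,7,8"))   # [(116, 90), (156, 198), (373, 326)]
```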
Net
[net]
# Testing
batch=1
subdivisions=1
# Training
# batch=64
# subdivisions=16
width= 320
height = 320
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
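The [net] block is not a layer: it only describes the network input and the training parameters, and it is not used in the forward pass of YOLO. It does, however, give us the input size of the network, which we use to adjust the anchors in the forward pass.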
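To make the role of the input size more concrete, here is a small sketch that reads the width from a [net]-style dictionary and expresses one anchor in grid-cell units. The stride of 32 is an assumption made for illustration (it does not appear in this section), and the dictionary is written by hand rather than produced by the parser.

```python
# A [net] block as key/value strings (values copied from the block above).
net_info = {"type": "net", "width": "320", "height": "320", "channels": "3"}

inp_dim = int(net_info["width"])   # cfg values are strings, so cast first
stride = 32                        # assumed downsampling factor of a detection layer

grid_size = inp_dim // stride      # number of cells along each side
anchor = (116, 90)                 # one anchor from the [yolo] block, in pixels
anchor_in_cells = (anchor[0] / stride, anchor[1] / stride)

print(grid_size)         # 10
print(anchor_in_cells)   # (3.625, 2.8125)
```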
Parsing the configuration file
Before we start, add the necessary imports at the top of darknet.py:
from __future__ import division
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
import numpy as np
Define a function named parse_cfg that takes the path of the configuration file as input.
def parse_cfg(cfgfile):
    """
    Takes a configuration file.
    Returns a list of blocks. Each block describes a block in the neural
    network to be built. A block is represented as a dictionary in the list.
    """
The idea is to store every block as a dictionary.
We start by reading the contents of the cfg file into a list of strings. The following code preprocesses that list.
    with open(cfgfile, 'r') as file:
        lines = file.read().split('\n')          # store the lines in a list
    lines = [x.strip() for x in lines]           # get rid of fringe whitespace
    lines = [x for x in lines if len(x) > 0]     # get rid of the empty lines
    lines = [x for x in lines if x[0] != '#']    # get rid of comments
Then we loop over the resulting list to build the blocks.
    block = {}
    blocks = []
    for line in lines:
        if line[0] == "[":               # this marks the start of a new block
            if len(block) != 0:          # a non-empty block holds the values of the previous block
                blocks.append(block)     # add it to the blocks list
                block = {}               # re-init the block
            block["type"] = line[1:-1].rstrip()
        else:
            key, value = line.split("=")
            block[key.rstrip()] = value.lstrip()
    blocks.append(block)                 # add the last block
    return blocks
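To check the parsing logic without downloading the cfg file, the following sketch applies the same steps to an in-memory string. The function name parse_cfg_text and the sample text are hypothetical; only the parsing approach mirrors the function above.

```python
def parse_cfg_text(text):
    """Parse cfg-formatted text into a list of dicts (one per block)."""
    lines = [ln.strip() for ln in text.split("\n")]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]

    blocks, block = [], {}
    for line in lines:
        if line.startswith("["):          # start of a new block
            if block:
                blocks.append(block)
            block = {"type": line[1:-1].strip()}
        else:
            key, value = line.split("=", 1)
            block[key.strip()] = value.strip()
    if block:
        blocks.append(block)
    return blocks


sample = """
[convolutional]
batch_normalize=1
filters=64
# a comment
[shortcut]
from=-3
activation=linear
"""

blocks = parse_cfg_text(sample)
print(blocks[0])  # {'type': 'convolutional', 'batch_normalize': '1', 'filters': '64'}
print(blocks[1])  # {'type': 'shortcut', 'from': '-3', 'activation': 'linear'}
```

Note that all values stay strings at this stage, which is why the module-building code casts them with int() before use.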
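Creating the building blocks
We now use the list returned by parse_cfg to construct PyTorch modules for the blocks in the configuration file.
As mentioned above, the network uses five types of layers. PyTorch provides ready-made layers for the convolutional and upsample types. For the remaining types we have to write our own modules by extending the nn.Module class.
The create_modules function takes the list blocks returned by parse_cfg.
def create_modules(blocks):
    net_info = blocks[0]     # captures the information about the input and pre-processing
    module_list = nn.ModuleList()
    prev_filters = 3
    output_filters = []
Before iterating over the list of blocks, we define a variable net_info to store the information about the network.
nn.ModuleList
Our function returns an nn.ModuleList. This class is almost like a normal list containing nn.Module objects. However, when an nn.ModuleList is added as a member of an nn.Module object, all the parameters of the nn.Module objects inside the nn.ModuleList are registered as parameters of that nn.Module object as well, just as if we had added each nn.Module as a member directly.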
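The difference between nn.ModuleList and a plain Python list can be seen in a short experiment. The two classes below are hypothetical and exist only for this comparison.

```python
import torch.nn as nn


class WithPlainList(nn.Module):
    def __init__(self):
        super().__init__()
        # A plain Python list: the layers are NOT registered as sub-modules.
        self.layers = [nn.Linear(2, 2), nn.Linear(2, 2)]


class WithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList registers every layer, so their parameters are visible.
        self.layers = nn.ModuleList([nn.Linear(2, 2), nn.Linear(2, 2)])


print(len(list(WithPlainList().parameters())))    # 0
print(len(list(WithModuleList().parameters())))   # 4 (weight + bias per layer)
```

Parameters that are not registered would be invisible to an optimizer and to calls such as .cuda() on the enclosing module.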
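When we define a new convolutional layer, we must define the dimensions of its kernel. The height and width of the kernel are given by the cfg file, but the depth of the kernel is exactly the number of filters (the depth of the feature map) of the previous layer. This means we need to keep track of the number of filters in the layer to which the convolutional layer is applied. We use the variable prev_filters for this, and initialise it to 3 because the image has 3 channels (RGB). The list output_filters records the number of output filters of every block, which the route layer needs later.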
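The bookkeeping can be traced by hand for the three convolutional blocks from the cfg excerpt. The sketch assumes, for simplicity, that the first of them is applied directly to the RGB image; in the real network other layers may come before it.

```python
# filters / size values of the three convolutional blocks shown earlier
conv_blocks = [
    {"filters": 64, "size": 3},
    {"filters": 32, "size": 1},
    {"filters": 64, "size": 3},
]

prev_filters = 3   # assumed: the first block sees the RGB input image
weight_shapes = []
for blk in conv_blocks:
    # (out_channels, in_channels, kernel_h, kernel_w)
    weight_shapes.append((blk["filters"], prev_filters, blk["size"], blk["size"]))
    prev_filters = blk["filters"]   # the next layer sees this many channels

print(weight_shapes)
# [(64, 3, 3, 3), (32, 64, 1, 1), (64, 32, 3, 3)]
```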
The idea is to iterate over the list of blocks and create a PyTorch module for each block.
    for index, x in enumerate(blocks[1:]):
        module = nn.Sequential()
        # check the type of block
        # create a new module for the block
        # append to module_list
The nn.Sequential class is used to execute a number of nn.Module objects sequentially. If you look at the cfg file, you will notice that a block may contain more than one layer. For example, a block of type convolutional has a batch normalization layer and a leaky ReLU activation layer in addition to the convolutional layer. We string these layers together using nn.Sequential and its add_module function. This is how we create the convolutional and upsample layers:
        if (x["type"] == "convolutional"):
            # Get the info about the layer
            activation = x["activation"]
            try:
                batch_normalize = int(x["batch_normalize"])
                bias = False
            except KeyError:             # no batch_normalize key in this block
                batch_normalize = 0
                bias = True
            filters = int(x["filters"])
            padding = int(x["pad"])
            kernel_size = int(x["size"])
            stride = int(x["stride"])
            if padding:
                pad = (kernel_size - 1) // 2
            else:
                pad = 0
            # Add the convolutional layer
            conv = nn.Conv2d(prev_filters, filters, kernel_size, stride, pad, bias=bias)
            module.add_module("conv_{0}".format(index), conv)
            # Add the batch norm layer
            if batch_normalize:
                bn = nn.BatchNorm2d(filters)
                module.add_module("batch_norm_{0}".format(index), bn)
            # Check the activation.
            # It is either linear or a leaky ReLU for YOLO
            if activation == "leaky":
                activn = nn.LeakyReLU(0.1, inplace=True)
                module.add_module("leaky_{0}".format(index), activn)
        # If it's an upsampling layer
        # We use bilinear upsampling
        elif (x["type"] == "upsample"):
            stride = int(x["stride"])
            upsample = nn.Upsample(scale_factor=stride, mode="bilinear")
            module.add_module("upsample_{}".format(index), upsample)
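As a quick sanity check of the padding rule and of add_module, the sketch below builds one convolutional block on its own and compares the output size with the usual convolution size formula. The helper names conv_block and out_size and the 32x32 input are assumptions for this example, not part of darknet.py.

```python
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch, kernel_size, stride, index=0):
    """Conv -> BatchNorm -> LeakyReLU, named like the modules above."""
    pad = (kernel_size - 1) // 2          # "same"-style padding for odd kernels
    block = nn.Sequential()
    block.add_module("conv_{0}".format(index),
                     nn.Conv2d(in_ch, out_ch, kernel_size, stride, pad, bias=False))
    block.add_module("batch_norm_{0}".format(index), nn.BatchNorm2d(out_ch))
    block.add_module("leaky_{0}".format(index), nn.LeakyReLU(0.1, inplace=True))
    return block


def out_size(n, kernel_size, stride, pad):
    """Spatial output size of a convolution (standard formula)."""
    return (n + 2 * pad - kernel_size) // stride + 1


x = torch.randn(1, 3, 32, 32)             # small made-up input
y = conv_block(3, 64, kernel_size=3, stride=2)(x)

print(y.shape)                 # torch.Size([1, 64, 16, 16])
print(out_size(32, 3, 2, 1))   # 16
print(out_size(32, 3, 1, 1))   # 32 (stride 1 keeps the size)
```

With this padding, a stride of 1 preserves the spatial size and a stride of 2 halves it, which is how the network downsamples without pooling layers.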
Route and shortcut layers
Next, the code that creates the route and shortcut layers:
        # If it is a route layer
        elif (x["type"] == "route"):
            x["layers"] = x["layers"].split(',')
            # Start of a route
            start = int(x["layers"][0])
            # End, if there exists one
            try:
                end = int(x["layers"][1])
            except IndexError:
                end = 0
            # Positive annotation: convert absolute layer numbers to relative offsets
            if start > 0:
                start = start - index
            if end > 0:
                end = end - index
            route = EmptyLayer()
            module.add_module("route_{0}".format(index), route)
            if end < 0:
                filters = output_filters[index + start] + output_filters[index + end]
            else:
                filters = output_filters[index + start]
        # shortcut corresponds to skip connection
        elif x["type"] == "shortcut":
            shortcut = EmptyLayer()
            module.add_module("shortcut_{}".format(index), shortcut)
The code that creates the route layer deserves some explanation. First, we extract the value of the layers attribute, cast it to integers and store the result in a list.
Then we create a new layer called EmptyLayer which, as the name suggests, is just an empty layer.
            route = EmptyLayer()
It is defined at the top level of darknet.py as follows:
class EmptyLayer(nn.Module):
    def __init__(self):
        super(EmptyLayer, self).__init__()
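The index arithmetic of the route layer is easy to get wrong, so here is a simplified stand-alone sketch of it. The function route_filters and the list of depths are hypothetical. Unlike the code above, it treats every non-negative value as an absolute layer number and accepts any number of values.

```python
def route_filters(layers_str, index, output_filters):
    """Return the output depth of a route block located at position `index`."""
    layers = [int(v) for v in layers_str.split(",")]
    # Absolute (non-negative) layer numbers become offsets relative to `index`.
    offsets = [v - index if v >= 0 else v for v in layers]
    # The depths of all referenced layers add up, since they are concatenated.
    return sum(output_filters[index + off] for off in offsets)


# Made-up depths of the layers created so far (positions 0..5).
output_filters = [32, 64, 32, 64, 128, 256]

print(route_filters("-4", 6, output_filters))      # 32  (layer 6 - 4 = 2)
print(route_filters("-1, 3", 6, output_filters))   # 320 (layer 5: 256 + layer 3: 64)
```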
YOLO layer
Finally, the code that creates the YOLO layer:
        # Yolo is the detection layer
        elif x["type"] == "yolo":
            mask = x["mask"].split(",")
            mask = [int(m) for m in mask]
            anchors = x["anchors"].split(",")
            anchors = [int(a) for a in anchors]
            anchors = [(anchors[i], anchors[i+1]) for i in range(0, len(anchors), 2)]
            anchors = [anchors[i] for i in mask]
            detection = DetectionLayer(anchors)
            module.add_module("Detection_{}".format(index), detection)
For the YOLO layer we define a new layer, DetectionLayer, that holds the anchors used to detect bounding boxes.
The detection layer is defined as follows:
class DetectionLayer(nn.Module):
    def __init__(self, anchors):
        super(DetectionLayer, self).__init__()
        self.anchors = anchors
At the end of the loop, we do some bookkeeping.
        module_list.append(module)
        prev_filters = filters
        output_filters.append(filters)
That concludes the body of the loop. At the end of the create_modules function, we return a tuple containing net_info and module_list.
    return (net_info, module_list)
Testing the code
You can test the code by adding the following lines at the end of darknet.py and running the file.
blocks = parse_cfg("cfg/yolov3.cfg")
print(create_modules(blocks))
You will see a long list containing 106 items.
The elements look like this:
  (9): Sequential(
    (conv_9): Conv2d (128, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (batch_norm_9): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
    (leaky_9): LeakyReLU(0.1, inplace)
  )
  (10): Sequential(
    (conv_10): Conv2d (64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (batch_norm_10): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
    (leaky_10): LeakyReLU(0.1, inplace)
  )
  (11): Sequential(
    (shortcut_11): EmptyLayer(
    )