How to train YOLOv5 on a GPU

Reposted

mob64ca140d61c6 2025-08-26 16:16:05


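This post runs the YOLOv5 training script end to end on a standard dataset, to get familiar with the parameters and caveats that matter during training.

1. Create Dataset.yaml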

data/coco128.yaml is a small tutorial dataset composed of the first 128 images in COCO train2017. These same 128 images are used for both training and validation in this example. coco128.yaml defines 1) a path to a directory of training images (or path to a *.txt file with a list of training images), 2) the same for our validation images, 3) the number of classes, 4) a list of class names:

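coco128 consists of the first 128 images of the COCO train2017 set and exists only for demonstration. The yaml file specifies the path to the training images (or to a TXT file listing them), the same for the validation set (the same 128 images are reused here, which is fine for learning how to train YOLO), the number of classes, and the class names.

The coco128 download address appears at the top of the yaml file: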

# download command/URL (optional)
download: https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ../coco128/images/train2017/
val: ../coco128/images/train2017/

# number of classes
nc: 80

# class names
names: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
        'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
        'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
        'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
        'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
        'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
        'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 
        'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 
        'teddy bear', 'hair drier', 'toothbrush']
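As a quick sanity check on the fields listed above, the sketch below validates a dataset config that has already been parsed into a dictionary (for example by a YAML parser). The function name `check_dataset_config` is hypothetical and not part of YOLOv5; it only checks that the four keys are present and that `nc` matches the length of `names`.

```python
def check_dataset_config(cfg: dict) -> list:
    """Return a list of human-readable problems found in a dataset config."""
    problems = []
    # The four keys used by coco128.yaml in this walkthrough.
    for key in ("train", "val", "nc", "names"):
        if key not in cfg:
            problems.append(f"missing key: {key}")
    # The declared class count should equal the number of class names.
    if "nc" in cfg and "names" in cfg and cfg["nc"] != len(cfg["names"]):
        problems.append(f"nc is {cfg['nc']} but names has {len(cfg['names'])} entries")
    return problems


# Hypothetical two-class config that mirrors the layout of coco128.yaml.
example = {
    "train": "../coco128/images/train2017/",
    "val": "../coco128/images/train2017/",
    "nc": 2,
    "names": ["person", "bicycle"],
}
print(check_dataset_config(example))
```

An empty list means no problems were found. A mismatch between `nc` and `names` is an easy mistake to make when adapting the file to a custom dataset.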

2. Create labels

First look at the file structure of the downloaded COCO128. This is the layout to follow when building your own dataset.

./
├── annotations
├── images
│   └── train2017
│       ├── 000000000009.jpg
│       ├── 000000000025.jpg
│       ├── 000000000030.jpg
│       ├── 000000000034.jpg
│       ... (entries omitted)
│       ├── 000000000643.jpg
│       └── 000000000650.jpg
├── labels
│   └── train2017
│       ├── 000000000009.txt
│       ├── 000000000025.txt
│       ├── 000000000030.txt
│       ├── 000000000034.txt
│       ... (entries omitted)
│       ├── 000000000643.txt
│       └── 000000000650.txt
├── LICENSE
└── README.txt

5 directories, 258 files

[Figure: contents of a label txt file]

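File format:

Each line describes one object
Each line is ordered as: class, x center, y center, width, height
All coordinates are normalized: x values are divided by the image width, y values by the image height
Class indices start at 0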
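The format rules above can be sketched as a small conversion function. This is a minimal illustration that assumes the annotation starts out as pixel corner coordinates (`x_min`, `y_min`, `x_max`, `y_max`); the function name `to_yolo_line` and the six-decimal formatting are choices made for this example, not something YOLOv5 prescribes.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert one pixel-space box into a normalized YOLO label line."""
    # Box center, normalized by the image width and height respectively.
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    # Box size, normalized the same way.
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x200 pixel box of class 0 inside a 640x480 image.
print(to_yolo_line(0, 100, 200, 300, 400, 640, 480))
# prints: 0 0.312500 0.625000 0.312500 0.416667
```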


The image and label paths must correspond to each other, for example:

dataset/images/train2017/000000109622.jpg # image
dataset/labels/train2017/000000109622.txt # label
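The path correspondence shown above can be expressed as a short helper. This is a sketch of the naming rule only, not YOLOv5's own loader code: the hypothetical function `image_to_label_path` swaps the last `images` directory for `labels` and changes the extension to `.txt`.

```python
from pathlib import PurePosixPath


def image_to_label_path(image_path: str) -> str:
    """Map an image path to the label path expected next to it."""
    parts = list(PurePosixPath(image_path).parts)
    if "images" not in parts:
        raise ValueError(f"no 'images' directory in {image_path}")
    # Replace the last 'images' component so parent folders are untouched.
    idx = len(parts) - 1 - parts[::-1].index("images")
    parts[idx] = "labels"
    return str(PurePosixPath(*parts).with_suffix(".txt"))


print(image_to_label_path("dataset/images/train2017/000000109622.jpg"))
# prints: dataset/labels/train2017/000000109622.txt
```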

3. Organize the folder structure

Organize your train and val images and labels according to the example below. Note /coco128 should be next to the /yolov5 directory. Make sure coco128/labels folder is next to coco128/images folder.

Make sure the yolov5 folder and the coco128 folder sit at the same level.

Make sure labels and images under coco128 sit at the same level.
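One way to confirm the layout before training is to list the images that have no matching label file. The helper below is a hypothetical sketch: the name `images_without_labels`, the default split name, and the set of image extensions are all assumptions for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; extend it if your dataset uses others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def images_without_labels(dataset_root, split="train2017"):
    """Return the names of images in images/<split> lacking a .txt in labels/<split>."""
    root = Path(dataset_root)
    image_dir = root / "images" / split
    label_dir = root / "labels" / split
    missing = []
    for image in sorted(image_dir.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # The label file shares the image's stem and lives in the sibling folder.
        if not (label_dir / (image.stem + ".txt")).is_file():
            missing.append(image.name)
    return missing
```

Calling `images_without_labels("../coco128")` from inside the yolov5 directory would then report any images that lack a label file under the layout described above.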

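4. Select a model

Select a model file from ./models; yolov5s.yaml is used here for demonstration.

If your dataset does not have 80 classes, change nc to the number of classes you want to train. The parameter to change is shown below: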

# parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, BottleneckCSP, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 9, BottleneckCSP, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, BottleneckCSP, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 1, SPP, [1024, [5, 9, 13]]],
   [-1, 3, BottleneckCSP, [1024, False]],  # 9
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, BottleneckCSP, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, BottleneckCSP, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, BottleneckCSP, [512, False]],  # 20  (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, BottleneckCSP, [1024, False]],  # 23  (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
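If you prefer to script the nc change rather than edit the file by hand, a plain text substitution keeps the comments in the yaml intact. The sketch below assumes `nc:` starts a line, as in the listing above; `set_num_classes` is a hypothetical helper name.

```python
import re


def set_num_classes(yaml_text, num_classes):
    """Return yaml_text with the top-level 'nc:' value replaced."""
    new_text, count = re.subn(
        r"^nc:\s*\d+", f"nc: {num_classes}", yaml_text, count=1, flags=re.MULTILINE
    )
    if count != 1:
        raise ValueError("no top-level 'nc:' line found")
    return new_text


# First lines of the model file shown above, used as sample input.
snippet = "# parameters\nnc: 80  # number of classes\ndepth_multiple: 0.33\n"
print(set_num_classes(snippet, 3))
```

Only the number changes; the trailing comment and every other line are left as they were.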

5. Start training

After editing models/yolov5s.yaml, train the model with the training script.

Two files must be specified: the model config models/yolov5s.yaml and the dataset config data/coco128.yaml. To load a pretrained model, set weights to yolov5s.pt.

# Train YOLOv5s on coco128 for 5 epochs
$ python train.py --img 640 --batch 16 --epochs 5 --data ./data/coco128.yaml --cfg ./models/yolov5s.yaml --weights ./weights/yolov5s.pt
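The command above does not name a device. As a hedged sketch, the snippet below checks whether PyTorch can see a CUDA GPU and builds the same command with an explicit `--device` argument. It assumes your YOLOv5 checkout's train.py accepts `--device` (for example `0` or `cpu`); `pick_device` and `build_train_command` are hypothetical helper names.

```python
def pick_device():
    """Return '0' (first CUDA GPU) if PyTorch can see one, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # Without PyTorch installed there is nothing to query.
        return "cpu"
    return "0" if torch.cuda.is_available() else "cpu"


def build_train_command(device):
    """Assemble the training command from this section plus a device flag."""
    return [
        "python", "train.py",
        "--img", "640", "--batch", "16", "--epochs", "5",
        "--data", "./data/coco128.yaml",
        "--cfg", "./models/yolov5s.yaml",
        "--weights", "./weights/yolov5s.pt",
        "--device", device,  # assumed flag, see the note above
    ]


print(" ".join(build_train_command(pick_device())))
```

If this prints `--device cpu` on a machine that has a GPU, the installed PyTorch build or the driver setup is the first thing to check.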

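Activate the conda environment configured earlier before running the command. The training process looks like this:

[Figure: training in progress]

Training finishes after 5 epochs.

[Figure: output at the end of training]

The training process itself will be analyzed during a real training run; the goal here is only to get the training pipeline working end to end.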

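6. Visualization

The run above generated several images.

[Figure: images generated by the training run]

Open test_batch0_gt.jpg:

[Figure: test_batch0_gt.jpg]

Use it to verify that the labels are correct. If you have already validated your dataset, you can skip this step, since data should be cleaned before training anyway.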
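Alongside the visual check, label files can be screened programmatically against the format rules from section 2. The sketch below is a hypothetical helper (`check_label_text` is not part of YOLOv5) that flags lines with the wrong field count, non-numeric values, out-of-range class indices, or coordinates outside [0, 1].

```python
def check_label_text(text, num_classes):
    """Return (line_number, message) pairs for every malformed label line."""
    problems = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split()
        if len(fields) != 5:
            problems.append((line_no, "expected 5 fields"))
            continue
        try:
            class_id = int(fields[0])
            coords = [float(v) for v in fields[1:]]
        except ValueError:
            problems.append((line_no, "non-numeric value"))
            continue
        # Class indices start at 0, so valid values are 0 .. num_classes - 1.
        if not 0 <= class_id < num_classes:
            problems.append((line_no, "class index out of range"))
        # All four box values are normalized, so they must lie in [0, 1].
        if any(not 0.0 <= v <= 1.0 for v in coords):
            problems.append((line_no, "coordinate outside [0, 1]"))
    return problems


sample = "0 0.5 0.5 0.2 0.2\n80 0.5 0.5 0.2 0.2\n1 1.5 0.5 0.2 0.2\n"
print(check_label_text(sample, 80))
```

A check like this catches format errors but not boxes that are well-formed yet drawn around the wrong object, so it complements the image inspection rather than replacing it.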

Training losses and performance metrics are saved to Tensorboard and also to a runs/exp0/results.txt logfile. results.txt is plotted as results.png after training completes. Partially completed results.txt files can be plotted with from utils.utils import plot_results; plot_results(). Here we show YOLOv5s trained on coco128 to 300 epochs, starting from scratch (blue), and from pretrained yolov5s.pt (orange).

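The training metrics in runs/exp0/results.txt can also be visualized. Here is how to plot them.

Create a Python file under runs/exp0/, or switch to that directory and run Python there.

Enter the two lines below. This differs from the official instructions; see https://github.com/ultralytics/yolov5/issues/890: utils/utils.py has been renamed to utils/general.py

from utils.general import plot_results
plot_results()

The output is shown below. Because this run started from pretrained weights, no convergence process is visible; the plot is only for illustration.

[Figure: results plot]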