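I. Introduction

       TensorFlow provides a largely black-box object detection framework that lets you train a neural network model on your own data without writing any code. The framework's code is tidy, but it is not friendly to beginners: it is deeply nested and hard to read (I speak from experience!). Even so, I think working with it is a good way to get a feel for how neural network training works in general. Below I use Faster R-CNN as the example and walk through how to train a model with this framework.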

II. Download the pretrained model

       Open the Object Detection API page at https://github.com/tensorflow/models/tree/master/research/object_detection and find the TensorFlow detection model zoo link in the README. The model zoo lists pretrained models trained on COCO, KITTI, Open Images, and other datasets. Choose faster_rcnn_inception_resnet_v2_atrous_coco, download it, and extract it locally.
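If you would rather script the download and extraction than do it by hand, something like the following should work. It is only a sketch: fetch_archive and extract_archive are helper names I made up, and the URL has to be copied from the model zoo page yourself, since the archive names there change over time.

```python
import os
import tarfile
import urllib.request


def fetch_archive(url, dest_path):
    """Download url to dest_path unless the file is already there."""
    if not os.path.exists(dest_path):
        urllib.request.urlretrieve(url, dest_path)
    return dest_path


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return its member names."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir)
    return names


# Usage (the URL is a placeholder, copy the real link from the model zoo):
# archive = fetch_archive("<link from the model zoo>", "pretrained.tar.gz")
# print(extract_archive(archive, "pretrained"))
```

The extracted folder should contain the model.ckpt files that the fine_tune_checkpoint setting refers to later on.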


III. Install the Object Detection API

The TensorFlow Object Detection API depends on the following libraries:

  • Protobuf 3.0.0
  • Pillow 1.0
  • lxml
  • Matplotlib

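1. First, install TensorFlow 1.12.0

    For the detailed setup steps, see my blog post:

2. Go to the GitHub page above and download the entire TensorFlow Object Detection API

   The downloaded and extracted folder is named models-master; rename it to tensorflow.

3. Install the dependencies in your Anaconda environment: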

sudo apt-get install protobuf-compiler

pip install --user pillow
pip install --user lxml
pip install --user matplotlib

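4. Compile protobuf

       The TensorFlow Object Detection API uses Protobuf to configure models and training parameters, so the .proto definitions must be compiled before the framework can be used. Run the following command: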

# From tensorflow/models/research/
protoc object_detection/protos/*.proto --python_out=.

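Note: the apt-get command above may install protobuf 3.5. Check your version with protoc --version; if it is 3.5, downgrade manually as follows:

  • First download protobuf 3.0 and extract it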
# From tensorflow/models/research/
wget -O protobuf.zip https://github.com/google/protobuf/releases/download/v3.0.0/protoc-3.0.0-linux-x86_64.zip
unzip protobuf.zip
  • Compile again, this time calling the downloaded protoc binary:
# From tensorflow/models/research/
./bin/protoc object_detection/protos/*.proto --python_out=.
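protoc writes a name_pb2.py file next to each name.proto it compiles, so one quick way to confirm that the compilation worked is to look for .proto files that have no compiled counterpart. Here is a small sketch of that check; find_uncompiled_protos is a helper name of my own, not part of the API.

```python
import glob
import os


def find_uncompiled_protos(protos_dir):
    """Return the .proto files in protos_dir that lack a matching *_pb2.py."""
    missing = []
    for proto in sorted(glob.glob(os.path.join(protos_dir, "*.proto"))):
        compiled = proto[: -len(".proto")] + "_pb2.py"
        if not os.path.exists(compiled):
            missing.append(os.path.basename(proto))
    return missing


# Usage, from tensorflow/models/research/:
# print(find_uncompiled_protos("object_detection/protos"))
```

An empty list means every definition was compiled; any names it prints are the ones protoc skipped or failed on.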

5. Configure the .pth file

#from ~/anaconda3/envs/tensorflow/lib/python3.6/site-packages
touch my_tensorflow_mode.pth
gedit my_tensorflow_mode.pth

# Add the research and slim directories to the Python path (absolute paths, one per line):
/home/dulingwen/tensorflow/models/research
/home/dulingwen/tensorflow/models/research/slim
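When Python starts, it reads every .pth file in site-packages and appends each existing directory listed there to sys.path, which is why this step makes object_detection and slim importable. The same file can also be written from Python. This is just a sketch; write_pth is a hypothetical helper, and the directories you pass in should be your own research and slim paths.

```python
import os


def write_pth(site_packages_dir, paths, name="my_tensorflow_mode.pth"):
    """Write one absolute path per line into a .pth file and return its location."""
    lines = [os.path.abspath(p) for p in paths]
    pth_path = os.path.join(site_packages_dir, name)
    with open(pth_path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return pth_path


# Usage (both arguments are placeholders for your own directories):
# write_pth("<your env>/lib/python3.6/site-packages",
#           ["<your path>/models/research", "<your path>/models/research/slim"])
```

Converting to absolute paths matters here, because a relative entry in a .pth file is resolved against the site-packages directory rather than your home directory.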

6. Test that the setup works:

Run the following command:

#from tensorflow/models/research/object_detection/builders/
python model_builder_test.py

IV. Prepare the training data

Arrange the sample data as shown below (this follows the layout of the VOC dataset).

#from tensorflow/models/research/object_detection/legacy/

-data
 --VOC2012
   ---pascal_label_map.pbtxt # list of classes
   ---Annotations  # annotation files (xml)
   ---JPEGImages   # image files (jpg)

   ---train
   ---val
   ---aug_xml
   ---ImageSets
     ----Main
       -----train.txt # names of the training images
       -----val.txt   # names of the validation images
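The train.txt and val.txt lists hold one image name per line, without the file extension. If you do not have them yet, they can be generated from the Annotations folder. The sketch below assumes every image has an annotation file with the same base name; write_split_lists, val_fraction, and seed are names and defaults I picked for illustration.

```python
import os
import random


def write_split_lists(voc_dir, val_fraction=0.2, seed=0):
    """Split annotation base names into train/val and write ImageSets/Main/*.txt."""
    ann_dir = os.path.join(voc_dir, "Annotations")
    names = sorted(
        os.path.splitext(f)[0] for f in os.listdir(ann_dir) if f.endswith(".xml")
    )
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(names)
    n_val = int(len(names) * val_fraction)
    splits = {"val": sorted(names[:n_val]), "train": sorted(names[n_val:])}

    main_dir = os.path.join(voc_dir, "ImageSets", "Main")
    os.makedirs(main_dir, exist_ok=True)
    for split, items in splits.items():
        with open(os.path.join(main_dir, split + ".txt"), "w") as f:
            f.write("\n".join(items) + ("\n" if items else ""))
    return splits


# Usage: write_split_lists("data/VOC2012", val_fraction=0.2)
```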

V. Convert the dataset to TFRecord files

For the detailed procedure, see the blog post:
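The conversion itself is not reproduced here, but the step that comes before writing any record is reading each VOC annotation. The sketch below shows that step only, using the standard VOC XML fields (size, object, name, bndbox); parse_voc_annotation is a helper name of my own. The boxes are normalized to the 0 to 1 range by the image size, which is the form the detection records use.

```python
import xml.etree.ElementTree as ET


def parse_voc_annotation(xml_text):
    """Return the objects in a VOC annotation with boxes normalized to [0, 1]."""
    root = ET.fromstring(xml_text)
    width = float(root.findtext("size/width"))
    height = float(root.findtext("size/height"))
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        objects.append({
            "name": obj.findtext("name"),
            "xmin": float(box.findtext("xmin")) / width,
            "ymin": float(box.findtext("ymin")) / height,
            "xmax": float(box.findtext("xmax")) / width,
            "ymax": float(box.findtext("ymax")) / height,
        })
    return objects


# Usage:
# with open("data/VOC2012/Annotations/<name>.xml") as f:
#     print(parse_voc_annotation(f.read()))
```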

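VI. Configure the training pipeline

Go to the tensorflow/models/research/object_detection/samples/configs folder and open faster_rcnn_inception_resnet_v2_atrous_coco.config.

Find the following entries and edit them; everything else can stay as it is for now:

num_classes: set this to the number of classes in your own task
batch_size: 1 (on a GPU with little memory, keep this at 1 so training does not fail for lack of memory)

fine_tune_checkpoint: "your/path/model.ckpt" # checkpoint file of the pretrained model downloaded in section II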

train_input_reader: {
  tf_record_input_reader {
    input_path: "your/path/train.record"
  }
  label_map_path: "your/path/pascal_label_map.pbtxt"
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "your/path/val.record"
  }
  label_map_path: "your/path/pascal_label_map.pbtxt"
  shuffle: false
  num_readers: 1
}
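The num_classes value has to agree with the label map that label_map_path points to. One way to keep the two in sync is to generate the label map from a single list of class names and take num_classes from the length of that list. This is a sketch under that assumption; make_label_map is a hypothetical helper. The ids start at 1 because 0 is reserved for the background class.

```python
def make_label_map(class_names):
    """Build label-map text with one item per class and ids starting at 1."""
    items = []
    for class_id, name in enumerate(class_names, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (class_id, name))
    return "\n".join(items)


# Usage (the class names are placeholders for your own):
# classes = ["cat", "dog"]
# with open("pascal_label_map.pbtxt", "w") as f:
#     f.write(make_label_map(classes))
# num_classes = len(classes)
```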

VII. Train the model

Start training with the following command:

#from tensorflow/models/research/object_detection/legacy/
python train.py --logtostderr --train_dir=your/output/path --pipeline_config_path=your/path/faster_rcnn_inception_resnet_v2_atrous_coco.config

The loss is computed continuously during training. You can watch it on screen or visualize it with TensorBoard, which is started like this:

tensorboard --logdir={train_path} # train_path is the training output directory

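VIII. Export the model (pb file)

During training the model parameters are saved every fixed number of steps. The training output folder contains a file named checkpoint; open it to see the names and paths of the most recently saved checkpoints. Export the model with the following command.

# From tensorflow/models/research/object_detection/
# --trained_checkpoint_prefix: pick the most recent checkpoint, or the one you have confirmed converged best
# --output_directory: where the exported model is written

python export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path your/path/faster_rcnn_inception_resnet_v2_atrous_coco.config \
    --trained_checkpoint_prefix your/path/model.ckpt-118577 \
    --output_directory your/path/my_model/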
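Instead of opening the checkpoint file by hand, you can read the most recent prefix from it. The file is plain text, and its model_checkpoint_path line names the latest checkpoint. The sketch below parses that line; latest_checkpoint_prefix is a helper name of my own. Inside TensorFlow, tf.train.latest_checkpoint(train_dir) gives the same information.

```python
import os
import re


def latest_checkpoint_prefix(train_dir):
    """Return the newest checkpoint prefix listed in train_dir/checkpoint, or None."""
    with open(os.path.join(train_dir, "checkpoint")) as f:
        text = f.read()
    # Only the line that starts with model_checkpoint_path names the latest one.
    match = re.search(r'^model_checkpoint_path:\s*"(.+)"\s*$', text, re.MULTILINE)
    if match is None:
        return None
    prefix = match.group(1)
    if not os.path.isabs(prefix):
        prefix = os.path.join(train_dir, prefix)
    return prefix


# Usage: pass the result to --trained_checkpoint_prefix
# print(latest_checkpoint_prefix("your/output/path"))
```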

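IX. Single-machine multi-GPU training

(1) Suppose the machine has 8 GPUs. First set batch_size to 8 in the config file, then run the following command:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python train.py --logtostderr --train_dir=your/output/path --pipeline_config_path=your/path/faster_rcnn_inception_resnet_v2_atrous_coco.config --num_clones=8 --ps_tasks=1

Here CUDA_VISIBLE_DEVICES is an environment variable (not a command-line flag) that lists the IDs of the GPUs to use, separated by commas; --num_clones is the number of GPUs to use; --ps_tasks is the number of parameter servers.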
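It is easy to let the GPU list and --num_clones drift apart when editing this command by hand. A small sketch that derives both from one list of GPU IDs is shown below; build_multi_gpu_command is a hypothetical helper, and it only reuses the flags from the command above.

```python
import os


def build_multi_gpu_command(gpu_ids, train_dir, config_path):
    """Return (env, argv) for train.py with num_clones matching the GPU list."""
    gpu_ids = list(gpu_ids)
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpu_ids)
    argv = [
        "python", "train.py", "--logtostderr",
        "--train_dir=" + train_dir,
        "--pipeline_config_path=" + config_path,
        "--num_clones=%d" % len(gpu_ids),
        "--ps_tasks=1",
    ]
    return env, argv


# Usage, run from the legacy/ folder:
# import subprocess
# env, argv = build_multi_gpu_command(range(8), "your/output/path", "your/path/x.config")
# subprocess.run(argv, env=env, check=True)
```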

(2) Limiting GPU memory usage:

To be completed.

X. Multi-machine multi-GPU training

To be completed.

XI. Test the model

      Create an empty Python file in research/object_detection/legacy/, named for example predict.py. This script uses the frozen model to run predictions on images.

The code is as follows:

import os

import cv2
import numpy as np
import tensorflow as tf

from distutils.version import StrictVersion
from skimage import io

from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util
from object_detection.utils import ops as utils_ops


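def get_suffixfile(path, suffix):
    """Return the names of the files in `path` whose extension is `.suffix`."""
    assert os.path.exists(path), 'The path does not exist!'

    filenames = []
    for file in os.listdir(path):
        # Extension of the file name, e.g. '.jpg', '.xml', '.txt'.
        ext = os.path.splitext(file)[1]
        if ext == '.' + suffix:
            filenames.append(file)
    if not filenames:
        print('No matching files found in the given directory')
    # Return an empty list (not None) so callers can always iterate.
    return filenames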



def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)


def run_inference_for_single_image(image, sess):
#     with graph.as_default():
#         with tf.Session() as sess:
    ops = tf.get_default_graph().get_operations()
    all_tensor_names = {
        output.name for op in ops for output in op.outputs}
    tensor_dict = {}
    for key in [
        'num_detections', 'detection_boxes', 'detection_scores',
        'detection_classes', 'detection_masks'
    ]:
        tensor_name = key + ':0'
        if tensor_name in all_tensor_names:
            tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
                tensor_name)
    if 'detection_masks' in tensor_dict:
        # The following processing is only for single image
        detection_boxes = tf.squeeze(
            tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(
            tensor_dict['detection_masks'], [0])
        # Reframe is required to translate mask from box coordinates to
        # image coordinates and fit the image size.
        real_num_detection = tf.cast(
            tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [
                                   real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [
                                   real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes, image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(
            tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension
        tensor_dict['detection_masks'] = tf.expand_dims(
            detection_masks_reframed, 0)
    image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

    # Run inference
    output_dict = sess.run(tensor_dict,
                           feed_dict={image_tensor: np.expand_dims(image, 0)})
    output_dict['num_detections'] = int(
        output_dict['num_detections'][0])
    output_dict['detection_classes'] = output_dict[
        'detection_classes'][0].astype(np.uint8)
    output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
    output_dict['detection_scores'] = output_dict['detection_scores'][0]
    if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]
    return output_dict


if __name__ == '__main__':
    if StrictVersion(tf.__version__) < StrictVersion('1.9.0'):
        raise ImportError(
            'Please upgrade your TensorFlow installation to v1.9.* or later!')
    MODEL_NAME = 'your/path/my_model'
    PATH_TO_FROZEN_GRAPH = MODEL_NAME + '/frozen_inference_graph.pb'
    PATH_TO_LABELS = 'your/path/pascal_label_map.pbtxt'
    filepath = 'path to test image directory'
    out_path = 'path to output image directory'
    # Create the output directory if it does not exist yet.
    if not os.path.exists(out_path):
        os.makedirs(out_path)
    detection_graph = tf.Graph()
    with detection_graph.as_default():
        od_graph_def = tf.GraphDef()
        with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
            serialized_graph = fid.read()
            od_graph_def.ParseFromString(serialized_graph)
            tf.import_graph_def(od_graph_def, name='')

    category_index = label_map_util.create_category_index_from_labelmap(
        PATH_TO_LABELS, use_display_name=True)

    with detection_graph.as_default():
        with tf.Session() as sess:
            TEST_IMAGE_FILES = get_suffixfile(filepath, 'jpg') or []
            for imgfile in TEST_IMAGE_FILES:
                image_file = os.path.join(filepath, imgfile)
                print(image_file)
                # skimage reads the image as an RGB array.
                image_np = io.imread(image_file)
                output_dict = run_inference_for_single_image(image_np, sess)
                # Visualization of the results of a detection.
                vis_util.visualize_boxes_and_labels_on_image_array(
                    image_np,
                    output_dict['detection_boxes'],
                    output_dict['detection_classes'],
                    output_dict['detection_scores'],
                    category_index,
                    instance_masks=output_dict.get('detection_masks'),
                    use_normalized_coordinates=True,
                    line_thickness=8)

                jpg_name = os.path.basename(image_file)
                new_img_file = os.path.join(out_path, jpg_name)
                # OpenCV writes images in BGR order, so convert from RGB first.
                image_bgr = cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR)
                cv2.imwrite(new_img_file, image_bgr)

Open a terminal in the legacy folder, activate the tensorflow environment, and run the program with the following command:

python predict.py

The results are written to the output folder set by out_path in predict.py.
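If you want the detections as numbers rather than only drawn on the image, the output_dict returned by run_inference_for_single_image can be filtered directly. The boxes are normalized [ymin, xmin, ymax, xmax] values, so they have to be scaled by the image size. Here is a sketch; filter_detections and the 0.5 default for min_score are my own choices.

```python
def filter_detections(output_dict, image_width, image_height, min_score=0.5):
    """Keep detections scoring at least min_score, with boxes in pixel coordinates."""
    results = []
    for i in range(int(output_dict["num_detections"])):
        score = float(output_dict["detection_scores"][i])
        if score < min_score:
            continue
        # Boxes are normalized and ordered [ymin, xmin, ymax, xmax].
        ymin, xmin, ymax, xmax = (float(v) for v in output_dict["detection_boxes"][i])
        results.append({
            "class_id": int(output_dict["detection_classes"][i]),
            "score": score,
            # Pixel box as (left, top, right, bottom).
            "box": (
                int(round(xmin * image_width)),
                int(round(ymin * image_height)),
                int(round(xmax * image_width)),
                int(round(ymax * image_height)),
            ),
        })
    return results


# Usage inside the loop in predict.py:
# height, width = image_np.shape[:2]
# print(filter_detections(output_dict, width, height))
```

The class_id values map back to names through the category_index built from the label map.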