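Since its release, MMDetection V1.0 has been well received by many users, who have offered many valuable suggestions, and many developers have contributed code. MMDetection V2.0 was released on May 6, 2020.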

MMDetection implements networks and frameworks such as RPN, Fast R-CNN, Faster R-CNN, and Mask R-CNN. First, a brief comparison with Detectron:

  • slightly higher performance
  • slightly faster training
  • slightly lower GPU memory usage

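More importantly, though, there is a generational gap in usability between a PyTorch-based codebase and a Caffe2-based one. In the time it takes to install Detectron successfully, you could probably install mmdetection a dozen times.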

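Of course, Detectron has some clear advantages of its own: it was the first comprehensive detection codebase, it carries the FAIR brand, and its released models are fairly comprehensive. The MMDetection researchers are also working to expand the model zoo, but the gap in manpower and compute is still large, so this will take time.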

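Now for the three points above in more detail. First, performance. The ResNet structure in the official PyTorch model zoo differs slightly from the ResNet used by Detectron (in mmdetection this can be specified through the backbone's style parameter), which leads to different convergence speeds, so experiments were run with both structures. In general, the Detectron structure scores higher under the 1x lr schedule, while the PyTorch structure scores higher at 2x.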
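As a rough illustration of the backbone `style` parameter, the sketch below builds two MMDetection-style backbone config dicts. The helper name `resnet50_backbone` is hypothetical, and every field other than `style` is an assumed, typical value, not a setting taken from the experiments described here.

```python
# Sketch of an MMDetection-style backbone config. Only `style` comes from the
# text; the other fields are assumed, typical values for a ResNet-50 backbone.
def resnet50_backbone(style):
    """Return a ResNet-50 backbone config dict for the given style."""
    if style not in ("pytorch", "caffe"):
        raise ValueError(f"unknown style: {style!r}")
    return dict(
        type="ResNet",
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        # "pytorch": the stride-2 layer is the 3x3 conv of each bottleneck.
        # "caffe": the stride-2 layer is the first 1x1 conv (Detectron layout).
        style=style,
    )


pytorch_backbone = resnet50_backbone("pytorch")
caffe_backbone = resnet50_backbone("caffe")
print(pytorch_backbone["style"], caffe_backbone["style"])
```

Switching between the two variants is then a one-field change in the config, which is what makes it practical to run both structures under the same schedule.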

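On speed, the gap is fairly large for Mask R-CNN and very small for the rest. With the same settings, Detectron needs 0.89 s per iteration, while mmdetection needs only 0.69 s. Fast R-CNN is the exception and is slightly slower than Detectron. In addition, running Detectron on the researchers' own servers is about 20% slower than the officially reported speed; the guess is that FB's Big Basin servers outperform the researchers' hardware.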
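To put the two Mask R-CNN timings in relative terms, the short calculation below derives the speedup and the per-iteration time saved from the 0.89 s and 0.69 s figures quoted above. The variable names are illustrative only.

```python
# Per-iteration times for Mask R-CNN quoted in the text (seconds).
detectron_s_per_iter = 0.89
mmdet_s_per_iter = 0.69

# How many times faster one iteration is, and the fraction of time saved.
speedup = detectron_s_per_iter / mmdet_s_per_iter
time_saved = 1 - mmdet_s_per_iter / detectron_s_per_iter

print(f"speedup: {speedup:.2f}x")                      # speedup: 1.29x
print(f"time saved per iteration: {time_saved:.1%}")   # time saved per iteration: 22.5%
```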

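On GPU memory, the advantage is more pronounced: usage is about 30% lower. However, this is partly due to the framework and is not entirely the result of codebase optimization. One result that surprised the researchers is that the current version of the codebase, running Mask R-CNN with ResNet-50, fits 4 images on each card (12 G), which is considerably less memory than it needed during the researchers' competition.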

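MMDetection is an open-source, PyTorch-based deep learning object detection toolbox from SenseTime (winner of the 2018 COCO object detection challenge) and The Chinese University of Hong Kong.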

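In early 2021, OpenMMLab, from the Multimedia Laboratory (MMLab) of The Chinese University of Hong Kong, developed and contributed another platform tool: MMTracking, an all-in-one video object perception platform. The framework is written in PyTorch, supports single object tracking, multiple object tracking, and video object detection, and is now open source. Let's look at it in detail.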

  • The first unified video perception platform
  • Modular design
  • Simple, Fast and Strong

Simple: MMTracking interacts with other OpenMMLab projects. It is built on MMDetection, so a detector can be selected by modifying the configuration file.




1、Create a conda virtual environment and activate it.

conda create -n open-mmlab python=3.7
conda activate open-mmlab

2、Install PyTorch and torchvision following the official instructions.


Note: Make sure that your compilation CUDA version and runtime CUDA version match. You can check the supported CUDA version for precompiled packages on the PyTorch website.

E.g. 1: If you have CUDA 10.1 installed under /usr/local/cuda and would like to install PyTorch 1.5, you need to install the prebuilt PyTorch with CUDA 10.1.

conda install pytorch cudatoolkit=10.1

E.g. 2: If you have CUDA 9.2 installed under /usr/local/cuda and would like to install PyTorch 1.3.1, you need to install the prebuilt PyTorch with CUDA 9.2.

conda install pytorch=1.3.1 cudatoolkit=9.2 torchvision=0.4.2

If you build PyTorch from source instead of installing the prebuilt package, you can use more CUDA versions such as 9.0.
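One way to check that the two CUDA versions agree is to compare the release reported by `nvcc --version` with the version the installed PyTorch was built against. The sketch below shows only the comparison logic on a sample `nvcc` output string; the function names `parse_nvcc_release` and `cuda_versions_match` are hypothetical helpers, and comparing on major.minor only is an assumption about what counts as a match.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract 'major.minor' from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no CUDA release found in nvcc output")
    return f"{match.group(1)}.{match.group(2)}"


def cuda_versions_match(compile_version, runtime_version):
    """Compare two CUDA version strings on major.minor only (an assumption)."""
    def major_minor(version):
        return tuple(int(part) for part in version.split(".")[:2])
    return major_minor(compile_version) == major_minor(runtime_version)


# Sample line in the format that `nvcc --version` prints.
sample = "Cuda compilation tools, release 10.1, V10.1.243"
compile_version = parse_nvcc_release(sample)
print(compile_version, cuda_versions_match(compile_version, "10.1"))  # 10.1 True
```

With PyTorch installed, `torch.version.cuda` gives the CUDA version the PyTorch build targets, and that string can be passed as the second argument.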

3、Install mmcv-full. We recommend installing the prebuilt package as below.

pip install mmcv-full -f

See the MMCV documentation for the MMCV versions compatible with different PyTorch and CUDA versions. Optionally, you can compile mmcv from source with the following commands:

git clone
cd
MMCV_WITH_OPS=1 pip install -e .  # package mmcv-full will be installed after this step
cd

Or directly run

pip install mmcv-full

4、Install MMDetection


Optionally, you can also build MMDetection from source in case you want to modify the code:

git clone
cd mmdetection
pip install -r requirements/build.txt
pip install -v -e .  # or "python setup.py develop"

5、Clone the MMTracking repository.

git clone
cd

6、Install build requirements and then install MMTracking.

pip install
pip install -v -e .  # or "python setup.py develop"


This section will show how to test existing models on supported datasets. The following testing environments are supported:

  • single GPU
  • single node multiple GPU
  • multiple nodes

During testing, different tasks share the same API and we only support samples_per_gpu = 1.
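Because only `samples_per_gpu = 1` is supported at test time, a small sanity check on the config before launching a test can catch a mismatch early. The sketch below assumes the setting lives under a top-level `data` dict, as in typical OpenMMLab configs; the helper `check_test_batch_size` is hypothetical and not part of MMTracking.

```python
def check_test_batch_size(cfg):
    """Raise if the config asks for more than one sample per GPU at test time."""
    # Assumed layout: cfg["data"]["samples_per_gpu"]; a missing key counts as 1.
    samples = cfg.get("data", {}).get("samples_per_gpu", 1)
    if samples != 1:
        raise ValueError(
            f"testing supports samples_per_gpu = 1 only, got {samples}"
        )
    return samples


# Illustrative config fragment; workers_per_gpu is an assumed extra field.
cfg = dict(data=dict(samples_per_gpu=1, workers_per_gpu=2))
print(check_test_batch_size(cfg))  # 1
```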

You can use the following commands for testing:

# single-gpu testing
python tools/ ${CONFIG_FILE} [--checkpoint ${CHECKPOINT_FILE}] [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

# multi-gpu testing
./tools/ ${CONFIG_FILE} ${GPU_NUM} [--checkpoint ${CHECKPOINT_FILE}] [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

Optional arguments:

  • CHECKPOINT_FILE: Filename of the checkpoint. You do not need to define it when applying some MOT methods; instead, specify the checkpoints in the config.
  • RESULT_FILE: Filename of the output results in pickle format. If not specified, the results will not be saved to a file.
  • EVAL_METRICS: Items to be evaluated on the results. Allowed values depend on the dataset, e.g., bbox is available for ImageNet VID, track is available for LaSOT, and both bbox and track are suitable for MOT17.
  • --cfg-options: If specified, the key-value pairs will be merged into the config file.
  • --eval-options: If specified, the key-value pairs will be passed as kwargs to the dataset.evaluate() function. This is only used for evaluation.
  • --format-only: If specified, the results will be formatted to the official format.
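To make the idea of merging key-value pairs into a config more concrete, the sketch below merges dotted keys into a nested dict. It illustrates the concept only and is not the library's own implementation; the function `merge_options` and the config keys used are hypothetical.

```python
def merge_options(cfg, options):
    """Merge dotted key-value options into a nested config dict (in place)."""
    for dotted_key, value in options.items():
        *parents, leaf = dotted_key.split(".")
        node = cfg
        for key in parents:
            # Walk down the nesting, creating intermediate dicts as needed.
            node = node.setdefault(key, {})
        node[leaf] = value
    return cfg


# Hypothetical config and overrides, for illustration only.
cfg = {
    "model": {"detector": {"type": "FasterRCNN"}},
    "data": {"samples_per_gpu": 1},
}
merge_options(cfg, {"data.workers_per_gpu": 2, "model.detector.type": "RetinaNet"})
print(cfg["data"], cfg["model"]["detector"]["type"])
```

Overriding a single nested value this way avoids editing the config file for one-off test runs.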

